Ron Kohavi, PhD

ronnyk@live dot com
http://www.kohavi.com/resume.html
http://www.linkedin.com/in/ronnyk

Work Experience

10/2010-

Partner Architect (top 1%), Online Services Division, Microsoft (Redmond, WA).

2005-2010

General Manager, Experimentation Platform, Microsoft (Redmond, WA). Founded, managed, and architected the Experimentation Platform team to enable running and analyzing controlled experiments at Microsoft. Grew and headed Dev/Test/PM/Analyst/Ops/Support teams from scratch to about 50 top-notch people, including six principal-level team members. The team built the highly scalable experimentation platform, provided analysis reports on experiments, and evangelized the idea of data-driven decision-making through talks, classes, and posters. Over 20 Microsoft properties, including the MSN Home Page, Office Online, and Xbox.com, ran experiments using the platform. Multiple experiments had surprising results and significant ROI of millions of dollars. A paper describing the challenges with examples is available at http://exp-platform.com/expMicrosoft.aspx

2003-2005

Director, Data Mining and Personalization, Amazon.com (Seattle, WA).

Managed multiple teams and grew the organization from under 50 to over 90 people. Responsibilities included multiple “two-pizza” teams, such as Amazon’s personalization (two teams), ad automation (SEO/SEM), consumer behavior / data mining, site experimentation, and automated e-mail. Introduced several features estimated to be worth several hundred million dollars in incremental revenue.

2002-2003

Vice President, Business Intelligence, Blue Martini Software (San Mateo, CA, now Escalate).
Managed the Business Intelligence Sales Demo team and the Analytic Services team. Helped drive Business Intelligence sales and marketing activities, created success stories, and helped the professional services organization in complex implementations as a center of excellence.

1998-2002

Senior Director, Data Mining Applications, Blue Martini Software
Led the engineering team, including two managers. Responsible for the data collection, transfer (ETL), analysis, reporting, visualization, and campaign management in Blue Martini's products. Initially, as Director, I designed, architected, and coded (in Java) the data mining, reporting, and transfer (ETL) modules.

1995-1998

Manager, MineSet, Silicon Graphics Inc. (Mountain View, CA)
Managed the MineSet data mining and visualization engineering team of 15.
Prior to managing the team, I managed the Analytical Data Mining group of five engineers that designed and coded the analytical (server-side) data mining engines of MineSet on top of MLC++.

Prior to managing the analytical team, I was an individual contributor and coded analytical data mining algorithms.

1993-1995

Project lead, MLC++, Stanford University.
Designed and coded the Machine Learning library in C++ for data mining, in parallel with my Ph.D. work.
Supervised four students, including one 50% research assistant dedicated to the project (paid by grants from NSF and ONR).
The library was adopted as the basis for the analytical engines in MineSet when I moved to SGI and later licensed by Blue Martini Software for $6M.
It is used for research work at several universities. See http://www.sgi.com/tech/mlc/ for details.

1991 (summer)

Programmer, Verification Group, IBM Research Center, Haifa, Israel.
Programmed a graphical user interface in C.

1985-1988

Manager (Lieutenant), Israeli Defense Forces, Israel.
Managed 12 people at a computer center in a large army base in Israel.
Prior to being the manager, I designed and coded systems.

1981-1984

Programmer, International Software, Tel-Aviv, Israel.
Developed a database application generator, IRIS, with three other people (during high school). The program was sold commercially, including sales to the Israeli Defense Forces.

Education

1991-1995

Stanford University, Stanford, CA
Ph.D. in Machine Learning, Computer Science.
Thesis: Wrappers for Performance Enhancement and Oblivious Decision Graphs
Thesis advisors: Jerome Friedman, Nils Nilsson, and Yoav Shoham.

1988-1991

Technion, Haifa, Israel.
B.A. in Computer Science, Summa Cum Laude.

 

Honors

2010

My h-index, a measure of productivity and impact of published work, is 38 according to Harzing. Hirsch, who proposed the metric, suggested that an h-index of 10-12 is considered a useful guideline for tenure decisions at major research universities; a value of about 18 could mean a full professorship; 15-20 could mean a fellowship in the American Physical Society.

Five of my articles are in the top 1,000 most cited articles. The article Wrappers for Feature Subset Selection is in the top 300 most-cited articles according to CiteSeerX and has over 1,400 citations according to ScienceDirect.

2009

Online Experimentation at Microsoft, 2009, was recognized as a top 30 Microsoft ThinkWeek paper, and an early version of it won 3rd place at the Third Workshop on Data Mining Case Studies and Practice Prize, 2009.

1996

IEEE Tools With Artificial Intelligence Best Paper Award for the paper Data Mining using MLC++, a Machine Learning Library in C++ by Kohavi, Sommerfield, and Dougherty.

1992

Passed the Ph.D. Artificial Intelligence qualifying exam with distinction.

1989, 1990, 1991

Technion, President's award (top 5%) each year of BA degree.


Professional Activities

1.      Seven patents granted, several pending.

2.      General Chair, KDD 2004

3.      Scientific Advisor to Trusted Opinion, 2007-2008

4.      Member of Technical Advisory Board, mySimon, 1999-2000 (until they were bought by CNET)

5.      MLC++ documents: C++ coding standards, MLC++ coding standards, environment, and utilities

6.      Program committee member, Knowledge Discovery and Data Mining conference (KDD), 1997-2010

7.      Program committee member, International Conference on Machine Learning, 1997-2003

8.      Co-chair (with Jim Gray), Industrial Track, Knowledge Discovery and Data Mining (KDD), 1999

9.      Co-chair (with Carla Brodley), KDD-CUP 2000 (Aug 2000)

10.  Co-chair WEBKDD'2003, WEBKDD'2001, WEBKDD'2000

11.  Co-editor (with Foster Provost), special issue of the International Journal Data Mining and Knowledge Discovery on e-commerce and data mining. This special issue is also available as a book: Applications of Data Mining to Electronic Commerce

12.  Member of the editorial board, Data Mining and Knowledge Discovery journal, 1997-2002

13.  Co-Editor (with Foster Provost), special issue on applications of machine learning (Volume 30, 1998), journal of Machine Learning

14.  Member of the editorial board, journal of Machine Learning, 1997-1999

 

Selected Invited Talks/Panels (reverse chronological order)

1.      Keynote at the Analytics Revolution, 2010, Online Controlled Experiments: Listening to the Customers, not to the HiPPO (PDF) (PPTX) (video)

2.      KDD 2009 Tutorial: Planning, Running, and Analyzing Controlled Experiments on the Web (part 1 PPTX) (part2 PPTX) (part3 PPTX)

3.      Emetrics 2007: Practical Guide to Controlled Experiments on the Web

4.      ACM first S.F. Data Mining SIG talk, 2006, Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce

5.      Emetrics 2004: Amazon's Data Mining and Personalization (June 2004)

6.      CSLI's Seminar on Computational Learning and Adaptation on Real-world Insights from Mining Retail E-Commerce Data, May 22, 2003

7.      Blue Martini Webinar 2003, Deriving Key Insights from Blue Martini Business Intelligence: Summary of key insights from using Business Intelligence against Debenhams and MEC sites. Approved by Debenhams and MEC.

8.      Etail CRM Summit 2002, Mining Customer Data (PDF slides)
The talk was heavily referenced in ComputerWorld (original)

9.      New York Times, 2002, Fine Tuning Customer Behavior.

10.  Invited paper and talk at KDD 2001 industrial track: Mining E-commerce Data, the Good, the Bad, and the Ugly (PDF paper), slides

11.  E-commerce and Clickstream Mining Tutorial, 2001, at the first SIAM International Conference on Data Mining

12.  Invited talk at the National Academy of Engineering US Frontiers of Engineers, 2000, Data Mining and Visualization. Available in book form ISBN: 0-309-07319-7

13.  Invited talk at ICML 1998, Crossing the Chasm: From Academic Machine Learning to Commercial Data Mining.

 

Selected Publications (reverse chronological order)

1.      Ron Kohavi, Roger Longbotham, and Toby Walker, Online Experiments: Practical Lessons, IEEE Computer, 2010.

2.      Ronny Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca, Randy Henne, Juan Lavista Ferres, Tamir Melamed, Online Experimentation at Microsoft, 2009. Microsoft ThinkWeek paper recognized as top 30. An earlier version of Online Experimentation at Microsoft appeared in the Third workshop on Data Mining Case Studies and Practice Prize, 2009. The paper won 3rd place.

3.      Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal M. Henne, Controlled Experiments on the Web: Survey and Practical Guide, Data Mining and Knowledge Discovery journal, 2009

4.      Ron Kohavi, Llew Mason, Rajesh Parekh, Zijian Zheng, Lessons and Challenges from Mining Retail E-Commerce Data, Machine Learning journal volume 57, pages 83-113, Special Issue on Data Mining Lessons Learned, 2004.

5.      Ron Kohavi, Neal Rothleder, and Evangelos Simoudis, Emerging Trends in Business Analytics, Communications of the ACM, Volume 45, Number 8, Aug 2002, pages 45-48.

6.      Ron Kohavi and J. Ross Quinlan. Decision Tree Discovery. In the Handbook of Data Mining and Knowledge Discovery, chapter 16.1.3, pages 267-276. Oxford University Press, 2002.

7.      Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian Zheng, KDD-Cup 2000 Organizers' Report: Peeling the Onion. SIGKDD Explorations Volume 2, issue 2, 2000. PowerPoint slides. Also translated to Japanese in Information Processing Society of Japan, Vol 42 No. 5

8.      Eric Bauer and Ron Kohavi. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. The journal Machine Learning Vol 36, Nos. 1/2, July/August 1999, pages 105-139. The paper is cited over 400 times according to CiteSeerX and over 1,200 times in Google Scholar.

9.      Ron Kohavi and George John, Wrappers for Feature Subset Selection. Artificial Intelligence 97, 1997 (print version). The paper is cited over 650 times in CiteSeerX, making it a top 300 referenced paper. It has over 1,400 citations according to ScienceDirect.

10.  Ron Kohavi, Dan Sommerfield, and James Dougherty. Data Mining using MLC++, a Machine Learning Library in C++. International Journal on Artificial Intelligence Tools vol. 6, No. 4, 1997. The paper received the IEEE Tools with Artificial Intelligence Best Paper Award.

11.  Ron Kohavi and David Wolpert. Bias Plus Variance Decomposition for Zero-One Loss Functions. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 275-283, July 1996. It has over 130 citations according to CiteSeerX and 300 citations according to Google Scholar.

12.  Ron Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. In The Second International Conference on Knowledge Discovery and Data Mining, pages 202-207, August 1996. It has over 100 citations according to CiteSeerX and over 450 according to Google Scholar.

13.  Ron Kohavi, A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. IJCAI 1995. The paper is cited over 400 times according to CiteSeerX, and over 1,800 times according to Google Scholar.

14.  James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised and Unsupervised Discretization of Continuous Features. Machine Learning 1995. The paper is cited over 300 times according to CiteSeerX, and over 1,100 times according to Google Scholar.