Ron Kohavi, PhD

ronnyk@live dot com

Work Experience


Distinguished Engineer, General Manager, Analysis and Experimentation, Application Services Group, Microsoft (Bellevue, WA).
Lead a team of about 70 Data Scientists, Developers, and Program Managers to accelerate innovation through trustworthy experimentation at Microsoft.


Partner Architect, Bing, Online Services Division, Microsoft (Bellevue, WA).
Enable trustworthy and scalable experimentation at Bing.  A good summary is available at


General Manager, Experimentation Platform, Microsoft (Redmond, WA). Founded, managed, and architected the Experimentation Platform team to enable running and analyzing controlled experiments at Microsoft. Grew and headed Dev/Test/PM/Analyst/Ops/Support teams from scratch to about 50 top-notch people, including six principal-level team members. The team built the highly scalable experimentation platform, shipped three major versions, provided analysis reports on experiments, and evangelized the idea of data-driven decision-making through talks, classes, and posters. Over 20 Microsoft properties, including the MSN Home Page and Office Online, ran experiments using the platform. Multiple experiments had surprising results and significant ROI of millions of dollars. A paper describing the challenges with examples is available at


Director, Data Mining and Personalization, Amazon.com (Seattle, WA).

Managed multiple teams and grew the organization from under 50 to over 90 people. Responsibilities included multiple “two-pizza” teams, such as Amazon’s personalization (two teams), ad automation (SEO/SEM), consumer behavior / data mining, site experimentation, and automated e-mail. Introduced several features estimated to be worth several hundred million dollars in incremental revenue.


Vice President, Business Intelligence, Blue Martini Software (San Mateo, CA).
Managed the Business Intelligence Sales Demo team and the Analytic Services team. Helped drive Business Intelligence sales and marketing activities, created success stories, and helped the professional services organization in complex implementations as a center of excellence.


Senior Director, Data Mining Applications, Blue Martini Software
Led the engineering team, including two managers. Responsible for the data collection, transfer (ETL), analysis, reporting, visualization, and campaign management in Blue Martini's products. Initially, as Director, I designed, architected, and coded (in Java) the data mining, reporting, and transfer (ETL) modules.


Manager, MineSet, Silicon Graphics Inc. (Mountain View, CA)
Managed the MineSet data mining and visualization engineering team of 15.
Prior to managing the team, I managed the Analytical Data Mining group of five engineers that designed and coded the analytical (server-side) data mining engines of MineSet on top of MLC++.

Prior to managing the analytical team, I was an individual contributor and coded analytical data mining algorithms.


Project lead, MLC++, Stanford University.
Designed and coded the Machine Learning library in C++ for data mining, in parallel with my Ph.D. work.
Supervised four students, including one 50% research assistant dedicated to the project (paid by grants from NSF and ONR).
The library was adopted as the basis for the analytical engines in MineSet when I moved to SGI and later licensed by Blue Martini Software for $6M.
It is used for research work at several universities. See for details.

1991 (summer)

Programmer, Verification Group, IBM Research Center, Haifa, Israel.
Programmed a graphical user interface in C.


Manager (Lieutenant), Israeli Defense Forces, Israel.
Managed 12 people at a computer center in a large army base in Israel.
Prior to being the manager, I designed and coded systems.


Programmer, International Software, Tel-Aviv, Israel.
Developed a database application generator, IRIS, with three other people (during high school). The program was sold commercially, including sales to the Israeli Defense Forces.



Stanford University, Stanford, CA
Ph.D. in Machine Learning, Computer Science.
Thesis: Wrappers for Performance Enhancement and Oblivious Decision Graphs
Thesis advisors: Jerome Friedman, Nils Nilsson, and Yoav Shoham.


Technion, Haifa, Israel.
B.A. in Computer Science, Summa Cum Laude.




Distinguished Engineer at Microsoft


My papers have over 24,000 citations.  My h-index, a measure of productivity and impact of published work, is 44 according to Google Scholar. Hirsch, who proposed the metric, suggested that an h-index of 10-12 is considered a useful guideline for tenure decisions at major research universities; a value of about 18 could mean a full professorship; 15-20 could mean a fellowship in the American Physical Society.

Three of my articles are in the top 1,000 most cited articles. The article Wrappers for Feature Subset Selection is in the top 300 most-cited articles according to CiteSeerX.


Online Experimentation at Microsoft, 2009, recognized as a top 30 Microsoft ThinkWeek paper. An early version of it won 3rd place at the Third workshop on Data Mining Case Studies and Practice Prize, 2009.


IEEE Tools With Artificial Intelligence Best Paper Award for the paper Data Mining using MLC++, a Machine Learning Library in C++ by Kohavi, Sommerfield, and Dougherty.


Passed the Ph.D. Artificial Intelligence qualifying exam with distinction.

1989, 1990, 1991

Technion, President's award (top 5%) in each year of the B.A. degree.

Professional Activities

1.     Twelve patents granted.

2.     General Chair, KDD 2004

3.     Scientific Advisor to Trusted Opinion, 2007-2008

4.     Member of Technical Advisory Board, mySimon, 1999-2000 (until they were bought by CNET)

5.     MLC++ documents: C++ coding standards, MLC++ coding standards, environment, and utilities

6.     Program committee member, Knowledge Discovery and Data Mining conference (KDD), 1997-2014

7.     Program committee member, International Conference on Machine Learning, 1997-2003

8.     Co-chair (with Jim Gray), Industrial Track, Knowledge Discovery and Data Mining (KDD), 1999

9.     Co-chair (with Carla Brodley), KDD-CUP 2000 (Aug 2000)

10.  Co-chair WEBKDD'2003, WEBKDD'2001, WEBKDD'2000

11.  Co-editor (with Foster Provost), special issue of the International Journal Data Mining and Knowledge Discovery on e-commerce and data mining. This special issue is also available as a book: Applications of Data Mining to Electronic Commerce

12.  Member of the editorial board, Data Mining and Knowledge Discovery journal, 1997, 1998, 1999, 2000, 2001, 2002

13.  Co-Editor (with Foster Provost), special issue on applications of machine learning (Volume 30, 1998), journal of Machine Learning

14.  Member of the editorial board, journal of Machine Learning, 1997, 1998, 1999


Selected Invited Talks/Panels (reverse chronological order)

1.     Online Controlled Experiments: Introduction, Learnings, and Humbling Statistics, QCon conference (11/11/2013) (Video).

2.     Online Controlled Experiments: Introduction, Learnings, and Humbling Statistics, ACM Recommender Systems industry keynote, Sept 2012 (PowerPoint PPTX).

3.     Keynote at the Analytics Revolution, 2010, Online Controlled Experiments: Listening to the Customers, not to the HiPPO (PDF) (PPTX) (video)

4.     KDD 2009 Tutorial: Planning, Running, and Analyzing Controlled Experiments on the Web (part 1 PPTX) (part2 PPTX) (part3 PPTX)

5.     Emetrics 2007: Practical Guide to Controlled Experiments on the Web

6.     ACM first S.F. Data Mining SIG talk, 2006, Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce

7.     Emetrics 2004: Amazon's Data Mining and Personalization (June 2004)

8.     CSLI's Seminar on Computational Learning and Adaptation on Real-world Insights from Mining Retail E-Commerce Data, May 22, 2003

9.     Blue Martini Webinar 2003, Deriving Key Insights from Blue Martini Business Intelligence: Summary of key insights from using Business Intelligence against Debenhams and MEC sites. Approved by Debenhams and MEC.

10.  Etail CRM Summit 2002, Mining Customer Data (PDF slides)
The talk was heavily referenced in ComputerWorld (original)

11.  New York Times, 2002, Fine Tuning Customer Behavior.

12.  Invited paper and talk at KDD 2001 industrial track: Mining E-commerce Data, the Good, the Bad, and the Ugly (PDF paper), slides

13.  E-commerce and Clickstream Mining Tutorial, 2001, at the first SIAM International Conference on Data Mining

14.  Invited talk at the National Academy of Engineering US Frontiers of Engineers, 2000, Data Mining and Visualization. Available in book form ISBN: 0-309-07319-7

15.  Invited talk at ICML 1998, Crossing the Chasm: From Academic Machine Learning to Commercial Data Mining.


Selected Publications (reverse chronological order)

1.     Ron Kohavi, Alex Deng, Roger Longbotham, and Ya Xu, Seven Rules of Thumb for Web Site Experimenters, KDD 2014.

2.     Ron Kohavi, Alex Deng, Brian Frasca, Toby Walker, Ya Xu, Nils Pohlmann, Online Controlled Experiments at Large Scale, KDD 2013.

3.     Alex Deng, Ya Xu, Ron Kohavi, Toby Walker, Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data, WSDM 2013.

4.     Ron Kohavi, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, Ya Xu, Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained, KDD 2012. PowerPoint slides, DOI.

5.     Ron Kohavi, Roger Longbotham, and Toby Walker, Online Experiments: Practical Lessons, IEEE Computer, 2010.

6.     Ronny Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca, Randy Henne, Juan Lavista Ferres, Tamir Melamed, Online Experimentation at Microsoft, 2009. Microsoft ThinkWeek paper recognized as top 30. An earlier version of Online Experimentation at Microsoft appeared in the Third workshop on Data Mining Case Studies and Practice Prize, 2009. The paper won 3rd place.

7.     Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal M. Henne, Controlled Experiments on the Web: Survey and Practical Guide, Data Mining and Knowledge Discovery journal, 2009

8.     Ron Kohavi, Llew Mason, Rajesh Parekh, Zijian Zheng, Lessons and Challenges from Mining Retail E-Commerce Data, Machine Learning journal volume 57, p. 83-113, Special Issue on Data Mining Lessons Learned, 2004.

9.     Ron Kohavi, Neal Rothleder, and Evangelos Simoudis, Emerging Trends in Business Analytics, Communications of the ACM, Volume 45, Number 8, Aug 2002, pages 45-48.

10.  Ron Kohavi and J. Ross Quinlan. Decision Tree Discovery. In the Handbook of Data Mining and Knowledge Discovery, chapter 16.1.3, pages 267-276. Oxford University Press, 2002.

11.  Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian Zheng, KDD-Cup 2000 Organizers' Report: Peeling the Onion. SIGKDD Explorations Volume 2, issue 2, 2000. PowerPoint slides. Also translated to Japanese in Information Processing Society of Japan, Vol 42 No. 5

12.  Eric Bauer and Ron Kohavi. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. The journal Machine Learning Vol 36, Nos. 1/2, July/August 1999, pages 105-139. The paper is cited over 400 times according to CiteSeerX and over 1,200 times in Google Scholar.

13.  Ron Kohavi and George John, Wrappers for Feature Subset Selection. Artificial Intelligence 97, 1997 (print version). The paper is cited over 650 times in CiteSeerX, making it a top 300 referenced paper. It has over 1,400 citations according to ScienceDirect.

14.  Ron Kohavi, Dan Sommerfield, and James Dougherty. Data Mining using MLC++, a Machine Learning Library in C++. International Journal on Artificial Intelligence Tools vol. 6, No. 4, 1997. The paper received the IEEE Tools with Artificial Intelligence Best Paper Award.

15.  Ron Kohavi and David Wolpert. Bias Plus Variance Decomposition for Zero-One Loss Functions. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 275-283, July 1996. It has over 130 citations according to CiteSeerX and 300 citations according to Google Scholar.

16.  Ron Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. In The Second International Conference on Knowledge Discovery and Data Mining, pages 202-207, August 1996. It has over 100 citations according to CiteSeerX and over 450 according to Google Scholar.

17.  Ron Kohavi, A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. IJCAI 1995. The paper is cited over 400 times according to CiteSeerX, and over 1,800 times according to Google Scholar.

18.  James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised and Unsupervised Discretization of Continuous Features. Machine Learning 1995. The paper is cited over 300 times according to CiteSeerX, and over 1,100 times according to Google Scholar.