Ron Kohavi's publications
My Ph.D. thesis
Wrappers for Performance Enhancement and Oblivious
Decision Graphs (or compressed
postscript).
Publications are in reverse chronological order.
- Ron Kohavi, Diane Tang, Ya Xu
Trustworthy
Online Controlled Experiments: A Practical
Guide to A/B Testing (book).
- Ron Kohavi and Stefan Thomke
The Surprising Power of Online Experiments: Getting the most
out of A/B and other controlled tests,
Harvard Business Review Sept-Oct 2017.
- Ron Kohavi and Roger Longbotham
Online Controlled Experiments and A/B Tests,
Encyclopedia of Machine Learning and Data Mining 2017
edited by Claude Sammut and Geoff Webb.
This chapter is one of the most loaded chapters.
- Pavel Dmitriev, Brian Frasca, Somit Gupta, Ron Kohavi, and
Garnet Vaz
Pitfalls of Long-Term Online Controlled Experiments,
2016 IEEE International Conference on Big Data.
- Ron Kohavi
Online Controlled Experiments: Lessons from Running A/B/n Tests
for 12 Years, KDD 2015.
- Ron Kohavi, Alex Deng, Roger Longbotham, and Ya Xu
Seven Rules of Thumb for Web Site Experimenters,
KDD 2014.
- Alex Deng, Ya Xu, Ron Kohavi, Toby Walker,
Improving the Sensitivity of Online Controlled Experiments
by Utilizing Pre-Experiment Data, WSDM 2013.
- Ron Kohavi, Alex Deng, Brian Frasca, Roger Longbotham, Toby
Walker, Ya Xu,
Trustworthy Online Controlled Experiments: Five Puzzling Outcomes
Explained, KDD
2012. PowerPoint slides,
DOI.
- Ron Kohavi and Roger
Longbotham,
Unexpected Results in Online Controlled Experiments, SIGKDD 2010.
DOI.
- Ron Kohavi, David Messner, Seth Eliot, Juan Lavista Ferres, Randy
Henne, Vignesh Kannappan, and Justin
Wang, Tracking Users' Clicks
and Submits: Tradeoffs between User Experience and Data Loss,
Microsoft White Paper, Oct 2010.
Word.
- Ron Kohavi, Roger Longbotham, and Toby Walker,
Online Experiments: Practical Lessons, IEEE Computer, Vol 43,
issue 9, pp. 82-85, Sept 2010. DOI.
- Ron Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca,
Randy Henne, Juan Lavista Ferres, Tamir Melamed,
Online Experimentation at Microsoft, Microsoft ThinkWeek paper
recognized as top 30, 2009. (Modified version of workshop
paper below.)
- Ron Kohavi, Thomas Crook, Roger Longbotham, Online Experimentation at Microsoft,
Third workshop on Data Mining Case
Studies and Practice Prize, 2009. The paper won 3rd place.
- Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal
M. Henne,
Controlled Experiments on the Web: Survey and Practical Guide,
Data Mining and Knowledge Discovery journal, Vol 18(1), p. 140-181,
2009. DOI.
- Thomas Crook, Brian Frasca, Ron Kohavi, and Roger Longbotham,
Seven Pitfalls to Avoid when Running Controlled Experiments on the Web,
KDD '09: Proceedings of the 15th ACM SIGKDD international conference on
Knowledge discovery and data mining, p. 1105-1114, 2009.
DOI.
- Ron Kohavi and Roger Longbotham,
Online Experiments: Lessons Learned, IEEE Computer, Vol 40,
issue 9, p. 103-105, Sept 2007. DOI.
- Ron Kohavi, Randy Henne, and Dan Sommerfield,
Practical Guide to Controlled Experiments on the Web:
Listen to Your Customers not to the HiPPO,
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining, p. 959-967, 2007.
DOI.
- Ron Kohavi, Llew Mason, Rajesh
Parekh, Zijian Zheng, Lessons and
Challenges from Mining Retail E-Commerce Data, Machine Learning
journal, Special Issue on Data Mining Lessons Learned, Vol 57, issue
1, p. 83-113, 2004.
DOI.
- Ron Kohavi and Rajesh Parekh,
Visualizing RFM Segmentation,
Fourth SIAM International Conference on Data Mining
(SDM), 2004.
- Ron Kohavi and Rajesh Parekh,
Ten Supplementary Analyses to
Improve E-commerce Web Sites
(alt PDF),
WEBKDD'2003.
- Blue Martini Case Studies:
- Ron Kohavi, Neal Rothleder, and Evangelos Simoudis,
Emerging Trends in Business Analytics, Communications of the ACM,
Evolving data mining into solutions for insights,
Volume 45, Number 8, Aug 2002, pages 45-48.
DOI.
- Ron Kohavi and J. Ross Quinlan,
Decision-tree discovery, in Will Klosgen and Jan M. Zytkow,
editors,
Handbook of Data Mining and Knowledge Discovery,
chapter 16.1.3, pages 267-276. Oxford University Press, 2002.
- Nir Friedman and Ron Kohavi,
Bayesian classification, in Will Klosgen and Jan M. Zytkow, editors,
Handbook of Data Mining and Knowledge Discovery,
chapter 16.1.5, pages 282-288. Oxford University Press, 2002.
- Cliff Brunk and Ron Kohavi, Mineset, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter
24.2.4, pages 584-589. Oxford University Press, 2002.
- Ron Kohavi and Dan Sommerfield,
MLC++, in Will Klosgen and Jan M. Zytkow,
editors, Handbook of Data Mining and Knowledge Discovery, chapter
24.1.2, pages 548-553. Oxford University Press, 2002.
- Ron Kohavi, Brij Masand, Myra Spiliopoulou, and
Jaideep Srivastava, WEBKDD
2001 - Mining Web Log Data Across All Customers Touch Points,
Third International Workshop, San Francisco, CA, Aug 2001.
Original papers available here.
- Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca,
eMetrics Study, Dec 2001.
This was an extensive study to generate a set of eMetrics using Blue
Martini customers' transactional, customer, and clickstream
data.
- Zijian Zheng, Ron Kohavi, and Llew Mason,
Real World
Performance of Association Rule Algorithms, KDD 2001:
Proceedings of the seventh ACM SIGKDD international conference on
Knowledge discovery and data mining, pages 401-406, 2001,
long version, and slides.
DOI.
The datasets: bms-pos, bms-webview-2, and bms-webview-1.
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-Commerce and Data Mining: Architecture and Challenges,
IEEE International Conference on Data Mining (ICDM'01), p. 27, 2001.
DOI.
- Ron Kohavi and Foster Provost,
Applications of Data Mining to Electronic Commerce, Data
Mining and Knowledge Discovery journal 5(1/2), p. 5-10, 2001.
DOI.
This special issue is also available as a hardcover book:
Applications of Data Mining to Electronic Commerce.
- Ron Kohavi, Carla Brodley, Brian Frasca, Llew
Mason, and Zijian Zheng,
KDD-Cup 2000 Organizers' Report: Peeling the Onion,
SIGKDD Explorations Volume 2, issue 2, p. 86-93, 2000.
Also translated to Japanese in Information
Processing Society of Japan, Vol 42 No. 5.
DOI.
- Myra Spiliopoulou, Jaideep Srivastava, Ron
Kohavi, and Brij Masand,
Web Mining,
Data Mining and Knowledge Discovery journal, Vol 6, p. 5-8, 2002.
DOI.
Initially appeared as
WEBKDD 2000 - Web Mining for E-Commerce
in SIGKDD Explorations Volume 2, issue 2, 2000.
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-commerce and Data Mining:
Architecture and Challenges, WEBKDD'2000 workshop on Web Mining for
E-Commerce - Challenges and Opportunities, Aug 2000.
arXiv.
- Ron Kohavi and Mehran Sahami (co-chairs), Jim Bozik,
Dorian Pyle, Rob Gerritsen, Steve Belcher, Ken Ono (panelists).
Integrating Data Mining
into Vertical Solutions: Problems and Challenges (slides),
KDD-99 panel. The article
KDD-99 Panel Report: Data Mining into Vertical Solutions
appeared in SIGKDD Explorations Volume 1, issue 2.
- Eric Bauer and Ron Kohavi,
An Empirical Comparison of
Voting Classification Algorithms: Bagging, Boosting, and Variants,
Machine Learning journal, Vol 36, Nos. 1/2, pages 105-139, 1999.
DOI.
The paper is cited over 400 times according to
CiteSeerX and over 1,200 times in Google Scholar.
- Ron Kohavi and George John,
The Wrapper Approach, book
chapter in Feature
Extraction, Construction and Selection: A Data Mining Perspective,
edited by Huan Liu and Hiroshi Motoda.
- Ron Kohavi, Improving Accuracy by voting
Classification Algorithms: Boosting, Bagging, and Variants. Invited
talk at Workshop on Computation-Intensive Machine Learning Techniques.
Australia, Sept 1998.
PDF slides or
compressed postscript slides.
- Ron Kohavi and Foster Provost, Glossary of Terms.
Editorial for the Special Issue on Applications of Machine Learning and
the Knowledge Discovery Process (volume 30, Number 2/3, February/March
1998). PDF or
Postscript
or HTML.
- Ron Kohavi and Foster Provost, On Applied Research in
Machine Learning. Editorial for the Special Issue on Applications of
Machine Learning and the Knowledge Discovery Process (volume 30, Number
2/3, February/March 1998).
PDF or
Postscript.
- Ron Kohavi, Crossing the Chasm: From Academic Machine
Learning to Commercial Data Mining. Invited talk at ICML-98. compressed postscript or PDF slides.
- Afshin Goodarzi, Ron Kohavi, Richard Harmon, and Aydin
Senkut, Loan Prepayment Modeling. Appeared in KDD-98 workshop on
Data Mining in
Finance. high-res compressed postscript or PDF.
- Ron Kohavi, Data Mining with MineSet: What Worked,
What Did Not, and What Might. Appeared in KDD-98 workshop on the
Commercial Success of Data Mining. compressed postscript or PDF.
- Ron Kohavi and Dan Sommerfield, Targeting Business
Users with Decision Table Classifiers. Appeared in KDD-98. compressed postscript or PDF.
- Ron Kohavi, Technique Selection in Machine Learning
Applications. Invited talk at the ICML-98 workshop on the Methodology
of Applying Machine Learning. compressed postscript slides or PDF slides.
- Foster Provost, Tom Fawcett, Ron Kohavi, Building the
Case Against Accuracy Estimation for Comparing Induction Algorithms.
ICML-98. compressed postscript or PDF.
- Jeff Bradford, Clay Kunz, Ron Kohavi, Cliff Brunk, and
Carla Brodley, Pruning Decision Trees with Misclassification
Costs. ECML-98.
compressed postscript or
PDF and
long version in compressed postscript
or long version in PDF.
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++.
International Journal of Artificial Intelligence Tools, Vol. 6, No. 4,
1997, p. 537-566. This is a longer version of the TAI'96 paper that
received the IEEE Tools With Artificial Intelligence Best Paper Award. compressed
postscript (283K) or PDF.
- Barry Becker, Ron Kohavi, Dan Sommerfield, Visualizing
the Simple Bayesian Classifier. Appears in the KDD 1997 Workshop on
Issues in the Integration of Data Mining and Data Visualization.
Lecture Notes
in Computer Science by Springer Verlag.
compressed postscript or
PDF.
- Cliff Brunk, James Kelly, and Ron Kohavi, MineSet:
An Integrated System for Data Mining. Appears in the Third
International Conference on Knowledge Discovery and Data Mining, 1997.
compressed postscript or
PDF.
- Ron Kohavi and Clayton Kunz, Option Decision
Trees with Majority Votes. Appears in the International Conference on
Machine
Learning 1997.
postscript or
PDF.
- Ron Kohavi and George John, Wrappers for Feature
Subset Selection. In Artificial Intelligence journal,
special issue on relevance, Vol. 97, Nos 1-2, pp. 273-324. NEC's ResearchIndex:
one of the top referenced papers in Machine Learning.
PDF or
postscript.
- Ron Kohavi, Barry Becker, and Dan Sommerfield,
Improving Simple Bayes.
compressed postscript or
PDF. ECML-97 (poster).
- Ron Kohavi, Pat Langley, Yeogirl Yun, The Utility of
Feature Weighting in Nearest-Neighbor Algorithms.
compressed postscript or
PDF. ECML-97 (poster).
- Ron Kohavi, MLC++ Developments: Data Mining using
MLC++. AAAI Fall Symposium on Learning Complex Behaviors in Adaptive
Intelligent Systems, Nov 1996.
compressed postscript slides or
PDF slides.
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++. TAI 96. The
paper received the IEEE Tools With Artificial Intelligence Best
Paper Award, 1996. NEC's ResearchIndex:
one of the top referenced papers in Machine Learning.
Postscript or
PDF.
- Ron Kohavi and Mehran Sahami, Error-Based and
Entropy-Based Discretization of Continuous Features. KDD-96.
postscript or
PDF.
- Ron Kohavi, Scaling Up the Accuracy of Naive-Bayes
Classifiers: a Decision-Tree Hybrid. KDD-96.
compressed postscript or
PDF and
compressed postscript slides or
PDF slides.
- Ron Kohavi, Book Review: Empirical Methods in
Artificial Intelligence by Paul Cohen. International Journal of Neural
Systems (IJNS), Vol 7, No 2, May 1996, p. 219-221.
postscript or
PDF.
Note: final formatting in the journal was slightly different.
- Ron Kohavi and David Wolpert, Bias Plus Variance
Decomposition for Zero-One Loss Functions. ML96. ResearchIndex:
one of the top referenced papers in Machine Learning.
postscript or
PDF and
compressed postscript slides or
PDF slides.
- Jerome Friedman, Ron Kohavi, and Yeogirl Yun, Lazy
Decision Trees. AAAI-96, p. 717-724.
postscript or
PDF and
compressed postscript slides or
PDF slides.
- Ron Kohavi and Dan Sommerfield, Feature Subset
Selection Using the Wrapper Model: Overfitting and Dynamic Search Space
Topology.
KDD-95.
postscript or PDF and
postscript slides or
PDF slides.
- Ron Kohavi and George John, Automatic Parameter
Selection by Minimizing Estimated Error. ML-95.
postscript or
PDF.
- James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised
and unsupervised discretization of continuous features. ML-95.
postscript or
PDF and
postscript slides or
PDF slides.
- Ron Kohavi, A Study of Cross-Validation and Bootstrap
for Accuracy Estimation and Model Selection. IJCAI-95.
postscript or
PDF and
postscript slides or
PDF slides.
- Ron Kohavi and Chia-Hsin Li, Oblivious Decision
Trees, Graphs, and Top-Down Pruning. IJCAI-95.
postscript or
PDF.
- Ron Kohavi, The Power of Decision Tables. In the European
Conference on Machine Learning, 1995.
postscript or
PDF and
postscript slides or
PDF slides
with some new results on discretization.
- Ron Kohavi and Brian Frasca, Useful feature subsets and rough
set reducts. In the International Workshop on Rough Sets and Soft
Computing (RSSC), 1994.
postscript or
PDF.
- Ron Kohavi, A third dimension to rough sets. In the International
Workshop on Rough Sets and Soft Computing (RSSC), 1994.
postscript or
PDF.
- Ron Kohavi, Feature Subset Selection as Search with
Probabilistic Estimates. In the AAAI Fall Symposium on Relevance,
1994.
postscript or
PDF.
- Ron Kohavi, George John, Richard Long, David Manley, and
Karl Pfleger, MLC++: A Machine Learning Library in C++. In Tools
with Artificial Intelligence, 1994.
postscript or
PDF.
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs: Strengths and limitations. In Twelfth
National Conference on Artificial Intelligence, 1994.
postscript or
PDF.
- George John, Ron Kohavi, and Karl Pfleger, Irrelevant
features and the subset selection problem. In Machine Learning:
Proceedings
of the Eleventh International Conference, 1994. Morgan Kaufmann.
postscript or
PDF and
postscript slides or
PDF slides.
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs. In Proceedings of the European
Conference on Machine Learning, 1994.
postscript or
PDF.
- Ron Kohavi and Scott Benson, Research note on
decision lists. Journal of Machine Learning. 13(1), 1993.
PDF on Springer, or
Local PDF.
- Ron Kohavi and Yoav Shoham, Applications of datalog
theories in AI. In AAAI-92 Workshop on Tractable Reasoning,
p. 82-87.
ronnyk@ live dot com