Ron Kohavi's publications

My Ph.D. thesis: Wrappers for Performance Enhancement and Oblivious Decision Graphs (or compressed postscript).


Publications are in reverse chronological order.

  1. Alex Deng, Ya Xu, Ron Kohavi, Toby Walker, Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data, WSDM 2013.
  2. Ron Kohavi, Alex Deng, Brian Frasca, Roger Longbotham, Toby Walker, Ya Xu, Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained, KDD 2012. Powerpoint slides, DOI.
  3. Ron Kohavi and Roger Longbotham, Unexpected Results in Online Controlled Experiments, SIGKDD 2010. DOI.
  4. Ron Kohavi, David Messner, Seth Eliot, Juan Lavista Ferres, Randy Henne, Vignesh Kannappan, and Justin Wang, Tracking Users' Clicks and Submits: Tradeoffs between User Experience and Data Loss, Microsoft White Paper, Oct 2010. Word.
  5. Ron Kohavi, Roger Longbotham, and Toby Walker, Online Experiments: Practical Lessons, IEEE Computer, Vol 43, issue 9, pp. 82-85, Sept 2010. DOI.
  6. Ron Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca, Randy Henne, Juan Lavista Ferres, Tamir Melamed, Online Experimentation at Microsoft, Microsoft ThinkWeek paper recognized as top 30, 2009. (Modified version of workshop paper below.)
  7. Ron Kohavi, Thomas Crook, Roger Longbotham, Online Experimentation at Microsoft, Third workshop on Data Mining Case Studies and Practice Prize, 2009. The paper won 3rd place.
  8. Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal M. Henne, Controlled Experiments on the Web: Survey and Practical Guide, Data Mining and Knowledge Discovery journal, Vol 18(1), p. 140-181, 2009. DOI.
  9. Thomas Crook, Brian Frasca, Ron Kohavi, and Roger Longbotham, Seven Pitfalls to Avoid when Running Controlled Experiments on the Web, KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, p. 1105-1114, 2009. DOI.
  10. Ron Kohavi and Roger Longbotham, Online Experiments: Lessons Learned, IEEE Computer, Vol 40, issue 9, p. 103-105, Sept 2007. DOI.
  11. Ron Kohavi, Randy Henne, and Dan Sommerfield, Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO, KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, p. 959-967, 2007. DOI.
  12. Ron Kohavi, Llew Mason, Rajesh Parekh, Zijian Zheng, Lessons and Challenges from Mining Retail E-Commerce Data, Machine Learning journal, Special Issue on Data Mining Lessons Learned, Vol 57, issue 1, p. 83-113, 2004. DOI.
  13. Ron Kohavi and Rajesh Parekh, Visualizing RFM Segmentation, Fourth SIAM International Conference on Data Mining (SDM), 2004.
  14. Ron Kohavi and Rajesh Parekh, Ten Supplementary Analyses to Improve E-commerce Web Sites (alt PDF), WEBKDD'2003.
  15. Blue Martini Case Studies:
  16. Ron Kohavi, Neal Rothleder, and Evangelos Simoudis, Emerging Trends in Business Analytics, Communications of the ACM, Evolving data mining into solutions for insights, Volume 45, Number 8, Aug 2002, pages 45-48. DOI.
  17. Ron Kohavi and J. Ross Quinlan, Decision-tree discovery, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter 16.1.3, pages 267-276. Oxford University Press, 2002.
  18. Nir Friedman and Ron Kohavi, Bayesian classification, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter 16.1.5, pages 282-288. Oxford University Press, 2002.
  19. Cliff Brunk and Ron Kohavi, Mineset, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter 24.2.4, pages 584-589. Oxford University Press, 2002.
  20. Ron Kohavi and Dan Sommerfield, MLC++, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter 24.1.2, pages 548-553. Oxford University Press, 2002.
  21. Ron Kohavi, Brij Masand, Myra Spiliopoulou, and Jaideep Srivastava, WEBKDD 2001 - Mining Web Log Data Across All Customers Touch Points, Third International Workshop, San Francisco, CA, Aug 2001. Original papers available here.
  22. Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca, eMetrics Study, Dec 2001. This was an extensive study to generate a set of eMetrics using Blue Martini customers' transactional, customer, and clickstream data.
  23. Zijian Zheng, Ron Kohavi, and Llew Mason, Real World Performance of Association Rule Algorithms, KDD 2001: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pages 401-406, 2001, long version, and slides. DOI. The datasets (bms-pos, bms-webview-2) and bms-webview-1.
  24. Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng, Integrating E-Commerce and Data Mining: Architecture and Challenges, IEEE International Conference on Data Mining (ICDM'01), p. 27, 2001. DOI.
  25. Ron Kohavi and Foster Provost, Applications of Data Mining to Electronic Commerce, Data Mining and Knowledge Discovery journal 5(1/2), p. 5-10, 2001. DOI.
    This special issue is also available as a hardcover book: Applications of Data Mining to Electronic Commerce.
  26. Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian Zheng, KDD-Cup 2000 Organizers' Report: Peeling the Onion, SIGKDD Explorations Volume 2, issue 2, p. 86-93, 2000. Also translated to Japanese in Information Processing Society of Japan, Vol 42 No. 5. DOI.
  27. Myra Spiliopoulou, Jaideep Srivastava, Ron Kohavi, and Brij Masand, Web Mining, Data Mining and Knowledge Discovery journal vol 6, p 5-8, 2002. DOI. Initially appeared as WEBKDD 2000 - Web Mining for E-Commerce in SIGKDD Explorations Volume 2, issue 2, 2000.
  28. Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng, Integrating E-commerce and Data Mining: Architecture and Challenges, WEBKDD'2000 workshop on Web Mining for E-Commerce - Challenges and Opportunities, Aug 2000. arXiv.
  29. Ron Kohavi and Mehran Sahami (co-chairs), Jim Bozik, Dorian Pyle, Rob Gerritsen, Steve Belcher, Ken Ono (panelists). Integrating Data Mining into Vertical Solutions: Problems and Challenges (slides), KDD-99 panel. The article KDD-99 Panel Report: Data Mining into Vertical Solutions appeared in SIGKDD Explorations Volume 1, issue 2.
  30. Eric Bauer and Ron Kohavi, An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants, Machine Learning journal, Vol 36, Nos. 1/2, pages 105-139, 1999. DOI. The paper is cited over 400 times according to CiteSeerX and over 1,200 times in Google Scholar.
  31. Ron Kohavi and George John, The Wrapper Approach, book chapter in Feature Extraction, Construction and Selection: A Data Mining Perspective, edited by Huan Liu and Hiroshi Motoda.
  32. Ron Kohavi, Improving Accuracy by Voting Classification Algorithms: Boosting, Bagging, and Variants. Invited talk at Workshop on Computation-Intensive Machine Learning Techniques. Australia, Sept 1998. compressed postscript slides
  33. Ron Kohavi and Foster Provost, Glossary of Terms. Editorial for the Special Issue on Applications of Machine Learning and the Knowledge Discovery Process (volume 30, Number 2/3, February/March 1998). Postscript or HTML
  34. Ron Kohavi and Foster Provost, On Applied Research in Machine Learning. Editorial for the Special Issue on Applications of Machine Learning and the Knowledge Discovery Process (volume 30, Number 2/3, February/March 1998). Postscript
  35. Ron Kohavi, Crossing the Chasm: From Academic Machine Learning to Commercial Data Mining. Invited talk at ICML-98. compressed postscript slides or acrobat (PDF) slides
  36. Afshin Goodarzi, Ron Kohavi, Richard Harmon, and Aydin Senkut, Loan Prepayment Modeling. Appeared in KDD-98 workshop on Data Mining in Finance. high-res compressed postscript or lower-res acrobat (PDF)
  37. Ron Kohavi, Data Mining with MineSet: What Worked, What Did Not, and What Might. Appeared in KDD-98 workshop on the Commercial Success of Data Mining. compressed postscript or acrobat (PDF)
  38. Ron Kohavi and Dan Sommerfield, Targeting Business Users with Decision Table Classifiers. Appeared in KDD-98. compressed postscript or acrobat (PDF)
  39. Ron Kohavi, Technique Selection in Machine Learning Applications. Invited talk at the ICML-98 workshop on the Methodology of Applying Machine Learning. compressed postscript slides or acrobat (PDF) slides
  40. Foster Provost, Tom Fawcett, Ron Kohavi, Building the Case Against Accuracy Estimation for Comparing Induction Algorithms. ICML-98. compressed postscript or (low-res) acrobat (PDF)
  41. Jeff Bradford, Clay Kunz, Ron Kohavi, Cliff Brunk, and Carla Brodley, Pruning Decision Trees with Misclassification Costs. ECML-98. compressed postscript and long version in compressed postscript
  42. Ron Kohavi, Dan Sommerfield, and James Dougherty, Data Mining using MLC++, a Machine Learning Library in C++. International Journal of Artificial Intelligence Tools, Vol. 6, No. 4, 1997, p. 537-566. This is a longer version of the TAI'96 paper that received the IEEE Tools With Artificial Intelligence Best Paper Award. compressed postscript (283K) or acrobat (PDF)
  43. Barry Becker, Ron Kohavi, Dan Sommerfield, Visualizing the Simple Bayesian Classifier. Appears in the KDD 1997 Workshop on Issues in the Integration of Data Mining and Data Visualization. Lecture Notes in Computer Science by Springer Verlag. compressed postscript (358K).
  44. Cliff Brunk, James Kelly, and Ron Kohavi, MineSet: An Integrated System for Data Mining. Appears in the Third International Conference on Knowledge Discovery and Data Mining, 1997. compressed postscript (276K).
  45. Ron Kohavi and Clayton Kunz, Option Decision Trees with Majority Votes. Appears in the International Conference on Machine Learning 1997. postscript (308K).
  46. Ron Kohavi and George John, Wrappers for Feature Subset Selection. In Artificial Intelligence journal, special issue on relevance, Vol. 97, Nos 1-2, pp. 273-324. NEC's ResearchIndex lists it as one of the top referenced papers in Machine Learning. PDF, postscript
  47. Ron Kohavi, Barry Becker, and Dan Sommerfield, Improving Simple Bayes. ECML-97 (poster). compressed postscript.
  48. Ron Kohavi, Pat Langley, Yeogirl Yun, The Utility of Feature Weighting in Nearest-Neighbor Algorithms. ECML-97 (poster). compressed postscript.
  49. Ron Kohavi, MLC++ Developments: Data Mining using MLC++. AAAI Fall Symposium on Learning Complex Behaviors in Adaptive Intelligent Systems, Nov 1996. compressed postscript slides.
  50. Ron Kohavi, Dan Sommerfield, and James Dougherty, Data Mining using MLC++, a Machine Learning Library in C++. TAI 96. The paper received the IEEE Tools With Artificial Intelligence Best Paper Award, 1996. NEC's ResearchIndex lists it as one of the top referenced papers in Machine Learning. Compressed postscript (245K) or uncompressed postscript (3.3MB)
  51. Ron Kohavi and Mehran Sahami, Error-Based and Entropy-Based Discretization of Continuous Features. KDD-96. postscript (165K)
  52. Ron Kohavi, Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. KDD-96. compressed postscript (108K) or slides.
  53. Ron Kohavi, Book Review: Empirical Methods in Artificial Intelligence by Paul Cohen. International Journal of Neural Systems (IJNS), Vol 7, No 2, May 1996, p. 219-221. postscript (50K). Note: final formatting in the journal was slightly different.
  54. Ron Kohavi and David Wolpert, Bias Plus Variance Decomposition for Zero-One Loss Functions. ML96. NEC's ResearchIndex lists it as one of the top referenced papers in Machine Learning. PDF or postscript (170K) or color slides for 2/7/96 talk (390K) (18 slides; ghostview doesn't work well on these, so use xpsview).
  55. Jerome Friedman, Ron Kohavi, and Yeogirl Yun, Lazy Decision Trees. AAAI-96, p. 717-724. postscript (145K) or slides.
  56. Ron Kohavi and Dan Sommerfield, Feature Subset Selection Using the Wrapper Model: Overfitting and Dynamic Search Space Topology. KDD-95. postscript (240K) or slides.
  57. Ron Kohavi and George John, Automatic Parameter Selection by Minimizing Estimated Error. ML-95. postscript (173K).
  58. James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised and unsupervised discretization of continuous features. ML-95. postscript (213K) or slides.
  59. Ron Kohavi, A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. IJCAI-95. postscript (305K), PDF, or slides.
  60. Ron Kohavi and Chia-Hsin Li, Oblivious Decision Trees, Graphs, and Top-Down Pruning. IJCAI-95 postscript (171K).
  61. Ron Kohavi, The Power of Decision Tables. In the European Conference on Machine Learning, 1995. postscript (168K) or slides with some new results on discretization.
  62. Ron Kohavi and Brian Frasca, Useful feature subsets and rough set reducts. In the International Workshop on Rough Sets and Soft Computing (RSSC), 1994. postscript version (161K).
  63. Ron Kohavi, A third dimension to rough sets. In the International Workshop on Rough Sets and Soft Computing (RSSC), 1994. postscript version (163K).
  64. Ron Kohavi, Feature Subset Selection as Search with Probabilistic Estimates. In the AAAI Fall Symposium on Relevance, 1994. postscript version (126K).
  65. Ron Kohavi, George John, Richard Long, David Manley, and Karl Pfleger, MLC++: A Machine Learning Library in C++. In Tools with Artificial Intelligence, 1994. postscript version (118K).
  66. Ron Kohavi, Bottom-up induction of oblivious, read-once decision graphs: Strengths and limitations. In Twelfth National Conference on Artificial Intelligence, 1994. postscript version (199K).
  67. George John, Ron Kohavi, and Karl Pfleger, Irrelevant features and the subset selection problem. In Machine Learning: Proceedings of the Eleventh International Conference, 1994. Morgan Kaufmann. postscript (224K) or slides.
  68. Ron Kohavi, Bottom-up induction of oblivious, read-once decision graphs. In Proceedings of the European Conference on Machine Learning, 1994. postscript version (211K).
  69. Ron Kohavi and Scott Benson, Research note on decision lists. Journal of Machine Learning. 13(1), 1993.
  70. Ron Kohavi and Yoav Shoham, Applications of datalog theories in AI. In AAAI-92 Workshop on Tractable Reasoning, p. 82-87.

ronnyk@ live dot com