Great Papers in Statistics & Machine Learning
This is a list of statistics and machine learning research papers that I found to be interesting or significant.
Most of these papers either had a significant impact on statistics and machine learning or discuss concepts that did. Many of them are considered major historical advances in their respective fields.
The citation counts are only a rough guide, since each one was correct only when I first added the entry. I may add brief thoughts and comments on individual papers over time.
I hope you have as much fun as I did exploring them!
Statistics
Classic Papers
- Statistical Modeling: The Two Cultures - Leo Breiman - 2001 - 4,049 citations
- Computer-Intensive Methods in Statistics - Persi Diaconis and Bradley Efron - 1983 - 1,581 citations
- Stein’s Paradox in Statistics - Bradley Efron and Carl Morris - 1977 - 823 citations
Bootstrapping
- The Jackknife, A Review - Rupert Miller - 1974 - 2,352 citations
- Bootstrap Methods: Another Look at the Jackknife - Bradley Efron - 1979 - 22,433 citations
- The Bayesian Bootstrap - Donald Rubin - 1981 - 1,125 citations
Others
- Surprised by the Gambler’s and Hot Hand Fallacies? - Joshua Miller and Adam Sanjurjo - 2015 - 151 citations
- To Explain or to Predict? - Galit Shmueli - 2010 - 2,322 citations
Machine Learning
No Free Lunch Theorems
- No Free Lunch Theorems for Optimization - David Wolpert and William Macready - 1997 - 11,162 citations
- Coevolutionary Free Lunches - David Wolpert and William Macready - 2005 - 261 citations
- What is important about the No Free Lunch theorems? - David Wolpert - 2020 - 5 citations
- A Conservation Law for Generalization Performance - Cullen Schaffer - 1994 - 519 citations
- The Lack of A Priori Distinctions Between Learning Algorithms - David Wolpert - 1996 - 1,767 citations
- No Free Lunch Theorem: A Review - Adam et al. - 2019 - 222 citations
DBSCAN
- A Density-Based Algorithm for Discovering Clusters - Martin Ester et al. - 1996 - 23,405 citations
- Why and How You Should (Still) Use DBSCAN - Erich Schubert et al. - 2017 - 890 citations
Naive Bayes Classifier
- The Optimality of Naive Bayes - Harry Zhang - 2004 - 2,180 citations
- Tackling the Poor Assumptions of Naive Bayes Text Classifiers - Rennie et al. - 2003 - 1,445 citations
- Estimating Continuous Distributions in Bayesian Classifiers - John and Langley - 1995 - 4,629 citations
Bias-Variance Trade-off
- Reconciling modern machine-learning practice and the classical bias-variance trade-off - Mikhail Belkin et al. - 2019 - 2,750 citations
- A Modern Take on the Bias-Variance Tradeoff in Neural Networks - Brady Neal et al. - 2018 - 85 citations
- Neural Networks and the Bias/Variance Dilemma - Stuart Geman et al. - 1992 - 4,495 citations
Support Vector Machines
- A training algorithm for optimal margin classifiers - Boser, Guyon and Vapnik - 1992 - 14,145 citations
- Support-vector networks - Corinna Cortes and Vladimir Vapnik - 1995 - 51,803 citations
Word2vec
- Word2vec Explained - Yoav Goldberg and Omer Levy - 2014 - 1,538 citations
- Efficient Estimation of Word Representations in Vector Space - Mikolov et al. - 2013 - 26,104 citations
Neural Networks
- Who Invented the Reverse Mode of Differentiation? - Andreas Griewank - 2012 - 81 citations
- Automatic Differentiation in Machine Learning: a Survey - Baydin et al. - 2015 - 1,339 citations
- Learning Representations by Back-Propagating Errors - Rumelhart et al. - 1986 - 27,188 citations
Tree Algorithms, Bagging, Boosting
- Fifty Years of Classification and Regression Trees - Wei-Yin Loh - 2014 - 493 citations
- Bagging Predictors - Leo Breiman - 1996 - 29,038 citations
- Random Forests - Leo Breiman - 2001 - 84,549 citations
- A Decision-Theoretic Generalization of On-Line Learning - Freund and Schapire - 1997 - 23,098 citations
- A Short Introduction to Boosting - Freund and Schapire - 1999 - 4,273 citations
- XGBoost: A Scalable Tree Boosting System - Chen and Guestrin - 2016 - 15,135 citations
Others
- Very Simple Classification Rules Perform Well on Most Commonly Used Datasets - Robert Holte - 1993 - 2,538 citations
- Top 10 Algorithms in Data Mining - Wu et al. - 2008 - 8,075 citations
- MapReduce: Simplified Data Processing on Large Clusters - Dean and Ghemawat - 2008 - 22,857 citations
- A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection - Ron Kohavi - 1995 - 14,423 citations