Great Papers in Statistics & Machine Learning
This is a list of statistics and machine learning research papers that I found interesting or significant.
Most of these papers had a great impact, or discuss concepts that had a great impact, on statistics and machine learning. In fact, many of them are regarded as major historical advances in their respective fields.
The citation count for each entry is only a rough guide, since it was correct only when I first added the entry. I may add brief thoughts and comments on individual papers over time.
I hope you have as much fun as I did exploring them!
Statistics
Classic Papers
-
Statistical Modeling: The Two Cultures - Leo Breiman - 2001 - 4,049 citations
-
Computer-Intensive Methods in Statistics - Persi Diaconis and Bradley Efron - 1983 - 1,581 citations
-
Stein’s Paradox in Statistics - Bradley Efron and Carl Morris - 1977 - 823 citations
Others
-
Surprised by the Gambler’s and Hot Hand Fallacies? - Joshua Miller and Adam Sanjurjo - 2015 - 151 citations
-
To Explain or to Predict? - Galit Shmueli - 2010 - 2,322 citations
Bootstrapping
-
The Jackknife, A Review - Rupert Miller - 1974 - 2,352 citations
-
Bootstrap Methods: Another Look at the Jackknife - Bradley Efron - 1979 - 22,433 citations
-
The Bayesian Bootstrap - Donald Rubin - 1981 - 1,125 citations
Machine Learning
No Free Lunch Theorems
-
No Free Lunch Theorems for Optimization - David Wolpert and William Macready - 1997 - 11,162 citations
-
Coevolutionary Free Lunches - David Wolpert and William Macready - 2005 - 261 citations
-
What is important about the No Free Lunch theorems? - David Wolpert - 2020 - 5 citations
-
A Conservation Law for Generalization Performance - Cullen Schaffer - 1994 - 519 citations
-
The Lack of A Priori Distinctions Between Learning Algorithms - David Wolpert - 1996 - 1,767 citations
-
No Free Lunch Theorem: A Review - Adam et al - 2019 - 222 citations
DBSCAN
-
A Density-Based Algorithm for Discovering Clusters - Ester, Kriegel, Sander and Xu - 1996 - 23,405 citations
-
Why and How You Should (Still) Use DBSCAN - Schubert, Sander, Ester, Kriegel and Xu - 2017 - 890 citations
Naive Bayes Classifier
-
The Optimality of Naive Bayes - Harry Zhang - 2004 - 2,180 citations
-
Tackling the Poor Assumptions of Naive Bayes Text Classifiers - Rennie et al - 2003 - 1,445 citations
-
Estimating Continuous Distributions in Bayesian Classifiers - John and Langley - 1995 - 4,629 citations
Bias-Variance Trade-off
-
Reconciling modern machine-learning practice and the classical bias-variance trade-off - Belkin et al - 2019
-
A Modern Take on the Bias-Variance Tradeoff in Neural Networks - Brady Neal et al - 2018 - 85 citations
-
Neural Networks and the Bias/Variance Dilemma - Geman, Bienenstock and Doursat - 1992 - 4,495 citations
Support Vector Machines
-
A training algorithm for optimal margin classifiers - Boser, Guyon and Vapnik - 1992 - 14,145 citations
-
Support-vector networks - Corinna Cortes and Vladimir Vapnik - 1995 - 51,803 citations
Word2vec
-
Word2vec Explained - Yoav Goldberg and Omer Levy - 2014 - 1,538 citations
-
Efficient Estimation of Word Representations in Vector Space - Mikolov et al - 2013 - 26,104 citations
Neural Networks
-
Who Invented the Reverse Mode of Differentiation? - Andreas Griewank - 2012 - 81 citations
-
Automatic Differentiation in Machine Learning: a Survey - Baydin et al - 2015 - 1,339 citations
-
Learning Representations by Back-Propagating Errors - Rumelhart et al - 1986 - 27,188 citations
Tree Algorithms, Bagging, Boosting
-
Fifty Years of Classification and Regression Trees - Wei-Yin Loh - 2014 - 493 citations
-
Bagging Predictors - Leo Breiman - 1996 - 29,038 citations
-
Random Forests - Leo Breiman - 2001 - 84,549 citations
-
A Decision-Theoretic Generalization of On-Line Learning - Freund and Schapire - 1997 - 23,098 citations
-
A Short Introduction to Boosting - Freund and Schapire - 1999 - 4,273 citations
-
XGBoost: A Scalable Tree Boosting System - Chen and Guestrin - 2016 - 15,135 citations
Others
-
Very Simple Classification Rules Perform Well on Most Commonly Used Datasets - Robert Holte - 1993 - 2,538 citations
-
Top 10 Algorithms in Data Mining - X Wu et al - 2008 - 5,982 citations
-
MapReduce: Simplified Data Processing on Large Clusters - Dean and Ghemawat - 2008 - 22,857 citations
-
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection - Ron Kohavi - 1995 - 14,423 citations