# Great Papers In Statistics and Machine Learning

This is a list of research papers that I found to be interesting.

Most of these papers had a great impact, or discuss concepts that did, in statistics and machine learning. Many of them are regarded as major historical advances in their respective fields.

The citation count for each entry is only a rough guide, since it was accurate only when I first added the entry. Over time, I might add a few brief thoughts and comments on individual papers.

I hope you have as much fun as I did exploring them!

## Statistics

### Classic Papers

- Statistical Modeling: The Two Cultures - Leo Breiman - 2001 - 4,049 citations
- Computer-Intensive Methods in Statistics - Persi Diaconis and Bradley Efron - 1983 - 1,581 citations
- Stein's Paradox in Statistics - Bradley Efron and Carl Morris - 1977 - 823 citations

### Others

- Surprised by the Gambler's and Hot Hand Fallacies? - Joshua Miller and Adam Sanjurjo - 2015 - 151 citations
- To Explain or to Predict? - Galit Shmueli - 2010 - 2,322 citations

### Bootstrapping

- The Jackknife, A Review - Rupert Miller - 1974 - 2,352 citations
- Bootstrap Methods: Another Look at the Jackknife - Bradley Efron - 1979 - 22,433 citations
- The Bayesian Bootstrap - Donald Rubin - 1981 - 1,125 citations

## Machine Learning

### No Free Lunch Theorems

- No Free Lunch Theorems for Optimization - David Wolpert and William Macready - 1997 - 11,162 citations
- Coevolutionary Free Lunches - David Wolpert and William Macready - 2005 - 261 citations
- What is important about the No Free Lunch theorems? - David Wolpert - 2020 - 5 citations
- A Conservation Law for Generalization Performance - Cullen Schaffer - 1994 - 519 citations
- The Lack of A Priori Distinctions Between Learning Algorithms - David Wolpert - 1996 - 1,767 citations

### DBSCAN

- A Density-Based Algorithm for Discovering Clusters - Ester, Kriegel, Sander and Xu - 1996 - 23,405 citations
- Why and How You Should (Still) Use DBSCAN - Schubert, Ester, Kriegel, Sander and Xu - 2017 - 890 citations

### Naive Bayes Classifier

- The Optimality of Naive Bayes - Harry Zhang - 2004 - 2,180 citations
- Tackling the Poor Assumptions of Naive Bayes Text Classifiers - Rennie et al - 2003 - 1,445 citations
- Estimating Continuous Distributions in Bayesian Classifiers - John and Langley - 1995 - 4,283 citations

### Bias-Variance Trade-off

- Reconciling modern machine-learning practice and the classical bias-variance trade-off - Belkin, Hsu, Ma, and Mandal - 2019 - 699 citations
- A Modern Take on the Bias-Variance Tradeoff in Neural Networks - Brady Neal et al - 2018 - 85 citations
- Neural Networks and the Bias/Variance Dilemma - Geman, Bienenstock and Doursat - 1992 - 4,495 citations

### Support Vector Machines

- A training algorithm for optimal margin classifiers - Boser, Guyon and Vapnik - 1992 - 14,145 citations
- Support-vector networks - Corinna Cortes and Vladimir Vapnik - 1995 - 51,803 citations

### Word2vec

- Word2vec Explained - Yoav Goldberg and Omer Levy - 2014 - 1,538 citations
- Efficient Estimation of Word Representations in Vector Space - Mikolov et al - 2013 - 26,104 citations

### Neural Networks

- Who Invented the Reverse Mode of Differentiation? - Andreas Griewank - 2012 - 81 citations
- Automatic Differentiation in Machine Learning: a Survey - Baydin et al - 2015 - 1,339 citations
- Learning Representations by Back-Propagating Errors - Rumelhart et al - 1986 - 27,188 citations

### Tree Algorithms, Bagging, Boosting

- Fifty Years of Classification and Regression Trees - Wei-Yin Loh - 2014 - 493 citations
- Bagging Predictors - Leo Breiman - 1996 - 29,038 citations
- Random Forests - Leo Breiman - 2001 - 84,549 citations
- A Decision-Theoretic Generalization of On-Line Learning - Freund and Schapire - 1997 - 23,098 citations
- A Short Introduction to Boosting - Freund and Schapire - 1999 - 4,273 citations
- XGBoost: A Scalable Tree Boosting System - Chen and Guestrin - 2016 - 15,135 citations

### Others

- Very Simple Classification Rules Perform Well on Most Commonly Used Datasets - Robert Holte - 1993 - 2,538 citations
- Top 10 Algorithms in Data Mining - X Wu et al - 2008 - 5,982 citations
- MapReduce: Simplified Data Processing on Large Clusters - Dean and Ghemawat - 2008 - 22,857 citations
- A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection - Ron Kohavi - 1995 - 14,423 citations