# Noise contrastive estimation

Also “negative sampling”.

April 22, 2020 — June 6, 2023

Tags: approximation, Bayes, Bregman, feature construction, likelihood free, machine learning, measure, metrics, probability, statistics, time series

Q: How does NCE work for continuous variates?

Q: How does NCE relate to likelihood ratio estimation?

Q: Is *this* an “energy based method”?

Q: Is *this* “adversarial”?

Q: What is the relation to Bregman divergence? (M. U. Gutmann and Hirayama 2011)
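On the likelihood-ratio question, there is a crisp partial answer: the Bayes-optimal classifier in the data-versus-noise game has logit $\log p_{\text{data}}(x) - \log p_{\text{noise}}(x)$, so fitting a probabilistic classifier implicitly estimates the density ratio. A toy demonstration, assuming Gaussian data and noise and a hand-rolled logistic regression (all names and settings here are illustrative, not from any paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Data ~ N(1, 1), noise ~ N(0, 1). The true log density ratio is
# log p_data(x) - log p_noise(x) = x - 0.5, so the Bayes-optimal
# classifier logit is linear with slope 1 and intercept -0.5.
n = 10_000
x = np.concatenate([rng.normal(1.0, 1.0, n), rng.normal(0.0, 1.0, n)])
y = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = data, 0 = noise

# One-dimensional logistic regression by full-batch gradient descent.
w, b = 0.0, 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted P(data | x)
    w -= 0.5 * np.mean((p - y) * x)
    b -= 0.5 * np.mean(p - y)

# The learned logit w*x + b approximates the log density ratio x - 0.5.
```

Up to sampling noise, `w` lands near 1 and `b` near −0.5, i.e. the classifier has recovered the log likelihood ratio without ever evaluating either density.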

## 1 Recommended tutorials

## 2 LocalNCE / Binary NCE
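As a placeholder until this section is fleshed out: binary NCE (Gutmann and Hyvärinen 2010) turns estimation of an unnormalized model into logistic classification of data samples against noise samples. A minimal numpy sketch of the objective (the function name and interface are mine, not a library API):

```python
import numpy as np

def binary_nce_loss(log_model, log_noise, labels, nu=1.0):
    """Binary NCE logistic loss (Gutmann and Hyvärinen 2010).

    log_model: unnormalized model log-density at each sample
    log_noise: noise log-density at each sample
    labels:    1 for samples drawn from the data, 0 for noise samples
    nu:        ratio of noise samples to data samples
    """
    # Posterior log-odds that a sample came from the data rather than the noise.
    logits = log_model - log_noise - np.log(nu)
    # -log sigmoid(l) = log1p(exp(-l)); -log sigmoid(-l) = log1p(exp(l)).
    return float(np.mean(labels * np.log1p(np.exp(-logits))
                         + (1 - labels) * np.log1p(np.exp(logits))))
```

At the optimum the logits recover $\log p_{\text{data}} - \log p_{\text{noise}}$, which is why NCE can fit unnormalized models: the partition function is absorbed into the classifier and never computed explicitly.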

## 3 InfoNCE
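A stub sketch of the InfoNCE objective (Oord, Li, and Vinyals 2019), written here as row-wise cross-entropy over a batch score matrix with in-batch negatives; this is my own minimal numpy rendering, not any reference implementation:

```python
import numpy as np

def info_nce_loss(scores):
    """InfoNCE loss (Oord, Li, and Vinyals 2019) for a batch of B pairs.

    scores[i, j] is a learned similarity between query i and candidate j;
    the diagonal holds the positive pairs and the rest of each row acts
    as in-batch negatives. The loss is the row-wise softmax cross-entropy
    with the diagonal as the target class.
    """
    m = scores.max(axis=1, keepdims=True)  # shift for numerical stability
    log_z = m.squeeze(1) + np.log(np.exp(scores - m).sum(axis=1))
    return float(np.mean(log_z - np.diag(scores)))
```

The loss is bounded below by $\log B$ minus the mutual information it estimates, which is why larger batches (more negatives) tighten the bound in contrastive predictive coding.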

## 4 Use in ranking

TBC

## 5 References

Balestriero, Ibrahim, Sobal, et al. 2023. “A Cookbook of Self-Supervised Learning.”

Chehab, Gramfort, and Hyvarinen. 2022. “The Optimal Noise in Noise-Contrastive Learning Is Not What You Think.” *arXiv:2203.01110 [cs, stat]*.

Gutmann, Michael U., and Hirayama. 2011. “Bregman Divergence as General Framework to Estimate Unnormalized Statistical Models.” In *Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence*. UAI’11.

Gutmann, Michael, and Hyvärinen. 2010. “Noise-Contrastive Estimation: A New Estimation Principle for Unnormalized Statistical Models.” In *Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics*.

Gutmann, Michael U., and Hyvärinen. 2012. “Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics.” *Journal of Machine Learning Research*.

Hinton. 2002. “Training Products of Experts by Minimizing Contrastive Divergence.” *Neural Computation*.

Le-Khac, Healy, and Smeaton. 2020. “Contrastive Representation Learning: A Framework and Review.” *IEEE Access*.

Liu, Rosenfeld, Ravikumar, et al. 2021. “Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation.”

Ma, and Collins. 2018. “Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency.” *arXiv:1809.01812 [cs, stat]*.

Miller, Weniger, and Forré. 2022. “Contrastive Neural Ratio Estimation.”

Mnih, and Teh. 2012. “A Fast and Simple Algorithm for Training Neural Probabilistic Language Models.” In *Proceedings of the 29th International Conference on Machine Learning*. ICML’12.

Oord, Li, and Vinyals. 2019. “Representation Learning with Contrastive Predictive Coding.”

Saunshi, Ash, Goel, et al. 2022. “Understanding Contrastive Learning Requires Incorporating Inductive Biases.” *arXiv:2202.14037 [cs]*.

Smith, and Eisner. 2005. “Contrastive Estimation: Training Log-Linear Models on Unlabeled Data.” In *Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL ’05*.

Zhang, Zhu, Song, et al. 2022. “COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning.” In *Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining*.

Zhu, Sun, and Koniusz. 2022. “Contrastive Laplacian Eigenmaps.”