# Noise contrastive estimation

Also “negative sampling”.

April 22, 2020 — June 6, 2023

Tags: approximation, Bayes, Bregman, feature construction, likelihood free, machine learning, measure, metrics, probability, statistics, time series

Q: How does NCE work for continuous variates?

Q: How does NCE relate to likelihood ratio estimation?

Q: Is *this* an “energy based method”?

Q: Is *this* “adversarial”?

Q: What is the relation to Bregman divergence? (M. U. Gutmann and Hirayama 2011)
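On the likelihood-ratio question, there is a crisp partial answer: the Bayes-optimal classifier in the data-versus-noise game has logit $\log p_{\text{data}}(x) - \log p_{\text{noise}}(x)$, so fitting a probabilistic classifier implicitly estimates the density ratio. A toy demonstration, assuming Gaussian data and noise and a hand-rolled logistic regression (all names and settings here are illustrative, not from any paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Data ~ N(1, 1), noise ~ N(0, 1). The true log density ratio is
# log p_data(x) - log p_noise(x) = x - 0.5, so the Bayes-optimal
# classifier logit is linear with slope 1 and intercept -0.5.
n = 10_000
x = np.concatenate([rng.normal(1.0, 1.0, n), rng.normal(0.0, 1.0, n)])
y = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = data, 0 = noise

# One-dimensional logistic regression by full-batch gradient descent.
w, b = 0.0, 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted P(data | x)
    w -= 0.5 * np.mean((p - y) * x)
    b -= 0.5 * np.mean(p - y)

# The learned logit w*x + b approximates the log density ratio x - 0.5.
```

Up to sampling noise, `w` lands near 1 and `b` near −0.5, i.e. the classifier has recovered the log likelihood ratio without ever evaluating either density.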

## 1 Recommended tutorials

## 2 LocalNCE / Binary NCE
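As a placeholder until this section is fleshed out: binary NCE (Gutmann and Hyvärinen 2010) turns estimation of an unnormalized model into logistic classification of data samples against noise samples. A minimal numpy sketch of the objective (the function name and interface are mine, not a library API):

```python
import numpy as np

def binary_nce_loss(log_model, log_noise, labels, nu=1.0):
    """Binary NCE logistic loss (Gutmann and Hyvärinen 2010).

    log_model: unnormalized model log-density at each sample
    log_noise: noise log-density at each sample
    labels:    1 for samples drawn from the data, 0 for noise samples
    nu:        ratio of noise samples to data samples
    """
    # Posterior log-odds that a sample came from the data rather than the noise.
    logits = log_model - log_noise - np.log(nu)
    # -log sigmoid(l) = log1p(exp(-l)); -log sigmoid(-l) = log1p(exp(l)).
    return float(np.mean(labels * np.log1p(np.exp(-logits))
                         + (1 - labels) * np.log1p(np.exp(logits))))
```

At the optimum the logits recover $\log p_{\text{data}} - \log p_{\text{noise}}$, which is why NCE can fit unnormalized models: the partition function is absorbed into the classifier and never computed explicitly.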

## 3 InfoNCE
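A stub sketch of the InfoNCE objective (Oord, Li, and Vinyals 2019), written here as row-wise cross-entropy over a batch score matrix with in-batch negatives; this is my own minimal numpy rendering, not any reference implementation:

```python
import numpy as np

def info_nce_loss(scores):
    """InfoNCE loss (Oord, Li, and Vinyals 2019) for a batch of B pairs.

    scores[i, j] is a learned similarity between query i and candidate j;
    the diagonal holds the positive pairs and the rest of each row acts
    as in-batch negatives. The loss is the row-wise softmax cross-entropy
    with the diagonal as the target class.
    """
    m = scores.max(axis=1, keepdims=True)  # shift for numerical stability
    log_z = m.squeeze(1) + np.log(np.exp(scores - m).sum(axis=1))
    return float(np.mean(log_z - np.diag(scores)))
```

The loss is bounded below by $\log B$ minus the mutual information it estimates, which is why larger batches (more negatives) tighten the bound in contrastive predictive coding.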

## 4 Use in ranking

TBC

## 5 References

Balestriero, Ibrahim, Sobal, et al. 2023. “A Cookbook of Self-Supervised Learning.”

Chehab, Gramfort, and Hyvarinen. 2022. “The Optimal Noise in Noise-Contrastive Learning Is Not What You Think.” *arXiv:2203.01110 [cs, stat]*.

Gutmann, Michael U., and Hirayama. 2011. “Bregman Divergence as General Framework to Estimate Unnormalized Statistical Models.” In *Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence*. UAI’11.

Gutmann, Michael, and Hyvärinen. 2010. “Noise-Contrastive Estimation: A New Estimation Principle for Unnormalized Statistical Models.” In *Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics*.

Gutmann, Michael U., and Hyvärinen. 2012. “Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics.” *Journal of Machine Learning Research*.

Hinton. 2002. “Training Products of Experts by Minimizing Contrastive Divergence.” *Neural Computation*.

Le-Khac, Healy, and Smeaton. 2020. “Contrastive Representation Learning: A Framework and Review.” *IEEE Access*.

Liu, Rosenfeld, Ravikumar, et al. 2021. “Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation.”

Ma, and Collins. 2018. “Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency.” *arXiv:1809.01812 [cs, stat]*.

Miller, Weniger, and Forré. 2022. “Contrastive Neural Ratio Estimation.”

Mnih, and Teh. 2012. “A Fast and Simple Algorithm for Training Neural Probabilistic Language Models.” In *Proceedings of the 29th International Conference on Machine Learning*. ICML’12.

Oord, Li, and Vinyals. 2019. “Representation Learning with Contrastive Predictive Coding.”

Saunshi, Ash, Goel, et al. 2022. “Understanding Contrastive Learning Requires Incorporating Inductive Biases.” *arXiv:2202.14037 [cs]*.

Smith, and Eisner. 2005. “Contrastive Estimation: Training Log-Linear Models on Unlabeled Data.” In *Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL ’05*.

Zhang, Zhu, Song, et al. 2022. “COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning.” In *Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining*.

Zhu, Sun, and Koniusz. 2022. “Contrastive Laplacian Eigenmaps.”