Noise contrastive estimation

Also “negative sampling”.

2020-04-22 — 2023-06-06

approximation

Bayes

Bregman

feature construction

likelihood free

machine learning

metrics

probability

statistics

time series

Figure 1: Not quite sure what to do with this incredible and no-longer-appropriate-for-promotions band photo, but wow, what a time capsule.

Q: How does NCE work for continuous variates?

Q: How does NCE relate to likelihood ratio estimation?

Q: Is this an “energy based method”?

Q: Is this adversarial?

Q: What is the relation to Bregman divergences? (M. U. Gutmann and Hirayama 2011)

1 Recommended tutorials

2 LocalNCE / Binary NCE

Review: Learning Word Embeddings Efficiently with Noise-Contrastive Estimation (NCE) | by Sik-Ho Tsang
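
To fix ideas, here is a minimal sketch of the binary NCE objective as I understand it from Gutmann and Hyvärinen (2010): train a logistic classifier to tell data from noise, where the logit is the log-ratio of the (possibly unnormalized) model to the known noise density, shifted by the log of the noise-to-data sample ratio. The function names, the toy Gaussian model, and the sample values are my own illustrative assumptions, not taken from any of the linked sources.

```python
import math


def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def binary_nce_loss(data, noise, log_model, log_noise):
    """Binary NCE loss (to be minimized) for one batch.

    log_model: log of the unnormalized model density, with any learned
               log-normalizer folded in as an ordinary parameter.
    log_noise: log of the known, normalized noise density.
    """
    nu = len(noise) / len(data)  # noise-to-data sample ratio

    def logit(x):
        # Log-odds that x came from the data rather than the noise.
        return log_model(x) - log_noise(x) - math.log(nu)

    # Data points should be classified as "data" ...
    data_term = sum(log_sigmoid(logit(x)) for x in data) / len(data)
    # ... and noise points as "noise"; log(1 - sigmoid(z)) = log_sigmoid(-z).
    noise_term = sum(log_sigmoid(-logit(y)) for y in noise) / len(data)
    return -(data_term + noise_term)


def log_std_normal(x):
    # Noise density for the toy example: standard normal.
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)


def make_log_model(precision, log_norm):
    # Hypothetical toy model: zero-mean Gaussian with a free log-normalizer.
    return lambda x: -0.5 * precision * x * x + log_norm


data = [0.1, -0.3, 1.2]      # made-up "observations"
noise = [0.5, -1.0, 2.0]     # made-up noise draws
model = make_log_model(1.0, -0.5 * math.log(2 * math.pi))
loss = binary_nce_loss(data, noise, model, log_std_normal)
```

In this toy call the model coincides with the noise density, so the classifier cannot do better than chance and the loss is 2 log 2. In practice one would minimize the loss over `precision` and `log_norm` with a gradient-based optimizer.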

3 InfoNCE

InfoNCE Explained | Papers With Code
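
A minimal sketch of the InfoNCE loss in the sense of Oord, Li, and Vinyals (2019), assuming we already have a square matrix of similarity scores in which entry (i, j) scores query i against key j and the positives sit on the diagonal. It amounts to a softmax cross-entropy where each row must pick out its own positive among the other keys. The `temperature` parameter and the example scores are illustrative assumptions.

```python
import math


def info_nce_loss(scores, temperature=1.0):
    """Mean InfoNCE loss for a square score matrix (list of lists).

    scores[i][i] is the positive pair for row i; every other entry in
    the row is treated as a negative.
    """
    total = 0.0
    for i, row in enumerate(scores):
        scaled = [s / temperature for s in row]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(scaled)
        log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
        # Negative log-probability of the positive under the softmax.
        total += log_z - scaled[i]
    return total / len(scores)


uninformative = [[0.0] * 4 for _ in range(4)]  # all pairs look alike
well_separated = [[10.0, 0.0], [0.0, 10.0]]    # positives clearly win

loss_chance = info_nce_loss(uninformative)
loss_good = info_nce_loss(well_separated)
```

With uninformative scores the loss sits at log N for N keys (here log 4), and it approaches zero as the positives separate from the negatives.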

4 Use in ranking

TBC

5 References

Balestriero, Ibrahim, Sobal, et al. 2023. “A Cookbook of Self-Supervised Learning.”

Chehab, Gramfort, and Hyvarinen. 2022. “The Optimal Noise in Noise-Contrastive Learning Is Not What You Think.” arXiv:2203.01110 [Cs, Stat].

Gutmann, Michael U., and Hirayama. 2011. “Bregman Divergence as General Framework to Estimate Unnormalized Statistical Models.” In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence. UAI’11.

Gutmann, Michael, and Hyvärinen. 2010. “Noise-Contrastive Estimation: A New Estimation Principle for Unnormalized Statistical Models.” In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics.

Gutmann, Michael U., and Hyvärinen. 2012. “Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics.” Journal of Machine Learning Research.

Hinton. 2002. “Training Products of Experts by Minimizing Contrastive Divergence.” Neural Computation.

Le-Khac, Healy, and Smeaton. 2020. “Contrastive Representation Learning: A Framework and Review.” IEEE Access.

Liu, Rosenfeld, Ravikumar, et al. 2021. “Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation.”

Ma, and Collins. 2018. “Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency.” arXiv:1809.01812 [Cs, Stat].

Miller, Weniger, and Forré. 2022. “Contrastive Neural Ratio Estimation.”

Mnih, and Teh. 2012. “A Fast and Simple Algorithm for Training Neural Probabilistic Language Models.” In Proceedings of the 29th International Conference on Machine Learning. ICML’12.

Oord, Li, and Vinyals. 2019. “Representation Learning with Contrastive Predictive Coding.”

Saunshi, Ash, Goel, et al. 2022. “Understanding Contrastive Learning Requires Incorporating Inductive Biases.” arXiv:2202.14037 [Cs].

Smith, and Eisner. 2005. “Contrastive Estimation: Training Log-Linear Models on Unlabeled Data.” In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL ’05.

Zhang, Zhu, Song, et al. 2022. “COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning.” In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.

Zhu, Sun, and Koniusz. 2022. “Contrastive Laplacian Eigenmaps.”