Recommender systems

Not my area, but I need a landing page to refer to for some layperson contacts of mine.

Users who liked apostasy also enjoyed

Recommender systems are a weird area. There have been some high-profile examples of recommender models (e.g. the Netflix Prize), and they are clearly commercially important. At time of writing, though, there is relatively little material introducing them to wide audiences compared to, say, linear regression or image classifiers. I would like to refer to a simple and good introduction to the area, but I do not know one, so here is a scruffy word salad.

Starting from an overview by Javier, we can find many approaches:

  1. Most Popular recommendations (the baseline)
  2. Item-User similarity based recommendations
  3. kNN Collaborative Filtering recommendations
  4. Gradient Boosting Machine based recommendations
  5. Non-Negative Matrix Factorization recommendations
  6. Factorization Machines (Rendle 2010)
  7. Field Aware Factorization Machines (Juan et al. 2016)
  8. Deep Learning based recommendations (Wide and Deep, Cheng et al. (2016))
  9. Neural Collaborative Filtering (He et al. 2017)
  10. Neural Graph Collaborative Filtering (Xiang Wang et al. 2019)
  11. Variational Autoencoders for Collaborative Filtering (Liang et al. 2018)
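To make the first few entries in that list concrete, here is a minimal sketch of the most-popular baseline and of item-item cosine-similarity scoring, which is the core of item-based collaborative filtering. The toy matrix `R` and the function names are my own inventions for illustration, and a real kNN method would additionally keep only the nearest neighbours of each item rather than summing over all of them.

```python
import numpy as np

# Toy interaction matrix (made up): rows are users, columns are items,
# 1 means the user interacted with the item.
R = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
], dtype=float)


def most_popular(R, user, k=2):
    """Recommend the k most-interacted-with items the user has not seen."""
    scores = R.sum(axis=0)            # popularity of each item
    scores[R[user] > 0] = -np.inf     # never re-recommend seen items
    return np.argsort(-scores, kind="stable")[:k]


def item_similarity_scores(R, user, k=2):
    """Score unseen items by their cosine similarity to the user's items."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0           # avoid dividing by zero for unused items
    sim = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)        # an item should not vote for itself
    scores = sim @ R[user]            # sum similarities over the user's items
    scores[R[user] > 0] = -np.inf
    return np.argsort(-scores, kind="stable")[:k]


print(most_popular(R, user=0))
print(item_similarity_scores(R, user=0))
```

The two methods rank the unseen items differently for user 0, which is the whole point: popularity ignores who is asking, while the similarity scores depend on what that user already liked.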

This is moving fast. I have the suspicion that the neural network methods are speciating at the moment. Also, filtering based on monitoring the brains of subjects is not mentioned (Davis III, Spapé, and Ruotsalo 2021).

Quentin Bacquet curtly summarises some methods and their performance on a problem, and deeply introduces the VAE method via an implementation.

Ben Lorica in Questioning the Efficacy of Neural Recommendation Systems interviews Paolo Cremonesi and Maurizio Ferrari Dacrema, who are the authors of some recent review papers (Dacrema, Cremonesi, and Jannach 2019; Dacrema et al. 2021).

I am most familiar with the matrix factorization approaches (especially NNMF), but there are many others; e.g. variational autoencoder approaches are en vogue.
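For readers who have not met NNMF before, here is a rough sketch of the idea: approximate the ratings matrix by a product of two non-negative low-rank factors, fitting only the entries we actually observed. I am assuming the classic multiplicative-update rule with a mask for missing ratings; the toy ratings, the function names and the hyperparameters are all made up for illustration, and production libraries do this with far more care.

```python
import numpy as np


def masked_nmf(R, mask, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Factorise R ~ W @ H with W, H >= 0, fitting only entries where mask == 1."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.random((n_users, rank)) + 0.1   # user factors, strictly positive start
    H = rng.random((rank, n_items)) + 0.1   # item factors
    observed = mask * R
    for _ in range(n_iter):
        # Multiplicative updates keep the factors non-negative by construction.
        W *= (observed @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ observed) / (W.T @ (mask * (W @ H)) + eps)
    return W, H


def observed_rmse(R, mask, W, H):
    """Root-mean-square error over the observed entries only."""
    err = mask * (R - W @ H)
    return np.sqrt((err ** 2).sum() / mask.sum())


# Toy ratings (made up); 0 means "not rated".
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 1, 1],
    [1, 1, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
mask = (R > 0).astype(float)

W, H = masked_nmf(R, mask)
predictions = W @ H   # includes guesses for the unrated entries
print(np.round(predictions, 1))
print("RMSE on observed ratings:", observed_rmse(R, mask, W, H))
```

The entries of `predictions` at the positions where `R` is zero are the recommendations: they are the model's guesses at ratings nobody supplied.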

Cheng et al. (2016) feels overcomplicated, but at least it is documented, and it comes with an implementation: jrzaurin/pytorch-widedeep: A flexible package to combine tabular data with text and images using Wide and Deep models in Pytorch. Note that the author of that package advises using gradient boosting machines to get this job done.

Wu et al. (2021) taxonomises recommendation systems using Graph NNs, which is another natural idea. Their list goes a long way beyond Xiang Wang et al. (2019).


Cremonesi and Dacrema maintain reproduction code for several popular algorithms, which is probably pedagogically useful.

Microsoft Recommenders

This repository contains examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on five key tasks:

  • Prepare Data: Preparing and loading data for each recommender algorithm
  • Model: Building models using various classical and deep learning recommender algorithms such as Alternating Least Squares (ALS) or eXtreme Deep Factorization Machines (xDeepFM).
  • Evaluate: Evaluating algorithms with offline metrics
  • Model Select and Optimize: Tuning and optimizing hyperparameters for recommender models
  • Operationalize: Operationalizing models in a production environment on Azure
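Since "offline metrics" may be opaque to newcomers, here is a small sketch of two common ones, precision@k and recall@k, for a single user. This is not the repository's own evaluation code; the function name and the toy item labels are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and recall@k for one user.

    recommended: ranked list of item ids, best first.
    relevant: set of item ids the user actually liked in held-out data.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the model ranked four items, the user liked three.
recommended = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
precision, recall = precision_recall_at_k(recommended, relevant, k=3)
print(precision, recall)
```

In practice these are averaged over all users in a held-out test set, and ranking-aware metrics such as NDCG are usually reported alongside them.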

Vowpal Wabbit

Vowpal Wabbit can do recommendation extremely efficiently, but the manual has historically been famously abstruse. Now it is merely gruff. There are two tutorials in the documentation.

There is also one in vowpal_wabbit_deep_dive.ipynb in Microsoft Recommenders.
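Much of the gruffness is in the input format. As far as I understand it, Vowpal Wabbit reads plain-text lines of the form `label |namespace features`, and for matrix-factorization-style recommendation the user and the item go in separate namespaces so that they can be interacted (I believe with options along the lines of `--rank` and `-q ui`, but check those against the current documentation). A sketch of producing such lines, with a hypothetical helper name and made-up ratings:

```python
def to_vw_line(rating, user_id, item_id):
    """Format one (rating, user, item) triple as a Vowpal Wabbit text example.

    The namespace names `u` and `i` are a convention assumed here,
    not something the tool requires.
    """
    return f"{rating} |u {user_id} |i {item_id}"


# Made-up ratings: (rating, user id, item id).
ratings = [(5, 196, 242), (3, 186, 302)]
for rating, user_id, item_id in ratings:
    print(to_vw_line(rating, user_id, item_id))
```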


Abernethy, Jacob, Francis Bach, Theodoros Evgeniou, and Jean-Philippe Vert. 2009. “A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization.” Journal of Machine Learning Research 10: 803–26.
Chen, Dawei, Lexing Xie, Aditya Krishna Menon, and Cheng Soon Ong. 2017. “Structured Recommendation.” June 27, 2017.
Cheng, Heng-Tze, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, et al. 2016. “Wide & Deep Learning for Recommender Systems.” June 24, 2016.
Dacrema, Maurizio Ferrari, Simone Boglio, Paolo Cremonesi, and Dietmar Jannach. 2021. “A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research.” ACM Transactions on Information Systems 39 (2): 1–49.
Dacrema, Maurizio Ferrari, Paolo Cremonesi, and Dietmar Jannach. 2019. “Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches.” In Proceedings of the 13th ACM Conference on Recommender Systems, 101–9. Copenhagen Denmark: ACM.
Davis III, Keith M., Michiel Spapé, and Tuukka Ruotsalo. 2021. “Collaborative Filtering with Preferences Inferred from Brain Signals.” In Proceedings of the Web Conference 2021, 602–11. WWW ’21. New York, NY, USA: Association for Computing Machinery.
He, Xiangnan, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. “Neural Collaborative Filtering.” In Proceedings of the 26th International Conference on World Wide Web, 173–82. WWW ’17. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee.
Heckerman, David, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite, and Carl Kadie. 2000. “Dependency Networks for Inference, Collaborative Filtering, and Data Visualization.” Journal of Machine Learning Research 1: 49–75.
Juan, Yuchin, Yong Zhuang, Wei-Sheng Chin, and Chih-Jen Lin. 2016. “Field-Aware Factorization Machines for CTR Prediction.” In Proceedings of the 10th ACM Conference on Recommender Systems, 43–50. RecSys ’16. New York, NY, USA: Association for Computing Machinery.
Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009. “Matrix Factorization Techniques for Recommender Systems.” Computer 42 (8): 30–37.
Li, Lihong, Wei Chu, John Langford, and Xuanhui Wang. 2011. “Unbiased Offline Evaluation of Contextual-Bandit-Based News Article Recommendation Algorithms.” In Proceedings of the Fourth International Conference on Web Search and Web Data Mining (WSDM-11), 297–306.
Liang, Dawen, Rahul G. Krishnan, Matthew D. Hoffman, and Tony Jebara. 2018. “Variational Autoencoders for Collaborative Filtering.” In Proceedings of the 2018 World Wide Web Conference, 689–98. WWW ’18. Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee.
Rendle, Steffen. 2010. “Factorization Machines.” In 2010 IEEE International Conference on Data Mining, 995–1000.
Sedhain, Suvash, Aditya Krishna Menon, Scott Sanner, and Lexing Xie. 2015. “AutoRec: Autoencoders Meet Collaborative Filtering.” In Proceedings of the 24th International Conference on World Wide Web, 111–12. Florence Italy: ACM.
Sharma, Amit, Jake M. Hofman, and Duncan J. Watts. 2015. “Estimating the Causal Impact of Recommendation Systems from Observational Data.” Proceedings of the Sixteenth ACM Conference on Economics and Computation - EC ’15, 453–70.
Wang, Xiang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. “Neural Graph Collaborative Filtering.” In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 165–74. SIGIR’19. New York, NY, USA: Association for Computing Machinery.
Wang, Xinxi, and Ye Wang. 2014. “Improving Content-Based and Hybrid Music Recommendation Using Deep Learning.” In Proceedings of the 22nd ACM International Conference on Multimedia, 627–36. MM ’14. New York, NY, USA: ACM.
Wu, Shiwen, Fei Sun, Wentao Zhang, and Bin Cui. 2021. “Graph Neural Networks in Recommender Systems: A Survey.” April 19, 2021.
Xia, Min, Xianwen Yu, Xiaoning Zhang, and Yang Cao. 2019. “VAEGAN: A Collaborative Filtering Framework Based on Adversarial Variational Autoencoders,” 4206–12.
Yu, Hsiang-Fu, Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon. 2012. “Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems.” In IEEE International Conference of Data Mining, 765–74.
———. 2014. “Parallel Matrix Factorization for Recommender Systems.” Knowledge and Information Systems 41 (3): 793–819.
