Not my core area, but I need a landing page to refer to because this is coming up in conversations a lot at the moment.
Recommender systems are a weird area. There have been some high-profile examples of recommender models going back a long time (e.g. the Netflix Prize) and they are clearly commercially important. At time of writing, though, there is relatively little material introducing these systems to wide audiences, compared to, say, linear regression or image classifiers. Possibly this has something to do with who owns the data that makes these tools go; which is to say, there are relatively few open data sets for recommender systems?
I would like to refer to a simple and good introduction to the area, but I do not know one, so here is a scruffy word salad with some helpful links.
YouTube’s defence of their recommender system against allegations of feeding extremism introduces, in vague outline, a lot of the features that make it go.
Starting from an overview by Javier, we can find many approaches to more abstract recommender systems:
- Most Popular recommendations (the baseline)
- Item-User similarity based recommendations
- kNN Collaborative Filtering recommendations
- Gradient Boosting Machine based recommendations
- Non-Negative Matrix Factorization recommendations
- Factorization Machines (Rendle 2010)
- Field Aware Factorization Machines (Juan et al. 2016)
- Deep Learning based recommendations (Wide and Deep, Cheng et al. (2016))
- Neural Collaborative Filtering (He et al. 2017)
- Neural Graph Collaborative Filtering (Xiang Wang et al. 2019)
- Variational Autoencoders for Collaborative Filtering (Liang et al. 2018)
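To make the first few entries in that list concrete, here is a minimal sketch of the Most Popular baseline and an item-based kNN collaborative filter. It is not taken from any of the linked sources; the toy interaction matrix `R` and the function names `most_popular` and `item_knn` are made up for illustration, and the kNN variant here uses all items as neighbours, weighted by cosine similarity, rather than truncating to the nearest few.

```python
import numpy as np

# Hypothetical toy interaction matrix: rows are users, columns are items,
# and 1.0 means the user interacted with the item.
R = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
], dtype=float)


def most_popular(R, user, k=2):
    """Recommend the k globally most popular items the user has not seen."""
    scores = R.sum(axis=0)            # interaction count per item
    scores[R[user] > 0] = -np.inf     # never recommend already-seen items
    return np.argsort(-scores, kind="stable")[:k]


def item_knn(R, user, k=2):
    """Score unseen items by cosine similarity to the items the user has seen."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0           # avoid dividing by zero for unused items
    sim = (R.T @ R) / np.outer(norms, norms)   # item-item cosine similarity
    np.fill_diagonal(sim, 0.0)        # an item should not vote for itself
    scores = sim @ R[user]            # sum of similarities to seen items
    scores[R[user] > 0] = -np.inf
    return np.argsort(-scores, kind="stable")[:k]


print(most_popular(R, user=1))
print(item_knn(R, user=1))
```

On a matrix this small the two methods mostly agree; the point is only that the baseline ignores who the user is, while the collaborative filter scores items by their co-occurrence with that user's history.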
A different take is in Microsoft’s Best Practices on Recommendation Systems, which breaks things down more along the lines of packages and implementations; probably what you want if you are an implementer not a researcher.
This area is moving fast, especially recently. I have the suspicion that the neural network methods are speciating at the moment.
Ben Lorica in Questioning the Efficacy of Neural Recommendation Systems interviews Paolo Cremonesi and Maurizio Ferrari Dacrema, who are the authors of some recent review papers (Dacrema, Cremonesi, and Jannach 2019; Dacrema et al. 2021).
I am most familiar with the matrix factorization approaches (Koren, Bell, and Volinsky 2009; Yu et al. 2014, 2012; Lim and Teh 2007), especially NNMF, but there are many others; e.g. variational autoencoder approaches are en vogue.
Wide and Deep (Cheng et al. 2016) feels overcomplicated, but at least it is documented, and comes with an implementation: jrzaurin/pytorch-widedeep: A flexible package to combine tabular data with text and images using Wide and Deep models in Pytorch. Note that the author of that package advises using gradient boosting machines to get this job done.
Cremonesi and Dacrema maintain reproduction code for several popular algorithms, which is probably pedagogically useful.
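As a rough illustration of the matrix factorization idea, here is a small sketch of non-negative matrix factorization fitted only to the observed ratings, using masked multiplicative updates. This is one common variant, not the specific algorithm of any paper cited above; the ratings matrix, the `masked_nmf` and `observed_loss` names, and the convention that 0 means "unrated" are all assumptions made for the example.

```python
import numpy as np


def masked_nmf(R, mask, rank=2, n_iter=200, eps=1e-9, seed=0):
    """Factor R ~ W @ H with W, H >= 0, fitting only entries where mask == 1."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.random((n_users, rank)) + 0.1   # user factors, strictly positive
    H = rng.random((rank, n_items)) + 0.1   # item factors, strictly positive
    MR = mask * R
    for _ in range(n_iter):
        # Multiplicative updates keep the factors non-negative.
        W *= (MR @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ MR) / (W.T @ (mask * (W @ H)) + eps)
    return W, H


def observed_loss(R, mask, W, H):
    """Squared reconstruction error over the observed entries only."""
    return float(np.sum(mask * (R - W @ H) ** 2))


# Hypothetical toy ratings; 0 is treated as "not rated", not as a rating of zero.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
mask = (R > 0).astype(float)

W, H = masked_nmf(R, mask)
predictions = W @ H   # dense score matrix, including the unrated entries
print(observed_loss(R, mask, W, H))
```

The unrated cells of `predictions` are the recommendations: for each user, rank the items they have not rated by predicted score. A real system would add regularization and hold out some ratings for evaluation.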
Microsoft Recommenders, already mentioned, includes basic wrappers for a lot of different tools, and also includes worked examples and deployment advice. Maybe start there.
Vowpal Wabbit implements some classic recommendation systems extremely efficiently, but the manual has historically been famously abstruse. Now it is merely gruff. There are, I think(?), two tutorials in the documentation:
For seduction and addiction of humans
Popular choice. See clickbait bandits.