Multi-task ML



On training a model which predicts several things at once.

This is a very ML way of phrasing things. In classical statistics, if we fit a multivariate regression by a likelihood-based procedure, it produces multivariate output as a matter of course. No problem. In machine learning, however, we frequently fit against a univariate predictive loss, and it is not obvious, or at least not affordable, how to extend such univariate predictions to multivariate ones without starting over and simply training lots of univariate prediction models. In that context, it is not foolish to ask about multivariate predictions and to treat the development of a “multi-task model” as some kind of new thing (Caruana 1998).
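To make the contrast with “lots of univariate models” concrete, here is a minimal sketch of the usual multi-task trick of hard parameter sharing (in the spirit of Caruana 1998): two regression tasks trained jointly through one shared representation, with a separate linear head per task and a summed loss. Everything here (architecture, step sizes, data) is illustrative, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two related regression tasks driven by the same inputs.
X = rng.normal(size=(200, 3))
y1 = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
y2 = X @ np.array([1.2, -1.8, 0.3]) + 0.1 * rng.normal(size=200)

# Shared trunk plus one linear head per task (all names illustrative).
d_h = 8
W = rng.normal(size=(3, d_h)) * 0.5  # shared representation weights
h1 = rng.normal(size=d_h) * 0.1      # head for task 1
h2 = rng.normal(size=d_h) * 0.1      # head for task 2

def task_losses():
    H = np.tanh(X @ W)
    return np.mean((H @ h1 - y1) ** 2), np.mean((H @ h2 - y2) ** 2)

init1, init2 = task_losses()
lr = 0.01
for _ in range(2000):
    H = np.tanh(X @ W)               # shared representation
    r1, r2 = H @ h1 - y1, H @ h2 - y2
    # Gradients of the *summed* MSE objective, by hand.
    g_h1 = 2 * H.T @ r1 / len(y1)
    g_h2 = 2 * H.T @ r2 / len(y2)
    dH = 2 * (np.outer(r1, h1) + np.outer(r2, h2)) / len(y1)
    g_W = X.T @ (dH * (1 - H ** 2))  # backprop through tanh
    W -= lr * g_W
    h1 -= lr * g_h1
    h2 -= lr * g_h2

l1, l2 = task_losses()
```

The point is that the trunk `W` receives gradients from both tasks at once, so the tasks regularise each other's representation, which is exactly what training two separate univariate models forgoes.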

Multiple objectives are dangerous

Optimising for multiple objectives with unknown relative weights at once can be difficult. In the hyperparameter context, Jonas Degrave and Ira Korshunova describe a solution, building on Platt and Barr (1988), in How we can make machine learning algorithms tunable.
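The core idea there can be sketched as constrained optimisation: instead of guessing a fixed weighting between losses, minimise one loss subject to a bound on the other, doing gradient descent on the parameters and gradient ascent on a Lagrange multiplier (the basic differential method of multipliers of Platt and Barr 1988; this sketch omits the damping term of their modified variant). The toy objectives, starting point, and step size below are invented for illustration.

```python
# Toy bi-objective problem: minimise L1 subject to L2 <= eps,
# rather than minimising L1 + w * L2 for some guessed weight w.
def L1(x):
    return (x - 1.0) ** 2

def L2(x):
    return (x + 1.0) ** 2

eps = 1.0          # constraint level: require L2(x) <= eps
x, lam = 2.0, 0.0  # primal variable and Lagrange multiplier
lr = 0.01
for _ in range(5000):
    # Gradient of the Lagrangian L1(x) + lam * (L2(x) - eps) in x.
    g_x = 2 * (x - 1.0) + lam * 2 * (x + 1.0)
    x -= lr * g_x                  # descent on the primal variable
    lam += lr * (L2(x) - eps)      # ascent on the multiplier
    lam = max(lam, 0.0)            # KKT: multiplier stays nonnegative
```

Here the unconstrained minimiser of L1 is x = 1, which violates the constraint, so the multiplier grows until the iterate settles at the constrained optimum x = 0 with the constraint active. The multiplier, rather than the practitioner, discovers the right trade-off weight.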

Multi-task GPs

It is fairly natural to extend a Gaussian process to multivariate outputs; see Vector GP regression.
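For a flavour of how this works, here is a sketch of an intrinsic coregionalisation model (in the spirit of Bonilla, Chai, and Williams 2007): the joint covariance over all (task, input) pairs is a Kronecker product of a task covariance matrix B with an input kernel, so observations of one output inform predictions of another. For simplicity B is assumed known here, which in practice it is not, and hyperparameters are fixed by hand.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

# Two correlated outputs observed at shared inputs.
X = np.linspace(0, 5, 20)
f = np.sin(X)
Y = np.stack([f + 0.05 * rng.normal(size=20),          # task 1
              0.8 * f + 0.05 * rng.normal(size=20)])   # task 2

# Intrinsic coregionalisation: joint covariance = kron(B, k(X, X)),
# with B the 2x2 task covariance (assumed known for this sketch).
B = np.array([[1.0, 0.8],
              [0.8, 0.64]])
K = np.kron(B, rbf(X, X)) + 0.05 ** 2 * np.eye(40)  # plus noise variance

y = Y.reshape(-1)              # stack task outputs task-major
alpha = np.linalg.solve(K, y)

# Posterior mean for task 2 at new inputs; the cross-covariance block
# borrows strength from task 1's observations via B.
Xs = np.linspace(0, 5, 50)
Ks = np.kron(B[1], rbf(Xs, X))  # (50, 40) cross-covariance for task 2
mean2 = Ks @ alpha
```

The joint solve is O((nT)^3) in the number of inputs n and tasks T, which is one reason the structured approximations in the multi-task GP literature exist at all.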

References

Bonilla, Edwin V., Kian Ming A. Chai, and Christopher K. I. Williams. 2007. “Multi-Task Gaussian Process Prediction.” In Proceedings of the 20th International Conference on Neural Information Processing Systems, 153–60. NIPS’07. USA: Curran Associates Inc. http://dl.acm.org/citation.cfm?id=2981562.2981582.
Caruana, Rich. 1998. “Multitask Learning.” In Learning to Learn, 95–133. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5529-2_5.
Dai, Ran, and Rina Foygel Barber. 2016. “The Knockoff Filter for FDR Control in Group-Sparse and Multitask Regression.” https://arxiv.org/abs/1602.03589.
Evgeniou, Theodoros, Charles A. Micchelli, and Massimiliano Pontil. 2005. “Learning Multiple Tasks with Kernel Methods.” Journal of Machine Learning Research 6: 615–37. http://www.jmlr.org/papers/v6/evgeniou05a.html.
Evgeniou, Theodoros, and Massimiliano Pontil. 2004. “Regularized Multi–Task Learning.” In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 109–17. KDD ’04. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1014052.1014067.
Kong, Yuqing. 2019. “Dominantly Truthful Multi-Task Peer Prediction with a Constant Number of Tasks.” November 1, 2019. http://arxiv.org/abs/1911.00272.
Moreno-Muñoz, Pablo, Antonio Artés-Rodríguez, and Mauricio A. Álvarez. 2019. “Continual Multi-Task Gaussian Processes.” October 31, 2019. http://arxiv.org/abs/1911.00002.
Osborne, M. A., S. J. Roberts, A. Rogers, S. D. Ramchurn, and N. R. Jennings. 2008. “Towards Real-Time Information Processing of Sensor Network Data Using Computationally Efficient Multi-Output Gaussian Processes.” In 2008 International Conference on Information Processing in Sensor Networks (IPSN 2008), 109–20. https://doi.org/10.1109/IPSN.2008.25.
Platt, John C., and Alan H. Barr. 1988. “Constrained Differential Optimization.”
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language Models Are Unsupervised Multitask Learners.”
Titsias, Michalis K., and Miguel Lázaro-Gredilla. 2011. “Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning.” In Advances in Neural Information Processing Systems 24, edited by J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger, 2339–47. Curran Associates, Inc. http://papers.nips.cc/paper/4305-spike-and-slab-variational-inference-for-multi-task-and-multiple-kernel-learning.
Williams, Christopher, Stefan Klanke, Sethu Vijayakumar, and Kian M. Chai. 2009. “Multi-Task Gaussian Process Learning of Robot Inverse Dynamics.” In Advances in Neural Information Processing Systems 21, edited by D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, 265–72. Curran Associates, Inc. http://papers.nips.cc/paper/3385-multi-task-gaussian-process-learning-of-robot-inverse-dynamics.pdf.
