Distributed statistical inference

October 11, 2016 — October 11, 2016

computers are awful
concurrency hell
premature optimization

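How do you design statistical procedures that can be run across many nodes? Many algorithms factorise nicely over nodes: each node summarises its own data, and only the summaries need to be communicated and combined. I might list some here.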
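As a minimal sketch of what "factorises nicely" can look like, here is the sample mean and variance computed from per-node summaries, using Welford's one-pass update locally and the pairwise merge of Chan, Golub and LeVeque to combine. The names `Moments`, `summarise` and `merge` and the toy `shards` are my own illustrative choices, not from any particular library.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Moments:
    """Mergeable summary of one node's data (hypothetical helper type)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    @property
    def variance(self) -> float:
        """Unbiased sample variance; needs at least two observations."""
        return self.m2 / (self.n - 1)


def summarise(xs: Iterable[float]) -> Moments:
    """Runs on each node: a single pass over local data (Welford's update)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return Moments(n, mean, m2)


def merge(a: Moments, b: Moments) -> Moments:
    """Combines two summaries without revisiting the raw data."""
    n = a.n + b.n
    if n == 0:
        return Moments()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


# Toy shards, one per (imaginary) node.
shards = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
pooled = reduce(merge, map(summarise, shards), Moments())
print(pooled.n, pooled.mean, pooled.variance)
```

Because `merge` is associative, the summaries can be combined in any order or in a tree, and each node ships three numbers rather than its data.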

If you wish to solve this with heterogeneous, untrustworthy or ad hoc nodes, as opposed to a nice orderly campus HPC cluster, then perhaps it would be better to think of this as swarm sensing.

Placeholder; I have nothing to say about this right now, although I should mention that message-passing algorithms based on variational inference and graphical models are one possible avenue. The most interesting to me is probably Gaussian belief propagation.
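To make the Gaussian belief propagation idea slightly more concrete, here is a rough sketch in information form, under the assumption that the joint density is Gaussian with a known symmetric precision matrix A and potential vector b, so that the mean solves Ax = b. The function name `gaussian_bp`, the synchronous update schedule and the toy chain are my own choices for illustration. On tree-structured graphs the marginals are exact; on graphs with loops the iteration may not converge, and when it does the variances are generally only approximate.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def gaussian_bp(
    precision: List[List[float]], potential: List[float], n_iters: int = 20
) -> Tuple[List[float], List[float]]:
    """Marginal means and variances of N(A^{-1} b, A^{-1}) by message passing.

    `precision` is A (symmetric) and `potential` is b. Node i only uses row i
    of A, entry i of b, and the messages sent to it by its neighbours.
    """
    n = len(potential)
    neighbours = [
        [j for j in range(n) if j != i and precision[i][j] != 0.0]
        for i in range(n)
    ]
    # Message i -> j in information form: (precision term, potential term).
    msgs: Dict[Edge, Tuple[float, float]] = {
        (i, j): (0.0, 0.0) for i in range(n) for j in neighbours[i]
    }
    for _ in range(n_iters):
        new_msgs: Dict[Edge, Tuple[float, float]] = {}
        for (i, j) in msgs:
            # Belief at i, leaving out what j itself told i.
            p = precision[i][i] + sum(msgs[k, i][0] for k in neighbours[i] if k != j)
            h = potential[i] + sum(msgs[k, i][1] for k in neighbours[i] if k != j)
            a_ij = precision[i][j]
            new_msgs[i, j] = (-a_ij * a_ij / p, -a_ij * h / p)
        msgs = new_msgs  # synchronous update: all nodes send at once
    means, variances = [], []
    for i in range(n):
        p = precision[i][i] + sum(msgs[k, i][0] for k in neighbours[i])
        h = potential[i] + sum(msgs[k, i][1] for k in neighbours[i])
        means.append(h / p)
        variances.append(1.0 / p)
    return means, variances


# Toy 3-node chain; each node would hold one row of A and one entry of b.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [0.0, 0.0, 4.0]
means, variances = gaussian_bp(A, b)
print(means, variances)
```

The appeal for distributed inference is that no node ever needs the whole matrix: each message is two scalars exchanged between neighbours.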

1 Tools



2 References

Acemoglu, Chernozhukov, and Yildiz. 2006. “Learning and Disagreement in an Uncertain World.” Working Paper 12648.
Battey, Fan, Liu, et al. 2015. “Distributed Estimation and Inference with Statistical Guarantees.” arXiv:1509.05457 [Math, Stat].
Bianchi, and Jakubowicz. 2013. “Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization.” IEEE Transactions on Automatic Control.
Bieniawski, and Wolpert. 2004. “Adaptive, Distributed Control of Constrained Multi-Agent Systems.” In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3.
Bottou, Curtis, and Nocedal. 2016. “Optimization Methods for Large-Scale Machine Learning.” arXiv:1606.04838 [Cs, Math, Stat].
Boyd. 2010. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers.
Calderhead. 2014. “A General Construction for Parallelizing Metropolis-Hastings Algorithms.” Proceedings of the National Academy of Sciences.
Christ, Kempa-Liehr, and Feindt. 2016. “Distributed and Parallel Time Series Feature Extraction for Industrial Big Data Applications.” arXiv:1610.07717 [Cs].
Gaines, and Zhou. 2016. “Algorithms for Fitting the Constrained Lasso.” arXiv:1611.01511 [Stat].
Heinze, McWilliams, Meinshausen, et al. 2014. “LOCO: Distributing Ridge Regression with Random Projections.” arXiv:1406.3469 [Stat].
Heinze, McWilliams, and Meinshausen. 2016. “DUAL-LOCO: Distributing Statistical Estimation Using Random Projections.” In.
Jaggi, Smith, Takac, et al. 2014. “Communication-Efficient Distributed Dual Coordinate Ascent.” In Advances in Neural Information Processing Systems 27.
Jain, Kakade, Kidambi, et al. 2016. “Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging.” arXiv:1610.03774 [Cs, Stat].
Krummenacher, McWilliams, Kilcher, et al. 2016. “Scalable Adaptive Stochastic Optimization Using Random Projections.” In Advances in Neural Information Processing Systems 29.
Ma, Konečný, Jaggi, et al. 2015. “Distributed Optimization with Arbitrary Local Solvers.” arXiv Preprint arXiv:1512.04039.
Mann, and Helbing. 2016. “Minorities Report: Optimal Incentives for Collective Intelligence.” arXiv:1611.03899 [Cs, Math, Stat].
Ma, Smith, Jaggi, et al. 2015. “Adding Vs. Averaging in Distributed Primal-Dual Optimization.” arXiv:1502.03508 [Cs].
Mateos, Bazerque, and Giannakis. 2010. “Distributed Sparse Linear Regression.” IEEE Transactions on Signal Processing.
McLachlan, and Krishnan. 2008. The EM algorithm and extensions.
Nathan, and Klabjan. 2016. “Optimization for Large-Scale Machine Learning with Distributed Features and Observations.” arXiv:1610.10060 [Cs, Stat].
Shalev-Shwartz, and Zhang. 2013. “Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization.” Journal of Machine Learning Research.
Shamir, Srebro, and Zhang. 2014. “Communication-Efficient Distributed Optimization Using an Approximate Newton-Type Method.” In ICML.
Smith, Forte, Jordan, et al. 2015. “L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework.” arXiv:1512.04011 [Cs].
Thanei, Heinze, and Meinshausen. 2017. “Random Projections For Large-Scale Regression.” arXiv:1701.05325 [Math, Stat].
Trofimov, and Genkin. 2015. “Distributed Coordinate Descent for L1-Regularized Logistic Regression.” In Analysis of Images, Social Networks and Texts. Communications in Computer and Information Science 542.
Trofimov, and Genkin. 2016. “Distributed Coordinate Descent for Generalized Linear Models with Regularization.” arXiv:1611.02101 [Cs, Stat].
Yang. 2013. “Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent.” In Advances in Neural Information Processing Systems.