Distributed statistical inference

October 11, 2016

computers are awful
concurrency hell
distributed
optimization
premature optimization
statmech

How do you design statistical procedures that can run across many nodes? Many algorithms factorise nicely over nodes. I might list some here.
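As one illustration of what "factorising over nodes" can look like, here is a minimal sketch for ordinary least squares. The Gram matrix and cross-moment vector are sums over observations, so each node can compute its own share and only those small summaries need to be communicated. The function names (`local_stats`, `combine_and_solve`) and the simulated four-way split are my own assumptions for the sake of the example, not any particular library's API.

```python
import numpy as np

def local_stats(X, y):
    """Per-node summary: the local Gram matrix X'X and cross-moment X'y."""
    return X.T @ X, X.T @ y

def combine_and_solve(stats):
    """Sum the per-node summaries and solve the normal equations once."""
    gram = sum(g for g, _ in stats)
    moment = sum(m for _, m in stats)
    return np.linalg.solve(gram, moment)

# Simulated data; in a real deployment each shard would live on its own node.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical ground truth
y = X @ beta_true + 0.1 * rng.normal(size=1000)

# Pretend we have 4 nodes, each holding a contiguous block of rows.
shards = zip(np.array_split(X, 4), np.array_split(y, 4))
beta_hat = combine_and_solve([local_stats(Xk, yk) for Xk, yk in shards])
```

Each node here sends a 3×3 matrix and a length-3 vector regardless of how many rows it holds, and the combined estimate matches the single-machine least-squares fit up to floating-point error.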

If you wish to solve this with heterogeneous, untrustworthy or ad hoc nodes, as opposed to a nice orderly campus HPC cluster, then perhaps it would be better to think of this as swarm sensing.

Placeholder; I have nothing to say about this right now, although I should mention that message-passing algorithms based on variational inference and graphical models are one possible avenue. The most interesting to me is probably Gaussian belief propagation.
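To make the Gaussian belief propagation idea slightly more concrete, here is a rough sketch under some assumptions of my own: the model is a Gaussian in information form, p(x) ∝ exp(−½ xᵀAx + bᵀx), with A symmetric and sparse, and each variable is a "node" that only talks to the neighbours it shares a nonzero entry of A with. The function name `gabp` and the synchronous update schedule are illustrative choices. On a tree-structured graph such as the chain below the marginals come out exact; on loopy graphs this is only an approximation and may not converge.

```python
import numpy as np

def gabp(A, b, n_iter=50):
    """Synchronous Gaussian belief propagation for p(x) ∝ exp(-x'Ax/2 + b'x).

    P[i, j] and h[i, j] hold the precision and information (precision times
    mean) carried by the message from node i to node j.
    """
    n = len(b)
    P = np.zeros((n, n))
    h = np.zeros((n, n))
    for _ in range(n_iter):
        P_new = np.zeros_like(P)
        h_new = np.zeros_like(h)
        for i in range(n):
            for j in range(n):
                if i == j or A[i, j] == 0:
                    continue  # no edge, no message
                # Belief at i from everything except what j told it.
                P_cav = A[i, i] + P[:, i].sum() - P[j, i]
                h_cav = b[i] + h[:, i].sum() - h[j, i]
                # Marginalise x_i out of the pairwise factor.
                P_new[i, j] = -A[i, j] ** 2 / P_cav
                h_new[i, j] = -A[i, j] * h_cav / P_cav
        P, h = P_new, h_new
    precision = np.diag(A) + P.sum(axis=0)
    mean = (b + h.sum(axis=0)) / precision
    return mean, 1.0 / precision

# A 5-node chain: tridiagonal, diagonally dominant precision matrix.
n = 5
A = 2.0 * np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
b = np.arange(1.0, n + 1)
mean, var = gabp(A, b)
```

Every update uses only quantities held by a node and its neighbours, which is what makes the scheme attractive for distributed settings.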

1 Tools

Spark.

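CoCoA (Jaggi, Smith, Takac, et al. 2014), a communication-efficient framework for distributed dual coordinate ascent.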

2 References

Acemoglu, Chernozhukov, and Yildiz. 2006. “Learning and Disagreement in an Uncertain World.” Working Paper 12648.
Battey, Fan, Liu, et al. 2015. “Distributed Estimation and Inference with Statistical Guarantees.” arXiv:1509.05457 [Math, Stat].
Bianchi, and Jakubowicz. 2013. “Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization.” IEEE Transactions on Automatic Control.
Bieniawski, and Wolpert. 2004. “Adaptive, Distributed Control of Constrained Multi-Agent Systems.” In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3.
Bottou, Curtis, and Nocedal. 2016. “Optimization Methods for Large-Scale Machine Learning.” arXiv:1606.04838 [Cs, Math, Stat].
Boyd. 2010. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers.
Calderhead. 2014. “A General Construction for Parallelizing Metropolis-Hastings Algorithms.” Proceedings of the National Academy of Sciences.
Christ, Kempa-Liehr, and Feindt. 2016. “Distributed and Parallel Time Series Feature Extraction for Industrial Big Data Applications.” arXiv:1610.07717 [Cs].
Gaines, and Zhou. 2016. “Algorithms for Fitting the Constrained Lasso.” arXiv:1611.01511 [Stat].
Heinze, McWilliams, Meinshausen, et al. 2014. “LOCO: Distributing Ridge Regression with Random Projections.” arXiv:1406.3469 [Stat].
Heinze, McWilliams, and Meinshausen. 2016. “DUAL-LOCO: Distributing Statistical Estimation Using Random Projections.” In.
Jaggi, Smith, Takac, et al. 2014. “Communication-Efficient Distributed Dual Coordinate Ascent.” In Advances in Neural Information Processing Systems 27.
Jain, Kakade, Kidambi, et al. 2016. “Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging.” arXiv:1610.03774 [Cs, Stat].
Krummenacher, McWilliams, Kilcher, et al. 2016. “Scalable Adaptive Stochastic Optimization Using Random Projections.” In Advances in Neural Information Processing Systems 29.
Ma, Konečný, Jaggi, et al. 2015. “Distributed Optimization with Arbitrary Local Solvers.” arXiv Preprint arXiv:1512.04039.
Mann, and Helbing. 2016. “Minorities Report: Optimal Incentives for Collective Intelligence.” arXiv:1611.03899 [Cs, Math, Stat].
Ma, Smith, Jaggi, et al. 2015. “Adding Vs. Averaging in Distributed Primal-Dual Optimization.” arXiv:1502.03508 [Cs].
Mateos, Bazerque, and Giannakis. 2010. “Distributed Sparse Linear Regression.” IEEE Transactions on Signal Processing.
McLachlan, and Krishnan. 2008. The EM Algorithm and Extensions.
Nathan, and Klabjan. 2016. “Optimization for Large-Scale Machine Learning with Distributed Features and Observations.” arXiv:1610.10060 [Cs, Stat].
Shalev-Shwartz, and Zhang. 2013. “Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization.” Journal of Machine Learning Research.
Shamir, Srebro, and Zhang. 2014. “Communication-Efficient Distributed Optimization Using an Approximate Newton-Type Method.” In ICML.
Smith, Forte, Jordan, et al. 2015. “L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework.” arXiv:1512.04011 [Cs].
Thanei, Heinze, and Meinshausen. 2017. “Random Projections For Large-Scale Regression.” arXiv:1701.05325 [Math, Stat].
Trofimov, and Genkin. 2015. “Distributed Coordinate Descent for L1-Regularized Logistic Regression.” In Analysis of Images, Social Networks and Texts. Communications in Computer and Information Science 542.
Trofimov, and Genkin. 2016. “Distributed Coordinate Descent for Generalized Linear Models with Regularization.” arXiv:1611.02101 [Cs, Stat].
Yang. 2013. “Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent.” In Advances in Neural Information Processing Systems.