Neural nets with implicit layers

Also, declarative networks


A unifying framework for various networks, including neural ODEs, in which the layers are not simple forward operations but whose evaluation is instead defined as the solution of some optimisation problem.

For an overview, see the NeurIPS 2020 tutorial, Deep Implicit Layers - Neural ODEs, Deep Equilibrium Models, and Beyond, by Zico Kolter, David Duvenaud, and Matt Johnson.

NB: This is different from the implicit representation method. Since implicit layers and implicit representation layers occur in the same problems (such as ML for PDEs), this terminological confusion will haunt us.
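To make the idea concrete, here is a minimal sketch of an implicit "layer" whose output is defined as the root of an equation rather than by an explicit formula; the forward pass is a solve, and the backward pass uses the implicit function theorem. The scalar function $g(z, x) = z^3 + z - x$ is an arbitrary toy choice, not anything from a particular paper.

```python
import numpy as np

# Implicit "layer": the output z*(x) is defined as the root of
# g(z, x) = z**3 + z - x = 0, rather than by an explicit formula.
# (Toy scalar example chosen for illustration.)

def solve_layer(x, iters=50):
    """Forward pass: find z with g(z, x) = 0 by Newton's method."""
    z = 0.0
    for _ in range(iters):
        g = z**3 + z - x
        dg_dz = 3 * z**2 + 1
        z -= g / dg_dz
    return z

def layer_grad(z):
    """Backward pass via the implicit function theorem:
    dz/dx = -(dg/dz)^{-1} (dg/dx), evaluated at the solution."""
    dg_dz = 3 * z**2 + 1
    dg_dx = -1.0
    return -dg_dx / dg_dz

x = 2.0
z = solve_layer(x)          # z = 1 here, since 1**3 + 1 == 2
ift = layer_grad(z)
# Check against a central finite difference.
eps = 1e-6
fd = (solve_layer(x + eps) - solve_layer(x - eps)) / (2 * eps)
print(abs(ift - fd))
```

Note that the gradient never differentiates through the Newton iterations themselves; only the solution and the residual function matter, which is the key efficiency argument for implicit layers.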

Differentiable Convex Optimization Layers introduces cvxpylayers:

Optimization layers add domain-specific knowledge or learnable hard constraints to machine learning models. Many of these layers solve convex and constrained optimization problems of the form

\[ \begin{array}{rl} x^{\star}(\theta)=\operatorname{argmin}_{x} & f(x ; \theta) \\ \text {subject to } & g(x ; \theta) \leq 0 \\ & h(x ; \theta) =0 \end{array} \]

with parameters θ, objective f, and constraint functions g, h, and do end-to-end learning through them with respect to θ.
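A minimal worked instance of this template, assuming an equality-constrained quadratic objective (my own toy choice, not from the cvxpylayers paper): projecting θ onto the plane $\{x : \mathbf{1}^\top x = 1\}$. Because the KKT conditions are linear here, both the solution and its Jacobian with respect to θ come from one linear solve.

```python
import numpy as np

# Special case of the problem above, with
#   f(x; theta) = 0.5 * ||x - theta||^2  and  h(x; theta) = 1^T x - 1:
#   x*(theta) = argmin_x 0.5*||x - theta||^2  s.t.  sum(x) = 1.

n = 3
theta = np.array([0.2, -0.5, 1.0])

# KKT system for [x; nu]:  [I 1; 1^T 0] [x; nu] = [theta; 1]
K = np.zeros((n + 1, n + 1))
K[:n, :n] = np.eye(n)
K[:n, n] = 1.0
K[n, :n] = 1.0
rhs = np.append(theta, 1.0)
sol = np.linalg.solve(K, rhs)
x_star = sol[:n]

# Differentiating the KKT conditions: the right-hand side depends on
# theta as [I; 0], so dx*/dtheta is the top-left n-by-n block of K^{-1}.
J = np.linalg.solve(K, np.vstack([np.eye(n), np.zeros((1, n))]))[:n]

# Analytic check: projection onto {sum(x) = 1} has Jacobian
# I - (1/n) 1 1^T, independent of theta.
J_true = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(J, J_true), np.isclose(x_star.sum(), 1.0))
```

Libraries like cvxpylayers automate exactly this pattern for general convex problems, including inequality constraints, where the KKT system must be linearized at the solution.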

In this tutorial we introduce our new library cvxpylayers for easily creating new differentiable convex optimization layers. This lets you express your layer in the CVXPY domain-specific language as usual and then export the CVXPY object to an efficient batched and differentiable layer with a single line of code. This project turns every convex optimization problem expressed in CVXPY into a differentiable layer.

A different, although AFAICT equivalent, terminology is used by Stephen Gould in Gould, Hartley, and Campbell (2019), under the banner of Deep Declarative Networks. Fun applications he highlights: robust losses in pooling layers, projection onto shapes, convex programming and warping, matching problems, (relaxed) graph alignment, noisy point-cloud surface reconstruction… (I am sitting in his seminar as I write this.) They have example code (pytorch). He relates some minimax-like optimisations to “Stackelberg games”, which are optimisation problems embedded in game theory.

That provokes certain other ideas: learning basis decompositions, hyperparameter optimisation… Stephen relates that last one to this framework by framing both as “bi-level optimisation problems”.
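The hyperparameter-optimisation connection can be sketched concretely. In this illustrative bi-level problem (my own minimal example, not from Gould's papers), the inner problem is ridge regression and the outer problem is a validation loss; the hypergradient with respect to the regularisation weight follows from the implicit function theorem applied to the inner stationarity condition. All data are synthetic.

```python
import numpy as np

# Bi-level optimisation sketch:
#   inner:  w*(lam) = argmin_w ||X w - y||^2 + lam * ||w||^2
#   outer:  minimise L_val(lam) = ||Xv w*(lam) - yv||^2 over lam.

rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 4)), rng.normal(size=20)
Xv, yv = rng.normal(size=(10, 4)), rng.normal(size=10)
lam = 0.5

def w_star(lam):
    # Inner solve in closed form: (X^T X + lam I) w = X^T y.
    return np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)

w = w_star(lam)
# IFT: differentiating the stationarity condition
# (X^T X + lam I) w - X^T y = 0 with respect to lam gives
# dw/dlam = -(X^T X + lam I)^{-1} w.
dw_dlam = -np.linalg.solve(X.T @ X + lam * np.eye(4), w)
# Chain rule through the outer (validation) loss.
grad_val = 2 * (Xv @ w - yv) @ Xv
hypergrad = grad_val @ dw_dlam

# Finite-difference check on the outer objective.
eps = 1e-6
L = lambda lam: np.sum((Xv @ w_star(lam) - yv) ** 2)
fd = (L(lam + eps) - L(lam - eps)) / (2 * eps)
print(abs(hypergrad - fd))
```

The same structure scales to inner problems without closed forms: replace the closed-form solve with any optimiser and keep the IFT-based backward pass (cf. Rajeswaran et al. 2019 on implicit gradients for meta-learning).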

Related: Deep equilibrium networks (Bai, Kolter, and Koltun 2019; Bai, Koltun, and Kolter 2020).
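In the deep-equilibrium setting the implicitly defined quantity is a fixed point of a layer map rather than an argmin. A minimal sketch (small arbitrary weights of my own choosing, scaled so the map is a contraction): the "infinitely deep" output solves $z^\star = \tanh(W z^\star + x)$, and the implicit function theorem again gives the gradient without backpropagating through the iterations.

```python
import numpy as np

# Deep-equilibrium sketch: z* = tanh(W z* + x), found by naive
# fixed-point iteration. Small ||W|| keeps the map a contraction.

rng = np.random.default_rng(1)
W = 0.2 * rng.normal(size=(3, 3))
x = rng.normal(size=3)

def solve(x, iters=100):
    z = np.zeros(3)
    for _ in range(iters):
        z = np.tanh(W @ z + x)
    return z

z = solve(x)

# Implicit gradient: differentiating z = tanh(W z + x) at the fixed
# point gives (I - D W) dz = D dx, with D the diagonal tanh' term,
# hence dz*/dx = (I - D W)^{-1} D.
D = np.diag(1 - np.tanh(W @ z + x) ** 2)
dz_dx = np.linalg.solve(np.eye(3) - D @ W, D)

# Finite-difference check on the first input coordinate.
eps = 1e-6
e0 = np.eye(3)[0]
fd = (solve(x + eps * e0) - solve(x - eps * e0)) / (2 * eps)
print(np.allclose(dz_dx[:, 0], fd, atol=1e-4))
```

Bai, Kolter, and Koltun (2019) replace the naive iteration with faster root-finders and use exactly this constant-memory backward pass, which is what makes the "infinite depth" practical.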

References

Agrawal, Akshay, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, and Zico Kolter. 2019. “Differentiable Convex Optimization Layers.” In Advances In Neural Information Processing Systems. http://arxiv.org/abs/1910.12430.
Amos, Brandon, and J. Zico Kolter. 2017. “OptNet: Differentiable Optimization as a Layer in Neural Networks,” March. https://arxiv.org/abs/1703.00443v4.
Amos, Brandon, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, and J. Zico Kolter. 2018. “Differentiable MPC for End-to-End Planning and Control,” October. https://arxiv.org/abs/1810.13400v3.
Andersson, Joel A. E., Joris Gillis, Greg Horn, James B. Rawlings, and Moritz Diehl. 2019. “CasADi: A Software Framework for Nonlinear Optimization and Optimal Control.” Mathematical Programming Computation 11 (1): 1–36. https://doi.org/10.1007/s12532-018-0139-4.
Arora, Sanjeev, Rong Ge, Tengyu Ma, and Ankur Moitra. 2015. “Simple, Efficient, and Neural Algorithms for Sparse Coding.” In Proceedings of The 28th Conference on Learning Theory, 40:113–49. Paris, France: PMLR. http://proceedings.mlr.press/v40/Arora15.html.
Bai, Shaojie, J Zico Kolter, and Vladlen Koltun. 2019. “Deep Equilibrium Models.” In Advances in Neural Information Processing Systems, 32:12. https://openreview.net/forum?id=S1eS4NBgLS.
Bai, Shaojie, Vladlen Koltun, and J. Zico Kolter. 2020. “Multiscale Deep Equilibrium Models.” In Advances in Neural Information Processing Systems. Vol. 33. https://proceedings.neurips.cc//paper/2020/hash/3812f9a59b634c2a9c574610eaba5bed-Abstract.html.
Barratt, Shane. 2018. “On the Differentiability of the Solution to Convex Optimization Problems,” April. https://arxiv.org/abs/1804.05098v3.
Border, KC. 2019. “Notes on the Implicit Function Theorem.” http://www.its.caltech.edu/~kcborder/Notes/IFT.pdf.
Djolonga, Josip, and Andreas Krause. 2017. “Differentiable Learning of Submodular Models.” In Proceedings of the 31st International Conference on Neural Information Processing Systems, 1014–24. NIPS’17. Red Hook, NY, USA: Curran Associates Inc. https://proceedings.neurips.cc/paper/2017/file/192fc044e74dffea144f9ac5dc9f3395-Paper.pdf.
Domke, Justin. 2012. “Generic Methods for Optimization-Based Modeling.” In International Conference on Artificial Intelligence and Statistics, 318–26. http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS2012_Domke12.pdf.
Donti, Priya L., Brandon Amos, and J. Zico Kolter. 2017. “Task-Based End-to-End Model Learning in Stochastic Optimization,” March. https://arxiv.org/abs/1703.04529v4.
Krantz, Steven G., and Harold R. Parks. 2002. The Implicit Function Theorem. Springer.
Gould, Stephen, Basura Fernando, Anoop Cherian, Peter Anderson, Rodrigo Santa Cruz, and Edison Guo. 2016. “On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-Level Optimization,” July. https://arxiv.org/abs/1607.05447v2.
Gould, Stephen, Richard Hartley, and Dylan Campbell. 2019. “Deep Declarative Networks: A New Hope,” September. https://arxiv.org/abs/1909.04866v2.
Haber, Eldad, and Lars Ruthotto. 2018. “Stable Architectures for Deep Neural Networks.” Inverse Problems 34 (1): 014004. https://doi.org/10.1088/1361-6420/aa9a90.
Landry, Benoit, Joseph Lorenzetti, Zachary Manchester, and Marco Pavone. 2019. “Bilevel Optimization for Planning Through Contact: A Semidirect Method,” June. https://arxiv.org/abs/1906.04292v2.
Lee, Kwonjoon, Subhransu Maji, Avinash Ravichandran, and Stefano Soatto. 2019. “Meta-Learning with Differentiable Convex Optimization,” April. https://arxiv.org/abs/1904.03758v2.
Mena, Gonzalo, David Belanger, Scott Linderman, and Jasper Snoek. 2018. “Learning Latent Permutations with Gumbel-Sinkhorn Networks,” February. https://arxiv.org/abs/1802.08665v1.
Poli, Michael, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, and Jinkyoo Park. 2020. “Hypersolvers: Toward Fast Continuous-Depth Models.” In Advances in Neural Information Processing Systems. Vol. 33. https://proceedings.neurips.cc//paper/2020/hash/f1686b4badcf28d33ed632036c7ab0b8-Abstract.html.
Rajeswaran, Aravind, Chelsea Finn, Sham Kakade, and Sergey Levine. 2019. “Meta-Learning with Implicit Gradients,” September. https://arxiv.org/abs/1909.04630v1.
Sulam, Jeremias, Aviad Aberdam, Amir Beck, and Michael Elad. 2020. “On Multi-Layer Basis Pursuit, Efficient Algorithms and Convolutional Neural Networks.” IEEE Transactions on Pattern Analysis and Machine Intelligence 42 (8): 1968–80. https://doi.org/10.1109/TPAMI.2019.2904255.
Wang, Po-Wei, Priya L. Donti, Bryan Wilder, and Zico Kolter. 2019. “SATNet: Bridging Deep Learning and Logical Reasoning Using a Differentiable Satisfiability Solver,” May. https://arxiv.org/abs/1905.12149v1.
