calculus on Dan MacKinlay
https://danmackinlay.name/tags/calculus.html
Recent content in calculus on Dan MacKinlay (Hugo, en-us). Last updated Mon, 20 Dec 2021 08:39:49 +1100.

Monte Carlo gradient estimation
https://danmackinlay.name/notebook/mc_grad.html
Mon, 20 Dec 2021 08:39:49 +1100

Taking gradients through integrals using randomness.
Generic

Could do with a decent intro. TBD.
For unifying overviews see (Mohamed et al. 2020; Schulman et al. 2015; van Krieken, Tomczak, and Teije 2021) and the storchastic docs.
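As a toy sketch of the two estimators those overviews contrast — the log-derivative (score-function / REINFORCE) trick versus the reparameterization trick — here is a hypothetical worked example for a unit-variance Gaussian with f(x) = x², where the exact gradient of E[x²] with respect to the mean μ is 2μ. The function names and setup are illustrative, not from any of the cited sources.

```python
import random

def score_function_grad(mu, n_samples=100_000, seed=0):
    """Estimate d/dmu E_{x ~ N(mu, 1)}[x^2] via the log-derivative trick:
    grad = E[f(x) * d/dmu log p(x; mu)], and for a unit-variance Gaussian
    the score is d/dmu log p(x; mu) = x - mu."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(mu, 1.0)
        total += (x * x) * (x - mu)  # f(x) * score
    return total / n_samples

def reparam_grad(mu, n_samples=100_000, seed=0):
    """Same gradient via the reparameterization trick: write x = mu + z with
    z ~ N(0, 1), so d f(x)/d mu = f'(x) = 2x. Typically lower variance."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = mu + rng.gauss(0.0, 1.0)
        total += 2.0 * x
    return total / n_samples

# E[x^2] = mu^2 + 1, so the exact gradient at mu = 1 is 2.
print(score_function_grad(1.0))  # ≈ 2 (noisier)
print(reparam_grad(1.0))         # ≈ 2 (tighter)
```

Both converge to 2μ, but the score-function estimate is visibly noisier at the same sample budget, which is the usual motivation for reparameterizing when the density allows it.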
Machine Learning Trick of the Day (5): Log Derivative Trick
REINFORCE vs Reparameterization Trick – Syed Ashar Javed

Optimising Monte Carlo

Let us say I need to differentiate through a Monte Carlo algorithm to alter its parameters while holding the PRNG fixed.

Jax
https://danmackinlay.name/notebook/jax.html
Mon, 29 Nov 2021 15:50:02 +1100

jax is a successor to classic python+numpy autograd. It includes various code optimisations: JIT compilation, differentiation and vectorisation.
So: a numerical library with certain high-performance machine-learning affordances. Note that it is not a deep learning framework per se, but rather the producer species at the lowest trophic level of a deep learning ecosystem.

Fractional differential equations
https://danmackinlay.name/notebook/fractional_de.html
Mon, 13 Sep 2021 11:51:21 +1000

Classically, (stochastic or deterministic) ODEs are “memoryless” in the sense that the current state (and not the history) of the system determines the future states/distribution of states. In the stochastic case, they are Markov.
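For concreteness, here is a standard-textbook sketch (not taken from the notebook itself) of one such non-integer-order operator, the Caputo fractional derivative of order \(\alpha\) with \(n - 1 < \alpha < n\):

```latex
% Caputo fractional derivative of order \alpha, with n - 1 < \alpha < n:
{}^{C}\!D^{\alpha} f(t)
  = \frac{1}{\Gamma(n - \alpha)} \int_{0}^{t}
    \frac{f^{(n)}(\tau)}{(t - \tau)^{\alpha - n + 1}} \,\mathrm{d}\tau .
```

The integral weights the whole history \(\tau \in [0, t]\), which is precisely the loss of memorylessness; under the Laplace transform (with zero initial conditions) the operator acts as multiplication by \(s^{\alpha}\), which is the representation alluded to here.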
One way to destroy this locality/memorylessness is to use fractional derivatives in the formulation of the equation. These use the Laplace-transform representation to do something like differentiating to a non-integer order.

Automatic differentiation
https://danmackinlay.name/notebook/autodiff.html
Thu, 05 Aug 2021 14:19:04 +1000
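The core mechanism can be sketched in a few lines of forward-mode arithmetic on dual numbers — a hypothetical toy for illustration, not the API of jax, Pytorch, or any other library covered in the notebook:

```python
class Dual:
    """Minimal forward-mode AD value: a (value, derivative) pair."""

    def __init__(self, val, dot=0.0):
        self.val = val  # primal value
        self.dot = dot  # derivative w.r.t. the chosen input

    @staticmethod
    def _lift(other):
        # Promote plain numbers to constants (derivative zero).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def grad(f, x):
    """Derivative of scalar f at x, by seeding dx/dx = 1."""
    return f(Dual(x, 1.0)).dot

# f(x) = x^2 + 3x  =>  f'(x) = 2x + 3, so f'(2) = 7.
print(grad(lambda x: x * x + 3 * x, 2.0))  # 7.0
```

The derivative comes out exact (no finite-difference truncation error), because the chain and product rules are applied symbolically to each elementary operation.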
Getting your computer to tell you the gradient of a function, without resorting to finite-difference approximation or coding an analytic derivative by hand.

Functional regression
https://danmackinlay.name/notebook/functional_data.html
Thu, 28 May 2020 11:17:20 +1000

Statistics where the samples are not just data points but whole curves and manifolds, or subsamples from them. Function approximation meets statistics, especially in the Karhunen–Loève expansion.
Regression using curves

To quote Jim Ramsay:
Functional data analysis, […] is about the analysis of information on curves or functions. For example, these twenty traces of the writing of “fda” are curves in two ways: first, as static traces on the page that you see after the writing is finished, and second, as two sets of functions of time, one for the horizontal “X” coordinate, and the other for the vertical “Y” coordinate.

Matrix calculus
https://danmackinlay.name/notebook/matrix_calculus.html
Tue, 19 May 2020 12:00:06 +1000

We can generalise high-school calculus, which concerns scalar functions of a scalar argument, in various ways to handle matrix-valued functions or matrix-valued arguments. One could generalise further still, to full tensor calculus. But it happens that matrix/vector operations specifically sit at a useful point of complexity for lots of algorithms, a kind of MVP. (I usually want this for higher-order gradient descent.)

Ordinary differential equations
https://danmackinlay.name/notebook/odes.html
Thu, 28 Mar 2019 10:35:03 +0800

Nothing here except a note about my favourite pragmatic intro to ODE analysis:
Homer Reid’s 18.305, Advanced Analytic Methods in Science and Engineering.
More to come, maybe. I might mention, say, the extensions to fractional DEs or stochastic DEs or partial differential equations or…