density on Dan MacKinlay
https://danmackinlay.name/tags/density.html
Recent content in density on Dan MacKinlay
Mon, 08 Mar 2021 18:07:53 +1100

Reparameterization tricks in inference
https://danmackinlay.name/notebook/reparameterization_trick.html
Mon, 08 Mar 2021 18:07:53 +1100
Contents: For variational autoencoders; “Normalized” flows; For density estimation; Representational power of; Tutorials; References
Approximating the desired distribution by perturbation of the available distribution.
A trick in e.g. variational inference, especially autoencoders, for density estimation in probabilistic deep learning, best summarised as “fancy change of variables so that I can differentiate through the parameters of a distribution”. It connects to optimal transport and likelihood-free inference, in that this trick can enable some clever approximate-likelihood approaches.

Differentiating through the Gamma
https://danmackinlay.name/notebook/gamma_diff.html
Thu, 15 Oct 2020 10:50:59 +1100
Contents: References
Suppose I want to find a distributional gradient for a gamma process. Generically I would find this via Monte Carlo gradient estimation.
Here is a problem-specific method:
I allow the latent random state to have more dimensions than a univariate one. Let’s get specific. An example arises if we raid the random-variate-generation literature for transform methods and differentiate through them. A Gamma variate can be generated from a transformed normal and a uniform random variable, or from two uniforms, depending on the parameter range.

Monte Carlo gradient estimation
https://danmackinlay.name/notebook/mc_grad.html
Wed, 30 Sep 2020 10:59:22 +1000
Contents: References
Taking gradients through integrals.
See Mohamed et al. (2020) for a roundup.
https://github.com/deepmind/mc_gradients
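As a small illustration of taking a gradient through an integral, here is a minimal sketch comparing two standard Monte Carlo estimators of \(\frac{d}{d\theta}\mathbb{E}_{\mathsf{z}\sim\mathcal{N}(\theta,1)}[f(\mathsf{z})]\): the score-function estimator and the pathwise (reparameterization) estimator. The test function \(f(z)=z^2\), the function names and the sample sizes are assumptions chosen for illustration; they are not taken from the linked repository.

```python
import random


def score_function_grad(f, theta, n, rng):
    """Score-function estimate of d/dtheta E[f(z)] for z ~ N(theta, 1)."""
    total = 0.0
    for _ in range(n):
        z = rng.gauss(theta, 1.0)
        # d/dtheta log N(z; theta, 1) = (z - theta)
        total += f(z) * (z - theta)
    return total / n


def pathwise_grad(df, theta, n, rng):
    """Pathwise estimate using z = theta + eps with eps ~ N(0, 1)."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        # dz/dtheta = 1, so the chain rule leaves f'(theta + eps)
        total += df(theta + eps)
    return total / n


def f(z):
    return z * z  # illustrative integrand (an assumption)


def df(z):
    return 2.0 * z  # its derivative, needed by the pathwise estimator


theta = 1.5
rng = random.Random(0)
sf = score_function_grad(f, theta, 200_000, rng)
pw = pathwise_grad(df, theta, 200_000, rng)
exact = 2.0 * theta  # since E[z^2] = theta^2 + 1
print(sf, pw, exact)
```

Both estimators target the same quantity, but the pathwise one needs the derivative of the integrand and, in this example, has a much smaller variance.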
A common activity for me at the moment is differentiating the integral - for example, through the inverse-CDF lookup.
You see, what I would really like is the derivative of the mass-preserving continuous map \(\phi_{\theta, \tau}\) such that
\[\mathsf{z}\sim F(\cdot;\theta) \Rightarrow \phi_{\theta, \tau}(\mathsf{z})\sim F(\cdot;\tau). \]
Now suppose I wish to optimise or otherwise perturb \(\theta\).

Monte Carlo optimisation
https://danmackinlay.name/notebook/mc_opt.html
Wed, 30 Sep 2020 10:59:22 +1000
Contents: References
Optimisation via Monte Carlo Simulation. Annealing and all that. TBD.

Splitting simulation
https://danmackinlay.name/notebook/splitting_simulation.html
Mon, 28 Sep 2020 10:38:21 +1000
Contents: References
Splitting is a method for zooming in to the important region of an intractable probability distribution.
I have just spent so much time writing about this that I had better pause for a while and leave this as a placeholder.

Extreme value theory
https://danmackinlay.name/notebook/extreme_value_theory.html
Fri, 25 Sep 2020 16:25:18 +1000
Contents: Generalized Pareto Distribution; Generalized Extreme Value distributions; Burr distribution; References
In a satisfying way, it turns out that there are only so many shapes that probability densities can assume as they head off toward infinity. Extreme value theory makes this notion precise, and gives us some tools to work with them.
See also densities and intensities, survival analysis.
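One concrete instance of these limited tail shapes: for an exponential tail, the excess over any threshold is again exponential with the same scale, which is the shape-zero member of the Generalized Pareto family. The following is a minimal sketch that checks this empirically through the mean excess; the unit-rate exponential sample, the thresholds and the function name `mean_excess` are assumptions made for illustration.

```python
import random


def mean_excess(sample, threshold):
    """Return (average excess over `threshold`, number of exceedances).

    Raises ZeroDivisionError if no observation exceeds the threshold.
    """
    excesses = [x - threshold for x in sample if x > threshold]
    return sum(excesses) / len(excesses), len(excesses)


rng = random.Random(1)
sample = [rng.expovariate(1.0) for _ in range(200_000)]

# For a unit-rate exponential the mean excess should stay near 1
# whatever threshold we pick; heavier tails would make it grow instead.
results = {u: mean_excess(sample, u) for u in (0.5, 1.0, 2.0, 3.0)}
for u, (m, k) in results.items():
    print(f"threshold={u}: mean excess={m:.3f} from {k} exceedances")
```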
🏗
Generalized Pareto Distribution
Best intro from Hosking and Wallis (1987):

Quantitative risk measurement
https://danmackinlay.name/notebook/qrm.html
Tue, 22 Sep 2020 10:10:13 +1000
Contents: Value-at-Risk; Expected shortfall; Subadditivity/coherence; G-expectation; Sensitivity to parameters of risk measures; Rosenblatt transform; References
Actuarial bread-and-butter. The mathematical study of measuring the chances of something terrible happening. This is usually a financial risk, but can also be extreme weather conditions, earthquakes, whatever. BTW, this is distinct from the “risk” in “statistical risk bounds”, which is the domain of statistical learning theory.
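Two of the risk measures named above, Value-at-Risk and expected shortfall, have simple empirical versions. Here is a minimal sketch, assuming losses are recorded as positive numbers and using one of several possible empirical quantile conventions; the function names and the simulated standard normal losses are illustrative assumptions.

```python
import math
import random


def value_at_risk(losses, alpha):
    """Empirical alpha-quantile of the losses (ceil-based convention)."""
    ordered = sorted(losses)
    k = math.ceil(alpha * len(ordered)) - 1
    return ordered[max(k, 0)]


def expected_shortfall(losses, alpha):
    """Mean of the losses at or beyond the empirical Value-at-Risk."""
    var = value_at_risk(losses, alpha)
    tail = [x for x in losses if x >= var]
    return sum(tail) / len(tail)


rng = random.Random(2)
losses = [rng.gauss(0.0, 1.0) for _ in range(100_000)]  # simulated, not real data
var_95 = value_at_risk(losses, 0.95)
es_95 = expected_shortfall(losses, 0.95)
print(var_95, es_95)
```

For standard normal losses the theoretical 95% values are about 1.645 for Value-at-Risk and about 2.063 for expected shortfall, so expected shortfall is the more pessimistic of the two.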
How do you evaluate how bad the worst cases are when deciding whether to do something?

Mixture models for density estimation
https://danmackinlay.name/notebook/mixture_models.html
Fri, 24 Apr 2020 14:02:02 +1000
Contents: Moments of a mixture; Mixture zoo; “Classic Mixtures”; Continuous mixtures; Bayesian Dirichlet mixtures; Non-affine mixtures; In Bayesian variational inference; Estimation/selection methods; (Local) maximum likelihood; Method of moments; Minimum distance; Regression smoothing formulation; Convergence and model selection; Large sample results for mixtures; Finite sample results for mixtures; Sieve method; Akaike Information criterion; Quantization and coding theory; Minimum description length/BIC; Unsatisfactory thing: scale parameter selection theory; Connection to Mercer kernel methods; Miscellany; References
pyxelate uses mixture models to create pixel art colour palettes.

Survival analysis and reliability
https://danmackinlay.name/notebook/survival_analysis.html
Wed, 05 Feb 2020 14:08:55 +1100
Contents: Estimating survival rates; Life table method; Nelson-Aalen estimates; Other reliability stuff; tools; References
Estimating survival rates
Here’s the set-up: looking at a data set of individuals’ lifespans, you would like to infer the distribution of when people die, or things break, etc. The statistical problem of estimating how long people’s lives are is complicated somewhat by the particular structure of the data (loosely, “every person dies at most one time”), and there are certain characteristic difficulties that arise, such as right-censoring.

Density estimation
https://danmackinlay.name/notebook/density_estimation.html
Wed, 16 Oct 2019 09:37:51 +1100
Contents: Divergence measures/contrasts; Minimising Expected (or whatever) MISE; Connection to point processes; Spline/wavelet estimations; Mixture models; Gaussian processes; Renormalizing flow models; k-NN estimates; Kernel density estimators; Fancy ones; References
A statistical estimation problem where you are not trying to estimate a function of a distribution of random observations, but the distribution itself. In a sense, all of statistics implicitly does density estimation, but this is often instrumental in the course of discovering some actual parameter of interest.

The interpretation of densities as intensities and vice versa
https://danmackinlay.name/notebook/densities_and_intensities.html
Mon, 23 Sep 2019 16:53:57 +1000
Contents: Basis function method for density; Intensities; Basis function method for intensity; Count regression; Probability over boxes; References
Estimating densities by considering the observations drawn from them as a point process. In one dimension this gives us the particularly lovely trick of survival analysis, but the method is much more general, if not quite as nifty.
Consider the problem of estimating the common density \(f(x)dx=dF(x)\) of indexed i.

Change of probability measure
https://danmackinlay.name/notebook/change_of_measure.html
Tue, 16 Aug 2016 14:57:33 +1000
Contents: References
🏗 A placeholder for notes on a.e. continuous monotonic changes of measure in order to render a process “simple” in some sense. Something something Martingale something blah blah stochastic calculus.

Deconvolution
https://danmackinlay.name/notebook/deconvolution.html
Mon, 11 Apr 2016 16:07:42 +1000
Contents: Vanilla deconvolution; Deconvolution method in statistics; References
I wish, for a project of my own, to know about how to deconvolve with high dimensional data, irregularly sampled data, and inhomogeneous (although known) convolution kernels. This is in a signal processing setting; for the (closely-related) kernel-density estimation in a statistical setting, see kernel approximation. If you don’t know your noise spectrum, see blind deconvolution.
Vanilla deconvolution
Wiener filtering!

Copula functions
https://danmackinlay.name/notebook/copula.html
Tue, 23 Jun 2015 18:28:30 +0200
Contents: Elliptical; Vine copulas; References
A neat way of quantifying arbitrary (?) dependence structures between random variables. Useful in, e.g. Quantitative Risk Management.
The trick is simple: Informally, you transform each of \(n\) variables by its marginal CDF, and fiddle with the joint distribution of the resulting uniform variables on \([0,1]^n\). (That’s assuming variables are absolutely continuous w.r.t. some underlying measure space; distributions with atoms are trickier.)

Elliptical distributions
https://danmackinlay.name/notebook/elliptical_distributions.html
Tue, 23 Jun 2015 18:28:30 +0200
Contents: References
TBD