count_data on Dan MacKinlay
https://danmackinlay.name/tags/count_data.html
Recent content in count_data on Dan MacKinlay
Tue, 30 Aug 2022 09:38:51 +1000

State space reconstruction
https://danmackinlay.name/notebook/state_space_reconstruction.html
Tue, 30 Aug 2022 09:38:51 +1000

Some stuff I saw that’s maybe related Stuff that I might actually use Incoming References

Disclaimer: I know next to nothing about this.
But I think it’s something like: looking at the data from a (possibly stochastic) dynamical system and hoping to infer cool things about the kinds of hidden states it has, in some general sense, such as some measure of statistical or computational complexity, or how complicated or “large” the underlying state space is in some convenient representation.

COVID-19 in practice
https://danmackinlay.name/notebook/covid_19.html
Sun, 13 Mar 2022 12:48:30 +1100

Epidemiology of Testing dynamics Personal risk calculators Airborne transmission Evaluating societal cost Virulence and risks upon infection Death and life years Relative rate compared to normal life risk Long covid Comorbidities Omicron variant Transmission Modelling airborne transmission Treating Vitamin D Simulating Current restrictions where I live Tracking Australia’s contagion Incoming References

Epidemiology of
See also epidemiology, contact tracing.
Testing dynamics
Current public health advice boils down to a heuristic that the Rapid Antigen Tests (🐀) are a good proxy for actual contagiousness.

Essays in stochastic processes
https://danmackinlay.name/post/phd_thesis.html
Fri, 13 Aug 2021 14:27:30 +1100

One of the grab-bag of topics in my PhD was audio style transfer
Apparently I never posted my own PhD thesis online? Here it is.
TODO: mention what horrors lie behind this trapdoor:
PDF Download here.

Cascade models
https://danmackinlay.name/notebook/cascade_models.html
Wed, 04 Aug 2021 09:41:04 +1000

Lagrangian distributions Borel-Tanner distribution Poisson-Poisson Lagrangian General Lagrangian distribution References

\(\newcommand{\rv}[1]{\mathsf{#1}}\)
Models for, loosely, the total population size arising from all generations of the offspring of some progenitor.
Let us suppose that each individual \(i\) who catches a certain strain of influenza will go on to infect a further \(\rv{n}_i\sim F\) others. Assume the population is infinite, that no one catches influenza twice, and that the number of transmissions of the disease has the same distribution for everyone who catches it.

Models for count data
https://danmackinlay.name/notebook/count_models.html
Tue, 03 Aug 2021 08:47:50 +1000

Poisson Negative Binomial Mean/dispersion parameterisation (Pólya) Geometric Mean parameterisation Lagrangian distributions Discrete Stable Zipf/Zeta models Yule-Simon Conway-Maxwell-Poisson Decomposability properties Stability Self-divisibility References

I have data and/or predictions made up of non-negative integers \({\mathbb N }\cup\{0\}\). What probability distributions can I use to model it?
All the distributions I discuss here have support unbounded above. Bounded distributions (e.g. vanilla Binomial) are for some other time. The exception is the Bernoulli RV, a.

Contact tracing
https://danmackinlay.name/notebook/contact_tracing.html
Wed, 28 Jul 2021 11:04:09 +1000

Optimal contact tracing Contact tracing and privacy References

Dr. Evans, 1917, How to keep well
An important leverage point in epidemiology.
Optimal contact tracing
There are some extremely interesting papers about optimal tracing here: Baker et al. (2021); Braunstein and Ingrosso (2016).
sibyl-team/sib: Sibilla
sibyl-team/epidemic_mitigation

Contact tracing and privacy
Privacy-respecting computing approaches had a brief moment in the spotlight thanks to COVID-19, although I think we all gave up on privacy at some point

Contagion processes and their statistics
https://danmackinlay.name/notebook/contagion_processes.html
Thu, 15 Jul 2021 20:11:42 +1000

Dirichlet Hawkes process Incoming References

The spread of quantities of things - earthquakes/diseases/innovations/credit defaults/cat videos - between different georegions/populations/vertices/banks/variates. For internet content virality in particular, there is much more specialized analysis and particular data sets, so I recommend checking the richer models under media virality.
In my own internal taxonomy, I would model growth in a single scalar value using branching processes. Here I am concerned with modelling contagion between different variates.

Media virality
https://danmackinlay.name/notebook/media_virality.html
Thu, 15 Jul 2021 20:07:50 +1000

References

Hey check out this tiktok
Contagion of ideas and opinions is particularly well studied in the case of media. I know a little about this, thanks to my own master’s thesis.
For a deep dive, why not excavate the references in the ANU Computational Media Group which does an excellent job in this realm?
References
Achab, Massil, Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, and Jean-Francois Muzy. 2017. “Uncovering Causality from Multivariate Hawkes Integrated Cumulants.

Queueing
https://danmackinlay.name/notebook/queueing.html
Mon, 06 Apr 2020 22:09:14 +1000

References

Not much to say right now, except that I always forget the name of the useful tool from queueing theory, Kingman’s approximation for waiting time.
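A minimal sketch of Kingman’s approximation in Python, using the formula and symbol definitions that follow. The function and argument names are hypothetical, chosen here for illustration only. The sanity check at the end relies on the standard fact that for an M/M/1 queue both coefficients of variation equal 1, in which case the approximation reduces to \(\rho\tau/(1-\rho)\).

```python
def kingman_wait(arrival_rate, mean_service_time, cv_arrival, cv_service):
    """Approximate mean time spent waiting in the queue, E(W_q).

    arrival_rate      -- lambda, mean arrivals per unit time
    mean_service_time -- tau, so the service rate is mu = 1 / tau
    cv_arrival        -- c_a, coefficient of variation of inter-arrival times
    cv_service        -- c_s, coefficient of variation of service times
    """
    # Utilisation rho = lambda / mu = lambda * tau.
    rho = arrival_rate * mean_service_time
    if not 0 <= rho < 1:
        raise ValueError("utilisation must lie in [0, 1) for a stable queue")
    variability = (cv_arrival ** 2 + cv_service ** 2) / 2
    return (rho / (1 - rho)) * variability * mean_service_time


# M/M/1 sanity check: rho = 0.8, tau = 1, c_a = c_s = 1 gives about 4.0.
w = kingman_wait(arrival_rate=0.8, mean_service_time=1.0,
                 cv_arrival=1.0, cv_service=1.0)
```

The \(\rho/(1-\rho)\) factor is what makes waiting times blow up as utilisation approaches 1, which is usually the practical point of remembering the formula.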
\[ \mathbb {E} (W_{q})\approx \left({\frac {\rho }{1-\rho }}\right)\left({\frac {c_{a}^{2}+c_{s}^{2}}{2}}\right)\tau \]
where \(\tau\) is the mean service time (i.e. \(\mu = 1/\tau\) is the service rate), \(\lambda\) is the mean arrival rate, \(\rho = \lambda/\mu\) is the utilization, \(c_a\) is the coefficient of variation for arrivals (that is, the standard deviation of inter-arrival times divided by the mean inter-arrival time) and \(c_s\) is the coefficient of variation for service times.

Epidemics
https://danmackinlay.name/notebook/epidemics.html
Fri, 03 Apr 2020 12:19:15 +1100

Modeling Ameliorations Contact tracing References

Buy this from sam.
A grab-bag of links about disease spread in its filthy glory. I am particularly examining COVID-19, from necessity.
[Microbescope by David McCandless, Omid Kashan, Miriam Quick, Karl Webster, Dr Stephanie Starling]
The spread of diseases in populations. A nitty-gritty messy empirical application for those abstract contagion models.
Connection with global trade networks: Cosma Shalizi on Ebola and Mongol Modernity.
Modeling
I used to know a little about agent-based behavioural epidemic simulation, but I am no longer in that field and do not regard myself as a practical expert.

Branching processes
https://danmackinlay.name/notebook/branching_processes.html
Fri, 07 Feb 2020 17:33:31 +1100

To learn We do not care about time Discrete index, discrete state, Markov: The Galton-Watson process Continuous index, discrete state: the Hawkes Process Continuous index, continuous state Parameter estimation Discrete index, continuous state Special issues for multivariate branching processes Classic data sets Implementations References

A diverse class of stochastic models that I am mildly obsessed with, where over some index set (usually time, space or both) there are distributed births of some kind, and we count the total population.

Hawkes processes
https://danmackinlay.name/notebook/hawkes_processes.html
Sun, 22 Dec 2019 16:36:44 +1100

Time-inhomogeneous extension Tools References

An intersection of point processes and branching processes is the Hawkes process. The classic is the univariate linear Hawkes process. For now we’ll assume it to be indexed by time.
Recall the log likelihood of a generic point process with occurrence times \(\{t_i\}\):
\[ \begin{aligned} L_\theta(t_{1:N}) &:= -\int_0^T\lambda^*_\theta(t)dt + \int_0^T\log \lambda^*_\theta(t) dN_t\\ &= -\int_0^T\lambda^*_\theta(t)dt + \sum_{j} \log \lambda^*_\theta(t_j) \end{aligned} \]
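As a concrete instance of this log likelihood, here is a minimal Python sketch for a univariate linear Hawkes process. It assumes an exponential kernel, so that \(\lambda^*(t) = \mu + \sum_{t_i < t} \alpha\beta e^{-\beta(t - t_i)}\). That kernel choice, the parameter names `mu`, `alpha`, `beta` and the function name are assumptions made for illustration, not something the notes above specify. With this kernel the integral term has a closed form and the sum of log intensities can be accumulated recursively in one pass.

```python
import math


def hawkes_exp_loglik(times, T, mu, alpha, beta):
    """Log likelihood on [0, T] of a Hawkes process with exponential kernel.

    times -- sorted event times t_1 <= ... <= t_N, all within [0, T]
    mu    -- background intensity (must be positive)
    alpha -- branching ratio (expected direct offspring per event)
    beta  -- decay rate of the exponential kernel
    """
    compensator = mu * T      # integral of the intensity over [0, T]
    log_intensity_sum = 0.0   # sum over events of log lambda*(t_j)
    decayed = 0.0             # sum over i < j of exp(-beta * (t_j - t_i))
    prev = None
    for t in times:
        if prev is not None:
            # Recursive update: decay the old sum and add the previous event.
            decayed = math.exp(-beta * (t - prev)) * (1.0 + decayed)
        log_intensity_sum += math.log(mu + alpha * beta * decayed)
        # Contribution of this event's kernel to the integral up to T.
        compensator += alpha * (1.0 - math.exp(-beta * (T - t)))
        prev = t
    return -compensator + log_intensity_sum


# With alpha = 0 this reduces to a homogeneous Poisson log likelihood,
# -mu * T + N * log(mu).
ll = hawkes_exp_loglik([1.0, 2.0, 3.0], T=4.0, mu=0.5, alpha=0.0, beta=1.0)
```

The recursion keeps the cost linear in the number of events. A general kernel would need the full double sum over pairs of events.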
Here \(\lambda^*(t)\) is shorthand for \(\lambda^*(t|\mathcal{F}_t)\), which we call the intensity.

Generalized Galton-Watson processes
https://danmackinlay.name/notebook/discrete_hawkes.html
Fri, 11 Oct 2019 16:28:27 +1100

Long Memory Galton-Watson Autoregressive characterisation Estimation of parameters Influence kernels Endo-exo models References

This needs a better intro, but the Galton-Watson process is the archetype here.
There are many standard expositions. Two good ones:
Gesine Reinert’s Introduction to Branching Processes: Parts 1 and 2.
Steven Lalley’s intro.
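For orientation, a minimal Python sketch of the archetypal Galton-Watson process: each individual in a generation independently produces a random number of offspring, and we record the generation sizes. The function name, the `offspring_sampler` argument and the three-point offspring distribution in the usage example are hypothetical choices made for illustration.

```python
import random


def galton_watson(offspring_sampler, generations, z0=1):
    """Simulate generation sizes Z_0, Z_1, ..., Z_generations.

    offspring_sampler -- zero-argument callable returning one individual's
                         non-negative integer number of offspring
    """
    sizes = [z0]
    for _ in range(generations):
        # Every individual in the current generation reproduces independently.
        sizes.append(sum(offspring_sampler() for _ in range(sizes[-1])))
    return sizes


# Usage: a critical process (mean offspring 1) with offspring in {0, 1, 2}.
rng = random.Random(0)
path = galton_watson(
    lambda: rng.choices([0, 1, 2], weights=[0.25, 0.5, 0.25])[0],
    generations=10,
)
```

Once a generation has size zero, every later generation is zero too, which is the extinction event that much of the classical theory is about.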
Working through some generalisations of the Galton-Watson process as an INAR process. That is, this is something like the Galton-Watson process, but

Generating functions
https://danmackinlay.name/notebook/generating_functions.html
Mon, 19 Jun 2017 09:49:09 +0800

References

I don’t have much to say apart from a couple of links and references I need from time to time.
But nor should I; simply download Herbert Wilf’s pellucid free textbook for all the possible introduction you could need.
Rolf Bardeli walks through some beautiful generating function application.
Later I would like to include notes on graph theory and neat count RV tricks based on generating functions. You can rapidly get into weird complex analysis when you consider asymptotics of these guys and get to find out why Cauchy’s integral formula and Lagrange’s Inversion Theorem were in your textbook.

Count time series models
https://danmackinlay.name/notebook/count_time_series.html
Wed, 09 Dec 2015 13:09:18 +0800

Maximum processes Finite state Markov chains GLM-type autoregressive Linear branching-type and self-decomposable Queueing models Other References

Statistical models for time series with discrete time index and discrete state index, i.e. lists of non-negative whole numbers with a causal ordering.
C&c symbolic dynamics, nonlinear time series wizardry, random fields, branching processes and Galton Watson processes for some important special cases. If there is no serial dependence, you might want unadorned count models.

Sparse regression for inhomogeneous Hawkes processes
https://danmackinlay.name/post/masters_thesis.html
Tue, 12 May 2015 12:42:05 +0200

Sampling method for our insane social media dataset
I completed my Master’s Thesis in 2015 at the Swiss Federal Institute of Technology (ETHZ) under the supervision of Professors Sara van de Geer and Didier Sornette, and was awarded my MSc.
Keywords that the thesis combines:
sparse regression maximum likelihood contagion processes count process inference social media models Hawkes process

The novel part is a method of sparse regression and identification of branching processes under inhomogeneous conditions, which I will summarise and make available when I have time.