time_series on Dan MacKinlay
https://danmackinlay.name/tags/time_series.html
Recent content in time_series on Dan MacKinlay. Last updated Tue, 13 Apr 2021.

Gaussian process regression
https://danmackinlay.name/notebook/gp_regression.html
Tue, 13 Apr 2021 14:40:14 +0800
Chi Feng’s GP regression demo.
Gaussian random fields are stochastic processes/fields with jointly Gaussian distributions of observations.

Dynamical systems via Koopman operators
https://danmackinlay.name/notebook/koopmania.html
Fri, 09 Apr 2021 11:46:21 +0800
NB: Koopman here is B.O. Koopman (Koopman 1931), not S.J. Koopman, who also works in dynamical systems.
I do not know how this works, but maybe this fragment of an abstract will do for now (Budišić, Mohr, and Mezić 2012):
A majority of methods from dynamical system analysis, especially those in applied settings, rely on Poincaré’s geometric picture that focuses on “dynamics of states.”

Signatures of rough paths
https://danmackinlay.name/notebook/signature_rough_paths.html
Fri, 02 Apr 2021 08:22:53 +1100
I am not sure yet. Some kind of encoding of signals which is somewhere between sampling theory, rough SDEs and integral transforms.
References: Bonnier, Patric, Patrick Kidger, Imanol Perez Arribas, Cristopher Salvi, and Terry Lyons. 2019. “Deep Signature Transforms.” In Advances in Neural Information Processing Systems. Vol. 32. Curran Associates, Inc. http://arxiv.org/abs/1905.08494. Chevyrev, Ilya, and Andrey Kormilitzin. 2016. “A Primer on the Signature Method in Machine Learning.”

Orthonormal and unitary matrices
https://danmackinlay.name/notebook/orthonormal_matrices.html
Thu, 11 Mar 2021 13:59:33 +1100
In which I think about parameterisations and implementations of finite-dimensional energy-preserving operators, a.k.a. orthonormal matrices. A particular nook in the linear feedback process library, closely related to stability in linear dynamical systems, since every orthonormal matrix is the forward operator of an energy-preserving system, which is an edge case for certain natural types of stability.

Convolutional subordinator processes
https://danmackinlay.name/notebook/subordinator_convolution.html
Mon, 08 Mar 2021 15:29:19 +1100
Stochastic processes by convolution of noise with smoothing kernels, where the driving noise is a Lévy subordinator.
Why would we want this? One reason is that this gives us a way to create nonparametric distributions over measures.
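A minimal sketch of the idea (my illustration, not code from the notebook): simulate the increments of a gamma subordinator on a grid and convolve them with a Gaussian smoothing kernel. Because both the increments and the kernel are nonnegative, the result is a nonnegative random function, which is what makes this a route to random measures.

```python
import numpy as np

rng = np.random.default_rng(42)

def subordinator_convolution(n=500, dt=0.01, shape=2.0, rate=1.0, bw=0.05):
    """Smooth gamma-subordinator increments with a Gaussian kernel.

    The increments of a gamma subordinator over disjoint intervals of
    length dt are i.i.d. Gamma(shape * dt, scale = 1/rate), hence
    nonnegative; convolving them with a nonnegative kernel gives a
    nonnegative random function, i.e. an unnormalised random density.
    """
    increments = rng.gamma(shape * dt, 1.0 / rate, size=n)
    lags = np.arange(-4 * bw / dt, 4 * bw / dt + 1) * dt
    kernel = np.exp(-0.5 * (lags / bw) ** 2)
    kernel /= kernel.sum()
    return np.convolve(increments, kernel, mode="same")

field = subordinator_convolution()
```

All parameter names and defaults here are hypothetical choices for the sketch, not anything canonical.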
References: Barndorff-Nielsen, O. E., and J. Schmiegel. 2004. “Lévy-Based Spatial-Temporal Modelling, with Applications to Turbulence.” Russian Mathematical Surveys 59 (1): 65. https://doi.org/10.1070/RM2004v059n01ABEH000701. Çinlar, E. 1979. “On Increasing Continuous Processes.”

Convolutional Gaussian processes
https://danmackinlay.name/notebook/gp_convolution.html
Mon, 01 Mar 2021 17:08:51 +1100
Gaussian processes by convolution of noise with smoothing kernels, which is a kind of dual to defining them through covariances.
This is especially interesting because it can be made computationally convenient (we can enforce locality) and can accommodate non-stationarity.
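To make the locality point concrete, here is a hedged sketch (mine, not from the post): convolve white noise with a compactly supported kernel. The implied covariance is the kernel’s autocorrelation, so dependence is exactly banded.

```python
import numpy as np

rng = np.random.default_rng(0)

def gp_by_convolution(n=400, width=10):
    """Draw an approximate Gaussian process path by convolving white
    noise with a compactly supported (here triangular) kernel.

    The implied covariance is the autocorrelation of the kernel, so a
    compact kernel gives exactly banded (local) dependence: points more
    than 2 * width apart are independent.
    """
    noise = rng.standard_normal(n + 2 * width)
    kernel = 1.0 - np.abs(np.arange(-width, width + 1)) / (width + 1.0)
    kernel /= np.sqrt((kernel ** 2).sum())  # unit marginal variance
    return np.convolve(noise, kernel, mode="valid")

path = gp_by_convolution()
```

The triangular kernel and the normalisation are illustrative assumptions; any square-integrable kernel works the same way.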
Convolutions with respect to a non-stationary driving noise H. K.

Random fields as stochastic differential equations
https://danmackinlay.name/notebook/random_fields_as_sdes.html
Mon, 01 Mar 2021 17:08:40 +1100
The representation of certain random fields, especially Gaussian random fields, as stochastic differential equations. This is the engine that makes filtering Gaussian processes go, and is also a natural framing for probabilistic spectral analysis.

Convolutional stochastic processes
https://danmackinlay.name/notebook/stochastic_convolution.html
Mon, 01 Mar 2021 16:13:24 +1100
Stochastic processes generated by convolution of white noise with smoothing kernels, which is not unlike kernel density estimation where the “data” is random.
For now, I am mostly interested in certain special cases: Gaussian process convolutions and subordinator convolutions.
patrick-kidger/Deep-Signature-Transforms: Code for “Deep Signature Transforms”. patrick-kidger/signatory: Differentiable computations of the signature and logsignature transforms, on both CPU and GPU.
References: Bolin, David.

Multi-output Gaussian process regression
https://danmackinlay.name/notebook/gp_regression_vector.html
Tue, 23 Feb 2021 12:09:36 +1100
In which I discover for myself whether “multi-task” and “co-regionalized” approaches are different. Álvarez, Rosasco, and Lawrence (2012)
Overview from Invenia: Gaussian Processes: from one to many outputs
Co-regionalization: “[the] community has begun to turn its attention to covariance functions for multiple outputs.” One of the paradigms that has been considered (Bonilla, Chai, and Williams 2007; Osborne et al.

Voice transcriptions and speech recognition
https://danmackinlay.name/notebook/speech_transcription.html
Sat, 30 Jan 2021 11:10:49 +1100
The converse of voice fakes: generating text from speech, a.k.a. speech-to-text.
This is an older practice than I thought. Check out Volume 89 of Popular Science monthly: Lloyd Darling, The Marvelous Voice Typewriter for the state-of-the-art dictation machine of 1916 (PDF version).
Dictation: speaking as a realtime interactive textual input method. See the following roundups of dictation apps to start:
Zapier’s dictation roundup; the rather grimmer Linux-specific roundup.

Stochastic partial differential equations
https://danmackinlay.name/notebook/spdes.html
Wed, 27 Jan 2021 12:42:41 +1100
Placeholder, for the multidimensional PDE version of SDEs.
This picture of ice floes on the Bering shelf looks like it might be some kinda stochastic PDE thing, right?
References: Bolin, David, and Kristin Kirchner. 2020. “The Rational SPDE Approach for Gaussian Random Fields With General Smoothness.” Journal of Computational and Graphical Statistics 29 (2): 274–85. https://doi.org/10.1080/10618600.2019.1665537. Dalang, Robert C., Davar Khoshnevisan, and Firas Rassoul-Agha, eds.

Feynman-Kac formulae
https://danmackinlay.name/notebook/feynman_kac.html
Wed, 27 Jan 2021 11:55:19 +1100
There is a mathematically rich theory about how particle filters work. The notoriously abstruse Del Moral (2004) and Doucet, Freitas, and Gordon (2001) are universally commended for unifying and making consistent the diffusion processes, Feynman-Kac formulae, and “propagation of chaos”. I will get around to them eventually, maybe?
References: Cérou, F., P. Del Moral, T. Furon, and A. Guyader. 2011. “Sequential Monte Carlo for Rare Event Estimation.”

Nonparametrically learning dynamical systems
https://danmackinlay.name/notebook/nn_learning_dynamics.html
Tue, 08 Dec 2020 13:05:58 +1100
Learning stochastic differential equations. Related: analysing a neural net itself as a dynamical system, which is not quite the same but crosses over. Variational state filters.
A deterministic version of this problem is what e.g. the famous Vector Institute Neural ODE paper (Chen et al. 2018) did. Author Duvenaud argues that in some ways the hype ran away with the Neural ODE paper, and credits CasADI with the innovations here.

Multi-output Gaussian process regression
https://danmackinlay.name/notebook/gp_regression_functional.html
Mon, 07 Dec 2020 20:43:06 +1100
Learning operators via GPs.
References: Brault, Romain, Florence d’Alché-Buc, and Markus Heinonen. 2016. “Random Fourier Features for Operator-Valued Kernels.” In Proceedings of The 8th Asian Conference on Machine Learning, 110–25. http://arxiv.org/abs/1605.02536. Brault, Romain, Néhémy Lim, and Florence d’Alché-Buc. n.d. “Scaling up Vector Autoregressive Models With Operator-Valued Random Fourier Features.” Accessed August 31, 2016. https://aaltd16.irisa.fr/files/2016/08/AALTD16_paper_11.pdf. Brouard, Céline, Marie Szafranski, and Florence D’Alché-Buc.

Probabilistic spectral analysis
https://danmackinlay.name/notebook/probabilistic_spectral_analysis.html
Wed, 25 Nov 2020 11:33:34 +1100
Graphical introduction to nonstationary modelling of audio data: the input (bottom) is a sound recording of female speech. We seek to decompose the signal into Gaussian process carrier waveforms (blue block) multiplied by a spectrogram (green block). The spectrogram is learned from the data as a nonnegative matrix of weights times positive modulators (top).

Hidden Markov Model inference for Gaussian Process regression
https://danmackinlay.name/notebook/gp_filtering.html
Wed, 25 Nov 2020 11:28:43 +1100
Classic flavours together: Gaussian processes and state filters / stochastic differential equations, and random fields as stochastic differential equations.
I am interested here in the trick which makes certain Gaussian process regression problems soluble by making them local, i.e. Markov, with respect to some assumed hidden state, in the same way Kalman filtering does Wiener filtering. This means you get to solve a GP as an SDE.

Observability and sensitivity in learning dynamical systems
https://danmackinlay.name/notebook/sensitivity.html
Mon, 09 Nov 2020 13:38:40 +1100
The contact between ergodic theorems and statistical identifiability: how precisely can I learn a given parameter of a dynamical system from observation? In ODE theory a useful concept is sensitivity analysis, which tells us how much gradient information our observations give us about a parameter. This comes in local (at my current estimate) and global (over all parameter ranges) flavours.
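A tiny worked example of local sensitivity analysis (my sketch, not from the post): augment the state with the parameter derivative and integrate the variational equation alongside the ODE. The toy system, step count, and parameter values are all arbitrary choices.

```python
import math

def sensitivity_euler(theta=0.3, x0=1.0, T=1.0, steps=1000):
    """Local sensitivity analysis for the toy ODE dx/dt = theta * x.

    Augment the state with the sensitivity s = dx/dtheta, which obeys the
    variational equation ds/dt = (df/dx) s + df/dtheta = theta * s + x,
    and integrate both with forward Euler.
    """
    dt = T / steps
    x, s = x0, 0.0
    for _ in range(steps):
        x, s = x + dt * theta * x, s + dt * (theta * s + x)
    return x, s

x_T, s_T = sensitivity_euler()
# Analytically x(T) = x0 * exp(theta * T) and s(T) = T * x0 * exp(theta * T).
```

The size of `s_T` is exactly the "gradient information" the text refers to: a near-zero sensitivity would make `theta` hard to identify from observations of `x`.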
In linear systems theory the term observability is used to discuss whether we can in fact identify a parameter or a latent state, which I will conflate for the current purposes.

Path continuity of stochastic processes
https://danmackinlay.name/notebook/path_continuity.html
Tue, 27 Oct 2020 07:22:04 +1100
“When are the paths of a stochastic process continuous?” is a question one might like to ask. But things are never so simple in stochastic process theory. Continuity is not unambiguous here; we need to ask more precise questions. If we are concerned about whether the paths sampled from the process are almost-surely continuous functions, then we probably mean something like:

Time
https://danmackinlay.name/notebook/time.html
Tue, 20 Oct 2020 13:36:11 +1100
🏗
Zach Holman’s UTC is enough for everyone, right?, which is made even better by its reading list:
Falsehoods Programmers Believe About Time and its sequel, More Falsehoods Programmers Believe About Time. These get into some really nitty-gritty programming-time edge cases and weird things to keep in mind as you go. Lovely stuff, and if you haven’t read them yet (or even if you haven’t read them recently), you should go and give ’em a read now.

Itō-Taylor expansion
https://danmackinlay.name/notebook/stochastic_taylor_expansion.html
Thu, 15 Oct 2020 13:38:07 +1100
Placeholder, for discussing the Taylor expansion equivalent for an SDE.
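The practical payoff of truncating such an expansion is numerical SDE schemes. As a hedged sketch (my illustration, not from the post): the Milstein scheme, one order beyond Euler–Maruyama, applied to geometric Brownian motion, where the closed-form solution lets us check the pathwise error.

```python
import numpy as np

rng = np.random.default_rng(1)

def milstein_gbm(x0=1.0, mu=0.05, sigma=0.2, T=1.0, steps=1000):
    """One path of geometric Brownian motion dX = mu X dt + sigma X dB
    via the Milstein scheme, i.e. the Ito-Taylor expansion truncated one
    order beyond Euler-Maruyama:
        X_{n+1} = X_n + a dt + b dB + (1/2) b b' (dB^2 - dt),
    with a(x) = mu x, b(x) = sigma x, so b'(x) = sigma.
    """
    dt = T / steps
    x, b_total = x0, 0.0
    for _ in range(steps):
        db = rng.normal(0.0, np.sqrt(dt))
        x += mu * x * dt + sigma * x * db + 0.5 * sigma * sigma * x * (db * db - dt)
        b_total += db
    return x, b_total

x_num, b_T = milstein_gbm()
# GBM has a closed form, X_T = x0 * exp((mu - sigma^2/2) T + sigma B_T),
# so we can check the strong (pathwise) error against the same Brownian path.
x_exact = np.exp((0.05 - 0.5 * 0.2 ** 2) + 0.2 * b_T)
```

All parameter values are arbitrary; the point is only that the extra Itō–Taylor term buys strong order 1 instead of Euler–Maruyama’s order 1/2.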
Let \(f\) denote a smooth function. Then from Itō’s lemma, \[ f\left(X_{t}\right)=f\left(X_{0}\right)+\int_{s=0}^{t} L^{0} f\left(X_{s}\right) \mathrm{d} s+\int_{s=0}^{t} L^{1} f\left(X_{s}\right) \mathrm{d} B_{s} \] where the operators \(L^{0}\) and \(L^{1}\) are defined by \[ L^{0}=a(x) \frac{\partial}{\partial x}+\frac{1}{2} b(x)^{2} \frac{\partial^{2}}{\partial x^{2}} \quad \text { and } \quad L^{1}=b(x) \frac{\partial}{\partial x}. \] We may repeat this procedure arbitrarily many times.

Gamma processes
https://danmackinlay.name/notebook/gamma_processes.html
Tue, 13 Oct 2020 15:13:34 +1100
Gamma processes provide the classic subordinator models, i.e. non-decreasing Lévy processes. By “gamma process” in fact I mean specifically a Lévy process with gamma increments.

Monte Carlo optimisation
https://danmackinlay.name/notebook/mc_opt.html
Wed, 30 Sep 2020 10:59:22 +1000
Optimisation via Monte Carlo simulation. Annealing and all that. TBD.
References: Abernethy, Jacob, and Elad Hazan. 2016. “Faster Convex Optimization: Simulated Annealing with an Efficient Universal Barrier.” In International Conference on Machine Learning, 2520–28. PMLR. http://proceedings.mlr.press/v48/abernethy16.html. Botev, Zdravko I., and Dirk P. Kroese. 2008. “An Efficient Algorithm for Rare-Event Probability Estimation, Combinatorial Optimization, and Counting.” Methodology and Computing in Applied Probability 10 (4): 471–505.

Splitting simulation
https://danmackinlay.name/notebook/splitting_simulation.html
Mon, 28 Sep 2020 10:38:21 +1000
Splitting is a method for zooming in to the important region of an intractable probability distribution.
I have just spent so much time writing about this that I had better pause for a while and leave this as a placeholder.
References: Aalen, Odd O., Ørnulf Borgan, and S. Gjessing. 2008. Survival and Event History Analysis: A Process Point of View. Statistics for Biology and Health.

Filter design, linear
https://danmackinlay.name/notebook/filter_design_linear.html
Fri, 18 Sep 2020 10:15:52 +1000
Linear Time-Invariant (LTI) filter design is a field of signal processing, and a special case of state filtering that doesn’t necessarily involve a hidden state.
z-Transforms, bilinear transforms, Bode plots, design etc.
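A worked instance of the bilinear transform (my sketch, with arbitrary cutoff and sample rate): discretise a one-pole RC low-pass into a digital difference equation.

```python
import math

def onepole_lowpass(x, fc, fs):
    """One-pole RC low-pass H(s) = 1 / (1 + s/wc), discretised with the
    bilinear transform s <- (2/T)(1 - z^-1)/(1 + z^-1).

    With the usual frequency pre-warp a = tan(pi * fc / fs), the
    difference equation is (1 + a) y[n] = a (x[n] + x[n-1]) + (1 - a) y[n-1].
    """
    a = math.tan(math.pi * fc / fs)
    b0 = b1 = a / (1.0 + a)
    pole = (1.0 - a) / (1.0 + a)  # |pole| < 1 for a > 0: always stable
    y, y_prev, x_prev = [], 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x_prev + pole * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

# DC passes with unit gain: a constant input settles to the same constant.
out = onepole_lowpass([1.0] * 200, fc=100.0, fs=8000.0)
```

The bilinear transform’s frequency warping is absorbed into `a` so the analogue cutoff lands exactly at `fc` in the digital filter.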
I am going to consider this in discrete time (i.e. for digital implementation) unless otherwise stated, because I’m implementing this in software, not with capacitors or whatever.

Non-Gaussian Bayesian functional regression
https://danmackinlay.name/notebook/stochastic_process_regression.html
Wed, 16 Sep 2020 14:07:32 +1000
Regression using non-Gaussian random fields; generalised Gaussian process regression.
Is there ever an actual need for this? Or can we just use a mostly-Gaussian process with some non-Gaussian marginal distribution and pretend, via GP quantile regression, some variational GP approximation, or a non-Gaussian likelihood over Gaussian latents? Presumably if we suspect that moments higher than the second are important, or that there is some actual stochastic process that we know matches our phenomenon, we might bother with this, but oh my it can get complicated.

Gaussian process quantile regression
https://danmackinlay.name/notebook/gp_quantile_regression.html
Wed, 16 Sep 2020 13:44:32 +1000
How to do quantile regression with GPs.
References: Boukouvalas, Alexis, Remi Barillec, and Dan Cornford. 2012. “Gaussian Process Quantile Regression Using Expectation Propagation.” In ICML 2012. http://arxiv.org/abs/1206.6391. Reich, Brian J. 2012. “Spatiotemporal Quantile Regression for Detecting Distributional Changes in Environmental Processes.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 61 (4): 535–53. https://doi.org/10.1111/j.1467-9876.2011.01025.x. Reich, Brian J., Montserrat Fuentes, and David B.

Statistics of spatio-temporal processes
https://danmackinlay.name/notebook/spatio_temporal.html
Fri, 11 Sep 2020 13:30:12 +1000
The dynamics of spatial processes evolving in time.
Clearly there are many different problems one might wonder about here. I am thinking in particular of the kind of problem whose discretisation might look like this, as a graphical model.
This is highly stylized: I’ve imagined there is one spatial dimension, but usually there would be two or three. The observed nodes are where we have sensors that can measure the state of some parameter of interest \(w\) which evolves in time \(t\).

Online learning
https://danmackinlay.name/notebook/online_learning.html
Wed, 26 Aug 2020 16:48:40 +1000
An online learning perspective gives bounds on the regret: the gap in performance between online estimation and the optimal estimator with access to the entire data.
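A minimal regret computation (my sketch; the losses, step sizes, and data are arbitrary choices): run online gradient descent on a stream of squared losses and compare its cumulative loss with the best fixed parameter in hindsight.

```python
import numpy as np

rng = np.random.default_rng(7)

def ogd_regret(T=2000, lr0=0.5):
    """Online gradient descent on the squared losses l_t(w) = (w - z_t)^2,
    with step sizes eta_t = lr0 / sqrt(t).

    Regret is cumulative loss minus the loss of the best fixed w in
    hindsight (here the empirical mean of z); for OGD it grows like
    O(sqrt(T)), so average regret vanishes.
    """
    z = rng.normal(0.3, 1.0, size=T)
    w, total = 0.0, 0.0
    for t, zt in enumerate(z, start=1):
        total += (w - zt) ** 2                    # suffer the loss
        w -= (lr0 / np.sqrt(t)) * 2.0 * (w - zt)  # then take a gradient step
    best = np.sum((z.mean() - z) ** 2)            # best fixed comparator
    return total - best

regret = ogd_regret()
```

Note the regret is against a *fixed* comparator chosen with full hindsight, which is exactly the "access to the entire data" framing above; it is nonnegative by construction.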
A lot of things are sort-of online learning; stochastic gradient descent, for example, is closely related. However, if you meet someone who claims to study “online learning” they usually mean to emphasise particular things.

Simulation based inference
https://danmackinlay.name/notebook/simulation_based_inference.html
Tue, 25 Aug 2020 10:18:37 +1000
Simulation-based inference, likelihood-free inference, and approximate Bayesian computation are all terrible descriptions. There are many ways that inference can be based upon simulations, many types of freedom from likelihood, and many ways to approximate Bayesian computation. However, all these terms together refer to a particular thing.
TBD: relationship between this and indirect inference. They look similar but tend not to cite each other. Is this a technical or sociological difference?

Stochastic signal sampling
https://danmackinlay.name/notebook/signal_sampling_stochastic.html
Thu, 11 Jun 2020 06:45:08 +1000
Signal sampling is the study of approximating continuous signals with discrete ones and vice versa. What if the signal you are trying to recover is random, but you have a model for that randomness, and can thus assign likelihoods (posterior probabilities, even) to some sample paths? Now you are sampling a stochastic process.
This is a particular take on a classic inverse problem that arises in many areas, framed how electrical engineers frame it.

Functional regression
https://danmackinlay.name/notebook/functional_data.html
Thu, 28 May 2020 11:17:20 +1000
Statistics where the samples are not just data but whole curves and manifolds, or subsamples from them. Function approximation meets statistics.
Regression using curves. To quote Jim Ramsay:
Functional data analysis, […] is about the analysis of information on curves or functions. For example, these twenty traces of the writing of “fda” are curves in two ways: first, as static traces on the page that you see after the writing is finished, and second, as two sets of functions of time, one for the horizontal “X” coordinate, and the other for the vertical “Y” coordinate.

Long memory time series
https://danmackinlay.name/notebook/long_memory_processes.html
Thu, 28 May 2020 10:56:49 +1000
Hurst exponents, non-stationarity etc.
TBD.
References: Beran, Jan. 1992. “Statistical Methods for Data with Long-Range Dependence.” Statistical Science 7 (4): 404–16. ———. 1994. Statistics for Long-Memory Processes. CRC Press. http://books.google.com?id=jdzDYWtfPC0C. ———. 2010. “Long-Range Dependence.” Wiley Interdisciplinary Reviews: Computational Statistics 2 (1): 26–35. https://doi.org/10.1002/wics.52. Beran, Jan, and Norma Terrin. 1996. “Testing for a Change of the Long-Memory Parameter.” Biometrika 83 (3): 627–38. https://doi.

Voice fakes
https://danmackinlay.name/notebook/voice_fakes.html
Wed, 27 May 2020 20:42:58 +1000
A placeholder. Generating speech without a speaker, or possibly style-transferring speech.
Style transfer: you have a recording of me saying something self-incriminating. You would prefer it to be a recording of Hillary Clinton saying something incriminating. This is achievable.
There has been a tendency for the open-source options to be fairly mediocre, while the pay-to-play options leave provocative demos about but do not let you use them.

Malliavin calculus
https://danmackinlay.name/notebook/malliavin_calculus.html
Mon, 25 May 2020 08:15:48 +1000
This is actually the Northern Lights in 1883, but let us pretend it is something to do with Malliavin calculus.
You can calculate a derivative of densities for stochastic processes in some generalised sense which I do not at present understand, and do the normal calculus things you do with a derivative. Stochastic differential equations arise, presumably ones in some sense involving this generalised derivative, which can then solve some kinds of problems for you.

Lévy stochastic differential equations
https://danmackinlay.name/notebook/levy_sdes.html
Sat, 23 May 2020 18:19:58 +1000
Stochastic differential equations driven by Lévy noise are not so tidy as Itō diffusions (although they are still somewhat tidy), so they are frequently brushed aside in stochastic calculus texts. But I need ’em! There is a developed sampling theory for these creatures called sparse stochastic process theory.
Possibly also chaos expansions might be a useful tool for modelling these, and/or Malliavin calculus, whatever that is.

Forecasting
https://danmackinlay.name/notebook/forecasting.html
Thu, 21 May 2020 11:32:13 +1000
Time series prediction niceties, where what needs to be predicted is the future. Filed under forecasting because in machine learning terminology, prediction is a general term that does not necessarily imply extrapolation into the future.
🏗 handball to Rob Hyndman.
Model selection: Rob Hyndman explains how to cross-validate time series models that use only the lagged observations.

Audio/music corpora
https://danmackinlay.name/notebook/audio_corpora.html
Tue, 19 May 2020 20:46:25 +1000
Datasets of sound tend to be called audio corpora for reasons of tradition.
https://danmackinlay.name/notebook/stochastic_differential_equations.html
Mon, 18 May 2020 12:23:18 +1000
Placeholder.
SDEs are time-indexed, causal stochastic processes which notionally integrate an ordinary differential equation over some driving noise. As seen in state filters, optimal control, financial mathematics etc.
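The "integrate an ODE over some driving noise" picture can be made concrete in a few lines (my sketch, with arbitrary parameters): the simplest scheme, Euler–Maruyama, applied to an Ornstein–Uhlenbeck process.

```python
import numpy as np

rng = np.random.default_rng(3)

def euler_maruyama_ou(theta=1.0, sigma=0.5, x0=2.0, T=5.0, steps=5000):
    """Euler-Maruyama discretisation of the Ornstein-Uhlenbeck SDE
    dX = -theta X dt + sigma dB: take the drift as an ODE step and add an
    increment of the driving Brownian noise at every step.
    """
    dt = T / steps
    x = np.empty(steps + 1)
    x[0] = x0
    for n in range(steps):
        x[n + 1] = x[n] - theta * x[n] * dt + sigma * rng.normal(0.0, np.sqrt(dt))
    return x

path = euler_maruyama_ou()
# The path mean-reverts towards 0; the stationary law is N(0, sigma^2 / (2 theta)).
```

The OU process is the canonical example precisely because it sits at the intersection named above: it is the state equation of the scalar Kalman filter and the Vasicek model of financial mathematics.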
Terminology problem: when people talk about these they often mean, conceptually, stochastic integral equations, in the sense that the driving noise process is an integrator. When you differentiate the noise process instead, it leads, AFAICT, to Malliavin calculus.

Contact tracing
https://danmackinlay.name/notebook/contact_tracing.html
Sun, 10 May 2020 07:21:14 +1000
Dr. Evans, 1917, How to keep well.
Privacy-respecting computing approaches are becoming important in this time of epidemics.
A recent round-up by Patrick Howell O'Neill, Tate Ryan-Mosley and Bobbie Johnson lists some of the apps in action.
John Langford:
For the following, a key distinction to understand is between proximity and location approaches. In proximity approaches (such as DP3T, TCN, MIT PACT(*), Apple, or one of the UW PACT(*) protocols which I am involved in) smartphones use Bluetooth low energy and possibly ultrasonics to discover other smartphones nearby.

Likelihood free inference
https://danmackinlay.name/notebook/likelihood_free_inference.html
Wed, 22 Apr 2020 17:36:41 +1000
Finding the target without directly inspecting the likelihood of the current guess.
A terrible term which seems to have a couple of distinct uses; I do not yet understand which are the same.
I mean this in the sense of trying to approximate intractable likelihoods; there seems also to be a school which would like to use this term for methods which make no reference to probability densities whatsoever.

Particle filters
https://danmackinlay.name/notebook/particle_filters.html
Wed, 08 Apr 2020 10:50:05 +1000
A field of study concerning certain kinds of stochastic processes. The easiest entry point is IMO to think about randomised generalisations of state filter models. This has nothing to do with filters for particulate matter as seen in respirators.
There is too much confusing and unhelpful terminology here, and I am only at the fringe of this field, so I will not attempt to typologize.

Queueing
https://danmackinlay.name/notebook/queueing.html
Mon, 06 Apr 2020 22:09:14 +1000
Not much to say here right now, except that I always forget the name of the useful tool from queueing theory, Kingman’s approximation for waiting time.
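The approximation (written out below) is simple enough to keep around as a helper; a sketch of mine, with the M/M/1 case as a sanity check:

```python
def kingman_wait(rho, ca, cs, tau):
    """Kingman's heavy-traffic approximation for mean waiting time in a
    G/G/1 queue, E[Wq] ~ (rho / (1 - rho)) * ((ca^2 + cs^2) / 2) * tau,
    where rho is utilisation, ca and cs are the coefficients of variation
    of interarrival and service times, and tau is the mean service time.
    """
    return (rho / (1.0 - rho)) * ((ca ** 2 + cs ** 2) / 2.0) * tau

# Sanity check: for M/M/1 (ca = cs = 1) the approximation recovers the
# exact mean wait rho * tau / (1 - rho).
w = kingman_wait(rho=0.9, ca=1.0, cs=1.0, tau=1.0)  # -> 9.0
```

The `(ca^2 + cs^2)/2` factor is what makes the formula useful: it quantifies how burstiness in arrivals or service inflates waiting beyond the Markovian case.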
\[ \mathbb{E}(W_{q})\approx \left(\frac{\rho}{1-\rho}\right)\left(\frac{c_{a}^{2}+c_{s}^{2}}{2}\right)\tau \] where \(\tau\) is the mean service time (i.e. \(\mu = 1/\tau\) is the service rate), \(\lambda\) is the mean arrival rate, \(\rho = \lambda/\mu\) is the utilization, \(c_a\) is the coefficient of variation for arrivals (that is, the standard deviation of arrival times divided by the mean arrival time) and \(c_s\) is the coefficient of variation for service times.

Epidemics
https://danmackinlay.name/notebook/epidemics.html
Fri, 03 Apr 2020 12:19:15 +1100
Buy this from sam.
A grab-bag of links about disease spread in its messy glory.
[Microbescope by David McCandless, Omid Kashan, Miriam Quick, Karl Webster, Dr Stephanie Starling]
The spread of diseases in populations. A nitty-gritty messy empirical application for those abstract contagion models.
Connection with global trade networks: Cosma Shalizi on Ebola and Mongol Modernity.

Nonparametrically learning spatiotemporal systems
https://danmackinlay.name/notebook/nn_spatiotemporal.html
Thu, 02 Apr 2020 17:26:01 +1100
On learning stochastic partial differential equations and other processes using neural networks, Gaussian processes, and other differentiable techniques. Uses the tools of dynamical NNs and their ilk. Probably handy for machine learning physics.
I know little about this yet, but here are some links.
References: Arridge, Simon, Peter Maass, Ozan Öktem, and Carola-Bibiane Schönlieb. 2019. “Solving Inverse Problems Using Data-Driven Models.” Acta Numerica 28 (May): 1–174.

Effective sample size
https://danmackinlay.name/notebook/effective_sample_size.html
Tue, 03 Mar 2020 12:14:58 +1100
We have an estimator \(\hat{\theta}\) of some statistic which is, for the sake of argument, presumed to be the mean calculated from observations of some stochastic process \(\mathsf{v}\). Under certain assumptions we can use central limit theorems to find that the variance of our estimator calculated from \(N\) i.i.d. samples is given by \(\operatorname{Var}(\hat{\theta})\propto 1/N.\) Effective Sample Size (ESS) gives us a different \(N\), \(N_{\text{Eff}}\), such that \(\operatorname{Var}(\hat{\theta})\propto 1/N_{\text{Eff}}.\)

Cepstral transforms and harmonic identification
https://danmackinlay.name/notebook/cepstrum.html
Thu, 13 Feb 2020 19:19:46 +1100
See also machine listening, system identification.
The cepstrum of a time series represents the power spectrum using a log link function. I haven’t actually read the foundational literature here (e.g. Bogert, Healy, and Tukey 1963), merely used some algorithms; but it seems to be mostly a hack for rapid identification of correlation lags where said lags are long.
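The lag-identification hack in a few lines (my sketch, not from the cited literature): an echo at lag d adds a cosine ripple to the log spectrum, which the inverse FFT turns into a cepstral peak at quefrency d. All signal parameters here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(x)) + 1e-12  # guard against log(0)
    return np.fft.irfft(np.log(spectrum), n=len(x))

# A signal plus a scaled echo of itself at lag d produces a cosine ripple
# in the log spectrum, hence a cepstral peak at quefrency d.
n, d = 2048, 50
s = rng.standard_normal(n)
x = s.copy()
x[d:] += 0.5 * s[:-d]
c = real_cepstrum(x)
lag = 20 + int(np.argmax(c[20:200]))  # search away from the low quefrencies
```

The log link is what turns the multiplicative source–filter (or signal–echo) structure of the spectrum into an additive one, so the lag falls out as a simple peak.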
For a generalized modern version, see Proietti and Luati (2019).

Random change of time
https://danmackinlay.name/notebook/change_of_time.html
Mon, 10 Feb 2020 14:52:59 +1100
🏗 Various notes on a.e.-continuous monotonic random changes of index, in order to build new processes.
In warping and registration problems you try to align two or more processes; this can sometimes be an alignment problem, but not necessarily.
To explore: the Lamperti representation for continuous-state branching processes.
Ogata’s time rescaling: intensity estimation for point processes uses this as a statistical test.

Cascade models
https://danmackinlay.name/notebook/cascade_models.html
Mon, 10 Feb 2020 09:28:49 +1100
\(\newcommand{\rv}[1]{\mathsf{#1}}\)
Models for, loosely, the total population size arising, over all generations, from the offspring of some progenitor.
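A quick simulation of that total-population quantity (my sketch, with an arbitrary Poisson offspring law): for subcritical mean offspring \(m < 1\), the expected total over all generations is \(1/(1-m)\).

```python
import numpy as np

rng = np.random.default_rng(5)

def mean_total_progeny(mean_offspring=0.5, trials=20000):
    """Simulate the total population, over all generations, of a
    Galton-Watson cascade with Poisson(mean_offspring) offspring per
    individual. For subcritical mean m < 1 the expected total is 1/(1-m).
    """
    totals = np.empty(trials)
    for i in range(trials):
        alive, total = 1, 1
        while alive > 0:
            # Sum of `alive` independent Poisson(m) draws is Poisson(m * alive).
            alive = rng.poisson(mean_offspring * alive)
            total += alive
        totals[i] = total
    return totals.mean()

est = mean_total_progeny()
# Theory: 1 / (1 - 0.5) = 2.
```

The same recursion, with the transmission count \(\mathsf{n}_i\) in place of Poisson draws, is exactly the influenza example discussed next.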
Let us suppose that each individual \(i\) who catches a certain strain of influenza will go on to infect a further \(\rv{n}_i\sim F\) others. Assume the population is infinite, that no one catches influenza twice, and that the number of transmissions of the disease is identically distributed for everyone who catches it.

Branching processes
https://danmackinlay.name/notebook/branching_processes.html
Fri, 07 Feb 2020 17:33:31 +1100
A diverse class of stochastic models that I am mildly obsessed with, where over some index set (usually time, space, or both) there are distributed births of some kind, and we count the total population.

Survival analysis and reliability
https://danmackinlay.name/notebook/survival_analysis.html
Wed, 05 Feb 2020 14:08:55 +1100
Estimating survival rates. Here’s the set-up: looking at a data set of individuals’ lifespans, you would like to infer their distribution. Analysing when people die, or things break, etc. The statistical problem of estimating how long people’s lives are is complicated somewhat by the particular structure of the data: loosely, “every person dies at most one time”, and there are certain characteristic difficulties that arise, such as right-censorship.