# Model selection

Scaling laws for very large neural nets
Compute/size/data tradeoffs
2021-01-14 – 2023-02-16

Bayesian model selection by model evidence maximisation
Type II maximum likelihood, marginal maximum likelihood, Bayes Occam’s razor
2017-08-20 – 2022-12-22

Bayesian sparsity
2019-01-08 – 2022-10-25

Randomised linear algebra
2016-08-16 – 2022-10-22

Neural tangent kernel
2020-12-09 – 2022-10-14

Multi-objective optimisation
2021-07-14 – 2022-10-10

Forecasting
Vegan haruspicy
2015-06-16 – 2022-10-08

Penalised/regularised regression
2016-06-23 – 2022-09-19

Overparameterization in large models
Improper learning, benign overfitting, double descent
2018-04-04 – 2022-05-27

Forecasting with model averaging
Mixture of experts, ensembles and time series
2022-05-04

Hypothesis tests, statistical
2014-08-23 – 2022-01-27

Hyperparameter optimization
Replacing a hyperparameter problem with a hyperhyperparameter problem which feels like progress I guess
2020-09-25 – 2021-10-20

Gradient descent at scale
Practical implementation of large optimisations
2021-07-14 – 2021-09-28

Regularising neural networks
Generalisation for street fighters
2017-02-12 – 2021-09-24

Fractals and self-similarity
2011-11-13 – 2021-09-22

Meta learning
Few-shot learning, learning fast weights, learning to learn
2021-09-16

Sequential experiments
Especially multiple sequential experiments
2021-08-04

Random-forest-like methods
A selection of randomly stopped clocks is never far from wrong.
2015-09-23 – 2021-06-17

Stein’s method
His eyes are like angels but his heart is cold / No need to ask / He’s a Stein operator
2021-03-12 – 2021-06-01

Cross validation
2016-09-05 – 2021-05-13

Infinite width limits of neural networks
2020-12-09 – 2021-05-11

Compressing neural nets
Pruning, compacting and otherwise fitting a good estimate into fewer parameters
2016-10-14 – 2021-05-07

Differentiable model selection
Differentiable hyperparameter search, and architecture search, and optimisation optimisation by optimisation and so on
2020-09-25 – 2021-04-13

Generically approximating probability distributions
2021-03-12 – 2021-03-22

Matrix measure concentration inequalities and bounds
2014-11-25 – 2021-03-08

Measure concentration inequalities
On being 80% sure I am only 20% wrong
2014-11-25 – 2021-03-04

Weighted data in statistics
2020-11-04 – 2020-11-06

Automatic design of experiments
Minesweeper++
2017-04-11 – 2020-10-13

Sparse model selection
2016-09-05 – 2020-10-02

AutoML
2017-07-17 – 2020-10-02

Data summarization
On maps drawn at smaller than 1:1 scale
2019-01-14 – 2020-09-18

Independence, conditional, statistical
2016-04-21 – 2020-09-13

Minimum description length
2020-08-06

Model complexity penalties
Information criteria, degrees of freedom, etc.
2015-04-22 – 2020-06-22

Long memory time series
2011-11-13 – 2020-05-28

Model averaging
On keeping many incorrect hypotheses and using them all as one goodish one
2017-06-20 – 2020-03-22

Kernel approximation
2016-07-27 – 2020-03-06

Effective sample size
2016-11-21 – 2020-03-03

Convergence of random variables
2019-12-03

Phase retrieval
I’ve got the power. / Like the crack of the whip / I snap attack / Front to back
2017-01-16 – 2019-11-07

Sparse regression
2016-06-23 – 2019-10-24

Statistical learning theory for time series
2016-11-03 – 2019-10-01

Bayesian model selection
2017-08-20 – 2019-07-22

Wacky regression
2015-09-23 – 2019-05-02

Nearly sufficient statistics
How about “Sufficient sufficiency”? — is that taken?
2018-03-13 – 2019-01-14

Multiple testing
2015-04-22 – 2018-11-05

Post-selection inference
Adaptive data analysis without cheating
2017-08-20

Model/hyperparameter selection
2016-04-15 – 2017-08-20

Compressed sensing and sampling
A fancy way of counting zero
2014-08-18 – 2017-06-14

Stability (in learning)
2016-05-25 – 2016-10-05

Statistical learning theory
Eventually including structural risk minimisation, risk bounds, hopefully-uniform convergence rates, VC-dimension, generalisation-and-stability framings, etc.
2016-07-06 – 2016-08-16