Categorical random variates
February 20, 2017 — January 12, 2022
Distributions over categories.
1 Stick breaking tricks
Recommended reading: Machine Learning Trick of the Day (6): Tricks with Sticks, by Shakir Mohamed.
TBC.
2 via random measures
See random measures.
3 Gumbel-max
See Gumbel-max tricks.
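As a reminder of the basic trick, here is a minimal sketch using only the Python standard library. It adds independent standard Gumbel noise to each unnormalised log-probability and returns the index of the maximum, which is then a draw from the corresponding categorical distribution. The function name `gumbel_max_sample` and the `rng` argument are my own illustrative choices, not anything from the linked notes.

```python
import math
import random
import sys


def gumbel_max_sample(log_weights, rng=random):
    """Draw one index i with probability proportional to exp(log_weights[i])."""
    best_index, best_value = -1, -math.inf
    for i, log_w in enumerate(log_weights):
        # rng.random() lies in [0, 1); guard against an exact 0 so log() is defined.
        u = max(rng.random(), sys.float_info.min)
        gumbel_noise = -math.log(-math.log(u))  # standard Gumbel variate
        value = log_w + gumbel_noise
        if value > best_value:
            best_index, best_value = i, value
    return best_index


if __name__ == "__main__":
    rng = random.Random(1)
    log_weights = [math.log(0.7), math.log(0.2), math.log(0.1)]
    draws = [gumbel_max_sample(log_weights, rng) for _ in range(10)]
    print(draws)
```

The log-weights do not need to be normalised, which is much of the appeal: the normalising constant never has to be computed.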
4 Pólya-Gamma augmentation
See Pólya-Gamma.
5 Softmax models
TBC.
6 Multicategorical distributions
Can something belong to many categories? Then we are probably looking for Paintbox models (Broderick, Pitman, and Jordan 2013; Zhang and Paisley 2019) or some kind of multivariate Bernoulli model.
7 Dirichlet distribution
TBD. See Dirichlet distributions.
8 Dirichlet process
TBD. A distribution over an unknown number of categories. See also Gamma processes, which is how I learned to understand Dirichlet processes, insofar as I do.
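One standard way to get a feel for "an unknown number of categories" is the stick-breaking construction of the Dirichlet process weights: break off a Beta(1, α) fraction of whatever stick remains, repeatedly. The sketch below truncates after a fixed number of breaks and reports the leftover mass. The names `stick_breaking_weights`, `alpha`, and `num_sticks` are hypothetical, and the truncation is a convenience assumption; the actual process has infinitely many weights.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng=random):
    """Truncated stick-breaking weights with concentration alpha.

    Returns (weights, remaining), where remaining is the unassigned mass.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)  # share of the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


if __name__ == "__main__":
    rng = random.Random(42)
    weights, remaining = stick_breaking_weights(alpha=2.0, num_sticks=10, rng=rng)
    print([round(w, 3) for w in weights], round(remaining, 3))
```

Small `alpha` tends to put most of the mass on the first few categories; larger `alpha` spreads it over many.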
9 Parametric distributions over non-negative integers
See count models.
10 Ordinal
If there is a natural ordering to the categories, then we are in a weird place. TBC.
11 Calibration
In the context of binary classification, calibration refers to the process of transforming the output scores from a binary classifier to class probabilities. If we think of the classifier as a “black box” that transforms input data into a score, we can think of calibration as a post-processing step that converts the score into a probability of the observation belonging to class 1.
The scores from some classifiers can already be interpreted as probabilities (e.g. logistic regression), while the scores from some classifiers require an additional calibration step before they can be interpreted as such (e.g. support vector machines).
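To make the post-processing step concrete, here is a minimal sketch of one common choice, Platt-style sigmoid scaling: fit p = sigmoid(a·score + b) to held-out (score, label) pairs by gradient descent on the log loss. This is an assumption on my part about what a calibration step might look like, not necessarily the method of the tutorial cited below, and it omits the target smoothing of Platt's original method. The function names, learning rate, and toy data are all illustrative.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit a, b in p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw classifier score to an estimated probability of class 1."""
    return sigmoid(a * score + b)


if __name__ == "__main__":
    # Toy held-out scores (e.g. SVM margins) with binary labels; not real data.
    scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    labels = [0, 0, 1, 0, 1, 1]
    a, b = fit_platt(scores, labels)
    print([round(calibrate(s, a, b), 3) for s in scores])
```

The map is monotone in the score whenever the fitted `a` is positive, so it changes the probabilities but not the ranking of observations.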
The author of the description above recommends the tutorial by Huang et al. (2020) and its associated GitHub repository.
More general probabilistic calibration here.
12 Hierarchical
TBD