Intuitively speaking, calibration means ensuring that if our prediction is 80% certain, we are wrong as close to 20% of the time as possible, and likewise for every other level of certainty.
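To make that intuition concrete, here is a minimal sketch of one common way to check it: bin the predictions by stated confidence and compare each bin's average confidence with how often the predictions in it were actually right. The function names `reliability_table` and `expected_calibration_error`, the equal-width binning, and the simulated data are all my assumptions for illustration; none of this is taken from the references below.

```python
import numpy as np


def reliability_table(confidence, correct, n_bins=10):
    """Group predictions by stated confidence and compare with observed accuracy.

    Returns one (mean confidence, observed accuracy, count) tuple per non-empty bin.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per prediction; clip so that confidence == 1.0 lands in the last bin.
    idx = np.clip(np.digitize(confidence, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append(
                (confidence[mask].mean(), correct[mask].mean(), int(mask.sum()))
            )
    return rows


def expected_calibration_error(confidence, correct, n_bins=10):
    """Count-weighted average gap between stated confidence and observed accuracy."""
    rows = reliability_table(confidence, correct, n_bins)
    total = sum(count for _, _, count in rows)
    return sum(count / total * abs(conf - acc) for conf, acc, count in rows)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stated = rng.uniform(0.5, 1.0, size=5000)
    # A calibrated forecaster is right with exactly the stated probability.
    calibrated = rng.uniform(size=stated.size) < stated
    # An overconfident forecaster is right less often than it claims.
    overconfident = rng.uniform(size=stated.size) < stated - 0.15
    print("calibrated   :", expected_calibration_error(stated, calibrated))
    print("overconfident:", expected_calibration_error(stated, overconfident))
```

A forecaster that says 80% and is right 8 times out of 10 scores zero here; one that says 80% and is right only half the time scores 0.3. The binned gap is only a diagnostic, and its value depends on the choice of bins.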
I do not know much about this, but a reasonable starting point is the compact literature review in Gneiting and Raftery (2007).
Gneiting, Tilmann, and Adrian E. Raftery. 2007. “Strictly Proper Scoring Rules, Prediction, and Estimation.” Journal of the American Statistical Association 102 (477): 359–78.
Henzi, Alexander, Xinwei Shen, Michael Law, and Peter Bühlmann. 2023. “Invariant Probabilistic Prediction.”
Pacchiardi, Lorenzo, and Ritabrata Dutta. 2022. “Generalized Bayesian Likelihood-Free Inference Using Scoring Rules Estimators.” arXiv:2104.03889 [Stat]
Székely, Gábor J., and Maria L. Rizzo. 2013. “Energy Statistics: A Class of Statistics Based on Distances.” Journal of Statistical Planning and Inference 143 (8): 1249–72.