Bayesian epistemics
Information elicitation, incentive mechanisms for truth, proper scoring rules…
2025-05-05 — 2026-04-24
Wherein proper scoring rules are set forth as the instrument by which truthful belief-reports are elicited from strategic agents, and their correspondence with machine-learning loss functions is duly observed.
An interesting inverse-design question: how should I design a system to optimize for truthfulness? A brief summary of truth-alignment problems.
Frongillo and Waggoner (2024):
This note provides a survey for the Economics and Computation community of some recent trends in the field of information elicitation. At its core, the field concerns the design of incentives for strategic agents to provide accurate and truthful information. Such incentives are formalized as proper scoring rules, and turn out to be the same object as loss functions in machine-learning settings, providing many connections. More broadly, the field concerns the design of mechanisms to obtain information from groups of agents and aggregate it or use it for decision making. Recently, work on information elicitation has expanded and been connected to online no-regret learning, mechanism design, fair division, and more.
One of many topics on which I am not an expert but find it useful to track. This one repays tracking more than most because the scientific literature in this area is opaque and the popular material is of highly variable quality; e.g., the rationalists like to talk off-handedly about prediction markets, but every time I try to chase that concept down I find myself deep in a lesswrong.org rabbit hole and become discouraged.
I am currently reading, and strongly recommend, Chapter 2 of Neyman (2024), which is a well-written introduction to many of the concepts in this area. In particular, it beautifully explains the connection between Bregman divergences, proper scoring rules, and incentive mechanisms for truthfulness. Potential applications to epistemic communities, prediction markets, etc. are immediate.
1 Communicative action and truth-elicitation
Quick notes about some cool tricks in alignment-upon-truth.
1.1 Setting
- Each agent \(i\) privately observes a signal \(s_i\) about a state \(\omega\).
- We can pay agents using a scoring/payment rule \(S_i(\cdot)\) and then choose a downstream action \(a=\delta(m)\) based on reported messages \(m=(m_1,\dots,m_n)\).
- Planner’s objective is \(W(a;\theta)\) (or a principal payoff \(\Pi(a;\theta)\)).
Sending a message \(m_i\) is an action—one that is informed by beliefs or observations.
1.2 Elicitation with verification: proper scoring rules
If a verifiable outcome \(y\) (or a gold label) eventually arrives, we can reward probabilistic reports \(p_i\) with a strictly proper scoring rule \(S(p_i,y)\) so that truthful beliefs uniquely maximize expected score (Gneiting and Raftery 2007). Formally, for any belief \(q_i\) about \(y\), \[ \mathbb{E}_{y\sim q_i}[S(q_i,y)] \;\ge\; \mathbb{E}_{y\sim q_i}[S(p_i,y)] \quad \forall p_i, \] with equality iff \(p_i=q_i\). This gives communication-incentive-compatible (CIC) incentives for beliefs. Classic examples include the log score and the Brier score (Gneiting and Raftery 2007).
That also connects us to calibration.
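As a sanity check on the strict-propriety inequality, here is a minimal sketch using the Brier score; the belief \(q=(0.7,0.3)\) and the grid of alternative reports are made-up illustration, not anything from the literature:

```python
# Brier score: S(p, y) = -sum_k (p_k - 1{y=k})^2  (positively oriented)
def brier(p, y):
    return -sum((p[k] - (1 if k == y else 0)) ** 2 for k in range(len(p)))

def expected_score(p, q):
    """Expected score of reporting p when the agent's true belief is q."""
    return sum(q[y] * brier(p, y) for y in range(len(q)))

q = [0.7, 0.3]  # the agent's true belief (made-up numbers)
truthful = expected_score(q, q)

# Sweep a grid of alternative reports: none beats reporting q itself.
best_alternative = max(
    expected_score([a / 100, 1 - a / 100], q) for a in range(101)
)
print(truthful >= best_alternative - 1e-12)  # → True
```

The same check with the log score in place of `brier` illustrates that strict propriety is a property of a family of rules, not of one formula.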
1.3 Elicitation without verification (IEWV)
When no ground truth ever arrives, we can still incentivize information using peer prediction and truth-serum mechanisms. I know nothing useful or practical about these yet; here are the dot points though:
- Peer Prediction (PP): pay agents using others’ reports so truthful reporting is a Bayes–Nash equilibrium (Miller, Resnick, and Zeckhauser 2005; Witkowski and Parkes 2012a).
- Bayesian Truth Serum (BTS) and variants: reward reports and meta-predictions about the distribution of others’ reports—truthful reporting is incentivized under mild common-prior/correlation conditions (Prelec 2004; Prelec, Seung, and McCoy 2017).
- Robust BTS / Minimal PP: relax prior/knowledge assumptions or handle non-binary or finite-sample regimes (Witkowski and Parkes 2012b; Radanovic and Faltings 2013; Dasgupta and Ghosh 2013).
These mechanisms create CIC about signals (truthful mapping \(s_i \mapsto m_i\)) even if \(y\) is never observed.
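To make the "pay with others' reports" idea concrete, here is a toy calculation in the spirit of the Dasgupta and Ghosh (2013) correlated-agreement idea; this is my own sketch, not any paper's exact mechanism, and all numbers (binary signals, 0.8 signal accuracy, uniform prior) are made up. The payment is agreement with a peer on the same task, minus the agreement rate across unrelated tasks:

```python
# Toy correlated-agreement calculation: binary state w in {0,1} (uniform prior),
# each agent's signal equals w with probability 0.8, independently given w.
ACC = 0.8
STATES = SIGNALS = (0, 1)

def p_sig(s, w):
    return ACC if s == w else 1 - ACC

def agree_same_task(r_i, r_j):
    """P(reports agree) when both signals come from the same state."""
    return sum(0.5 * p_sig(si, w) * p_sig(sj, w)
               for w in STATES for si in SIGNALS for sj in SIGNALS
               if r_i[si] == r_j[sj])

def agree_cross_task(r_i, r_j):
    """P(reports agree) across two independent tasks (marginal signals)."""
    marg = lambda s: sum(0.5 * p_sig(s, w) for w in STATES)
    return sum(marg(si) * marg(sj)
               for si in SIGNALS for sj in SIGNALS if r_i[si] == r_j[sj])

def payment(r_i, r_j):
    return agree_same_task(r_i, r_j) - agree_cross_task(r_i, r_j)

truth = {0: 0, 1: 1}    # report the signal
flip = {0: 1, 1: 0}     # report its negation
always0 = {0: 0, 1: 0}  # uninformative constant report

# Against a truthful peer, truthful reporting strictly beats deviations:
print(round(payment(truth, truth), 6))    # 0.18
print(round(payment(flip, truth), 6))     # -0.18
print(round(payment(always0, truth), 6))  # 0.0
```

The cross-task penalty is what kills the "everyone reports the same thing" equilibrium that plagues naive output agreement: constant strategies earn exactly zero here.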
1.4 From elicitation to decisions: informational alignment
We may ultimately care about the actions taken based on our beliefs. Let \(\delta(m)\) be the aggregation/decision map that turns messages into actions. Two complementary targets:
Belief alignment (posterior target). Aggregation yields, in expectation, the same posterior we would hold if we saw the true signal profile: \[\text{Elicited posterior } \hat\mu(\cdot\mid m) \approx \mu(\cdot\mid s).\] A natural discrepancy is a probability distance, e.g. \(\mathrm{KL}(\mu\,\|\,\hat\mu)\) or a Bregman divergence tied to the scoring rule (Gneiting and Raftery 2007).
Decision alignment (action target). The induced action maximizes the planner’s objective given the information: \[a^\star(s)\in\arg\max_a W(a;\theta) \quad\text{and}\quad \delta(m^\star(s))=a^\star(s),\] where \(m^\star(s)\) denotes the equilibrium communicative action under the elicitation payments. This is the informational analogue of our earlier action-IC → implementation → alignment pipeline.
Put differently: CIC gives us truthful information, implementation ensures the decision rule uses it, and alignment checks that the resulting action is planner-optimal.
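A minimal end-to-end sketch of that pipeline, with entirely made-up numbers: a binary state, one truthfully reported signal, a Bayesian posterior, and a planner whose welfare is 1 when the action matches the state. The decision rule acting on the elicited report reproduces \(a^\star(s)\), and a misreport breaks the alignment:

```python
# Toy end-to-end check: truthful report -> posterior -> decision.
ACC = 0.8                # P(signal = state); made-up
PRIOR = {0: 0.5, 1: 0.5}

def posterior(s):
    """Bayes posterior over the state given one signal s."""
    like = {w: (ACC if s == w else 1 - ACC) * PRIOR[w] for w in (0, 1)}
    z = sum(like.values())
    return {w: like[w] / z for w in (0, 1)}

def delta(m):
    """Decision rule: act on the posterior implied by the reported message m."""
    post = posterior(m)
    return max(post, key=post.get)

def a_star(s):
    """Planner's optimal action with direct access to the signal s.

    W(a; state) = 1{a = state}, so maximizing E[W] means picking the
    posterior mode.
    """
    post = posterior(s)
    return max(post, key=post.get)

# Truthful reporting m*(s) = s makes the pipeline decision-aligned:
print(all(delta(s) == a_star(s) for s in (0, 1)))      # → True
# A flipped report does not:
print(all(delta(1 - s) == a_star(s) for s in (0, 1)))  # → False
```

The example is deliberately trivial (with \(m^\star(s)=s\), alignment is immediate), but it shows where each piece of the pipeline lives: CIC fixes \(m^\star\), `delta` is the implementation, and the final comparison is the alignment check.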
1.5 Relaxations and metrics (brief)
- ε-CIC (elicitation slack): bound the maximum gain from misreporting under the scoring rule by \(\varepsilon\) (in utility units).
- Information regret: expected score gap between truthful and actual reports (equals a Bregman divergence under proper scoring) (Gneiting and Raftery 2007).
- Posterior divergence: distance between elicited and Bayesian posteriors (KL/TV/Wasserstein).
- End-to-end regret: welfare gap \(W(a^\star(s)) - W(\delta(m))\) after aggregation, exactly like our earlier action-level regret.
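For the log score, the information-regret identity can be checked directly: the expected score gap between the truthful report \(q\) and the actual report \(p\) equals \(\mathrm{KL}(q\,\|\,p)\), the Bregman divergence generated by negative entropy. The distributions below are made up for illustration:

```python
from math import log

def log_score(p, y):
    return log(p[y])

def information_regret(q, p):
    """E_{y~q}[S(q,y) - S(p,y)]: expected score lost by reporting p, not q."""
    return sum(q[y] * (log_score(q, y) - log_score(p, y)) for y in range(len(q)))

def kl(q, p):
    return sum(qy * log(qy / py) for qy, py in zip(q, p))

q = [0.6, 0.3, 0.1]  # truthful belief (made up)
p = [0.4, 0.4, 0.2]  # actual report (made up)

print(abs(information_regret(q, p) - kl(q, p)) < 1e-12)  # → True
print(information_regret(q, p) > 0)                      # → True
```

The analogous identity for the Brier score yields squared Euclidean distance, which is why the choice of scoring rule fixes the geometry in which misreports are penalized.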
1.6 Communication vs persuasion (design choices)
Not all communication seeks the truth. Bayesian persuasion designs signal structures to move the receiver’s action toward the sender’s objective (Kamenica and Gentzkow 2011). Cheap talk shows when informative communication is impossible or coarse (Crawford and Sobel 1982). The elicitation view flips that: we design payments so truthful beliefs and observations become the sender’s best action, and then we align the downstream decision.
