Bayesian epistemics
Information elicitation, incentive mechanisms for truth, proper scoring rules…
2025-05-05 — 2025-08-20
An interesting inverse-design question: how should I design a system to optimise for truthfulness? What follows is a brief summary of truth-alignment problems.
@Frongillo2024Recent:
This note provides a survey for the Economics and Computation community of some recent trends in the field of information elicitation. At its core, the field concerns the design of incentives for strategic agents to provide accurate and truthful information. Such incentives are formalized as proper scoring rules, and turn out to be the same object as loss functions in machine-learning settings, providing many connections. More broadly, the field concerns the design of mechanisms to obtain information from groups of agents and aggregate it or use it for decision making. Recently, work on information elicitation has expanded and been connected to online no-regret learning, mechanism design, fair division, and more.
One of many topics on which I am not an expert but find it useful to track. This one is more useful than many to track because the scientific literature in this area is opaque and the popular writing is of highly variable quality: the rationalists like to talk offhandedly about prediction markets, but every time I try to chase that concept down I end up deep in a lesswrong.org rabbit hole and come away discouraged.
I am currently reading, and strongly recommend, Chapter 2 of @Neyman2024Algorithmic, which is a well-written introduction to many of the concepts in this area. In particular, it explains beautifully the connection between Bregman divergences, proper scoring rules, and incentive mechanisms for truthfulness. Potential applications to epistemic communities, prediction markets, etc., are immediate.
1 Communicative action and truth-elicitation
Quick notes on some neat tricks for aligning incentives with truth-telling.
1.1 Setting
- Each agent \(i\) privately observes a signal \(s_i\) about a state \(\omega\).
- We can pay agents using a scoring/payment rule \(S_i(\cdot)\) and then choose a downstream action \(a=\delta(m)\) based on reported messages \(m=(m_1,\dots,m_n)\).
- The planner’s objective is \(W(a;\omega)\) (or a principal payoff \(\Pi(a;\omega)\)).
Sending a message \(m_i\) is an action—one that is informed by beliefs or observations.
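To make the notation concrete, here is a minimal sketch of the setting in Python, assuming a binary state and conditionally i.i.d. binary signals. The prior, signal probabilities, majority-vote decision map, and matching-reward welfare are all hypothetical toy choices, not part of any cited mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting: binary state omega, n agents with conditionally
# i.i.d. binary signals s_i, messages m_i, and a downstream action a = delta(m).
P_OMEGA = np.array([0.5, 0.5])          # prior over omega in {0, 1}
P_SIGNAL = np.array([[0.8, 0.2],        # P(s_i | omega = 0)
                     [0.3, 0.7]])       # P(s_i | omega = 1)

def draw_world(n_agents):
    """Sample a state and the agents' private signals."""
    omega = rng.choice(2, p=P_OMEGA)
    signals = rng.choice(2, size=n_agents, p=P_SIGNAL[omega])
    return omega, signals

def delta(messages):
    """Toy decision map: majority vote over reported signals."""
    return int(np.mean(messages) > 0.5)

def welfare(action, omega):
    """Toy planner objective W(a; omega): reward 1 for matching the state."""
    return 1.0 if action == omega else 0.0

omega, signals = draw_world(n_agents=5)
action = delta(signals)                  # truthful messages, m = s
print(omega, signals, action, welfare(action, omega))
```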
1.2 Elicitation with verification: proper scoring rules
If a verifiable outcome \(y\) (or gold label) eventually arrives, we can reward probabilistic reports \(p_i\) with a strictly proper scoring rule \(S(p_i,y)\), so that truthful beliefs uniquely maximize expected score [@Gneiting2007Strictly]. Formally, for any belief \(q_i\) about \(y\), \[ \mathbb{E}_{y\sim q_i}[S(q_i,y)] \;\ge\; \mathbb{E}_{y\sim q_i}[S(p_i,y)] \quad \forall p_i, \] with equality iff \(p_i=q_i\). This gives communication incentive compatibility (CIC) for beliefs. Classic examples include the log score and the Brier score [@Gneiting2007Strictly].
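A minimal numerical check of strict propriety for a binary outcome, using the log and Brier scores. The belief \(q = 0.3\) and the grid of candidate reports are arbitrary toy values.

```python
import numpy as np

def log_score(p, y):
    """Log score for a binary outcome: S(p, y) = log p(y)."""
    return np.log(p if y == 1 else 1.0 - p)

def brier_score(p, y):
    """Brier score, written negatively oriented so that larger is better."""
    return -(p - y) ** 2

def expected_score(score, p, q):
    """E_{y ~ Bernoulli(q)}[score(p, y)] for a reported probability p."""
    return q * score(p, 1) + (1.0 - q) * score(p, 0)

q = 0.3                                   # the agent's true belief that y = 1
reports = np.linspace(0.01, 0.99, 99)     # candidate (possibly untruthful) reports
for score in (log_score, brier_score):
    expected = [expected_score(score, p, q) for p in reports]
    best = reports[int(np.argmax(expected))]
    print(score.__name__, "is maximized at p =", round(best, 2))  # ~ 0.3 for both
```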
This also connects to calibration.
1.3 Elicitation without verification (IEWV)
When no ground truth ever arrives, we can still incentivize information via peer prediction and truth-serum mechanisms:
- Peer Prediction (PP): pay agents using others’ reports so that truthful reporting is a Bayes-Nash equilibrium [@Miller2005PeerPrediction; @Witkowski2012EC]; a minimal sketch follows this list.
- Bayesian Truth Serum (BTS) and variants: reward reports and meta-predictions about the distribution of others’ reports; truthful reporting is incentivized under mild common-prior/correlation conditions [@Prelec2004BTS; @Prelec2017Solution].
- Robust BTS / Minimal PP: relax prior/knowledge assumptions or handle non-binary/finite-sample regimes [@Witkowski2012RBTS; @Radanovic2013RBTS; @Dasgupta2013Endogenous].
These mechanisms create CIC about signals (truthful mapping \(s_i \mapsto m_i\)) even when \(y\) is never observed.
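As a concrete illustration of the peer-prediction bullet above, here is a minimal sketch of the payment idea in @Miller2005PeerPrediction for binary signals: agent \(i\)'s report is mapped through a known common prior to a posterior over a peer's signal, and that posterior is scored against the peer's actual report with a proper scoring rule. The prior, signal model, and the assumption that the peer reports truthfully are all toy choices for illustration.

```python
import numpy as np

# Hypothetical common prior and binary signal model, shared by all agents.
P_OMEGA = np.array([0.5, 0.5])            # P(omega)
P_SIGNAL = np.array([[0.8, 0.2],          # P(s | omega = 0)
                     [0.3, 0.7]])         # P(s | omega = 1)

def posterior_peer_signal(report):
    """P(s_j = 1 | s_i = report) under the common prior, by Bayes' rule."""
    joint = P_OMEGA * P_SIGNAL[:, report]            # P(omega, s_i = report)
    p_omega_given_si = joint / joint.sum()           # P(omega | s_i = report)
    return float(p_omega_given_si @ P_SIGNAL[:, 1])  # marginalize over omega

def log_score(p, y):
    return np.log(p if y == 1 else 1.0 - p)

def expected_payment(my_signal, my_report):
    """Expected payment to agent i, assuming the peer j reports truthfully."""
    p_peer_1 = posterior_peer_signal(my_signal)    # i's true belief about s_j
    q = posterior_peer_signal(my_report)           # what the mechanism scores
    return p_peer_1 * log_score(q, 1) + (1 - p_peer_1) * log_score(q, 0)

for s in (0, 1):
    truthful = expected_payment(s, s)
    misreport = expected_payment(s, 1 - s)
    print(f"signal {s}: truthful {truthful:.3f} >= misreport {misreport:.3f}")
```

Under these toy numbers, truthful reporting beats misreporting for either signal, which is the Bayes-Nash equilibrium property the bullet describes.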
1.4 From elicitation to decisions: informational alignment
We may ultimately care about actions taken based on our beliefs. Let \(\delta(m)\) be the aggregation/decision map that turns messages into an action. Two complementary targets:
Belief alignment (posterior target). Aggregation yields (in expectation) the same posterior we would hold if we saw the true signal profile: \[\text{Elicited posterior } \hat\mu(\cdot\mid m) \approx \mu(\cdot\mid s).\] A natural discrepancy is a probability distance, e.g. \(\mathrm{KL}(\mu\,\|\,\hat\mu)\) or a Bregman divergence tied to the scoring rule [@Gneiting2007Strictly].
Decision alignment (action target). The induced action maximizes the planner’s objective given the available information: \[a^\star(s)\in\arg\max_a \mathbb{E}\!\left[W(a;\omega)\mid s\right] \quad\text{and}\quad \delta(m^\star(s))=a^\star(s),\] where \(m^\star(s)\) is the equilibrium communicative action under the elicitation payments. This is the informational analogue of our earlier action-IC → implementation → alignment pipeline.
Put differently: CIC gives us truthful information, implementation ensures the decision rule uses it, and alignment checks that the resulting action is planner-optimal.
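A toy end-to-end check of both targets, reusing the binary setting sketched earlier. The true signal profile, the misreported message vector, and the matching-reward welfare are all hypothetical; the sketch just compares the elicited posterior against the full-information Bayes posterior, and the induced action against the planner-optimal one.

```python
import numpy as np

P_OMEGA = np.array([0.5, 0.5])           # prior over omega in {0, 1}
P_SIGNAL = np.array([[0.8, 0.2],         # P(s_i | omega = 0)
                     [0.3, 0.7]])        # P(s_i | omega = 1)

def bayes_posterior(reports):
    """mu(omega | s): exact posterior given a full profile of binary signals."""
    likelihood = np.prod(P_SIGNAL[:, reports], axis=1)   # P(s | omega)
    post = P_OMEGA * likelihood
    return post / post.sum()

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def expected_welfare(action, posterior):
    """E[W(a; omega) | posterior] with reward 1 for matching the state."""
    return float(posterior[action])

signals = np.array([1, 1, 0, 1, 0])       # hypothetical true signal profile
messages = np.array([1, 1, 0, 0, 0])      # e.g. one agent misreports

mu = bayes_posterior(signals)             # target posterior mu(. | s)
mu_hat = bayes_posterior(messages)        # elicited posterior mu_hat(. | m)
print("posterior divergence KL(mu || mu_hat):", round(kl(mu, mu_hat), 4))

a_star = int(np.argmax([expected_welfare(a, mu) for a in (0, 1)]))
a_hat = int(np.argmax([expected_welfare(a, mu_hat) for a in (0, 1)]))
print("end-to-end regret:",
      round(expected_welfare(a_star, mu) - expected_welfare(a_hat, mu), 4))
```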
1.5 Relaxations and metrics (brief)
- ε-CIC (elicitation slack): bound the max gain from misreporting under the scoring rule by \(\varepsilon\) (utility units).
- Information regret: expected score gap between truthful and actual reports; under a proper scoring rule this equals a Bregman divergence [@Gneiting2007Strictly] (see the sketch after this list).
- Posterior divergence: distance between elicited and Bayes posteriors (KL/TV/Wasserstein).
- End-to-end regret: welfare gap \(\mathbb{E}\!\left[W(a^\star(s);\omega) - W(\delta(m);\omega)\mid s\right]\) after aggregation, exactly like our earlier action-level regret.
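For the log score, the information-regret bullet has a closed form: the expected score gap between a truthful report \(q\) and an actual report \(p\) is exactly \(\mathrm{KL}(q\,\|\,p)\), the Bregman divergence generated by negative entropy. A quick numerical check with toy probabilities:

```python
import numpy as np

def expected_log_score(p, q):
    """E_{y ~ q}[log p(y)] for distributions over a finite outcome space."""
    return float(np.sum(q * np.log(p)))

def kl(q, p):
    return float(np.sum(q * np.log(q / p)))

q = np.array([0.2, 0.5, 0.3])    # truthful belief (toy values)
p = np.array([0.4, 0.4, 0.2])    # actual (misreported) probabilities

information_regret = expected_log_score(q, q) - expected_log_score(p, q)
print(round(information_regret, 6), "==", round(kl(q, p), 6))  # identical
```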
1.6 Communication vs persuasion (design choices)
Not all communication seeks truth. Bayesian persuasion designs signal structures to move the receiver’s action toward the sender’s objective [@Kamenica2011BP]. Cheap-talk models show when informative communication is impossible or can only be coarse [@Crawford1982Strategic]. The elicitation view is the opposite: we design payments so that reporting truthful beliefs and observations becomes the sender’s best action, and then we align the downstream decision.