Algorithmic collective action

People, algorithms, coalitions, and the strategic dynamics between them

2026-06-15 — 2026-06-15

Wherein User-Collective Leverage Over Platform Retraining Is Set Against the Performative Power of Monopolists, and Stable Agent Coalitions Are Derived via Nucleolus-Based Reinforcement Learning.

agents
AI safety
bounded compute
collective knowledge
computers are awful together
distributed
economics
edge computing
extended self
game theory
incentive mechanisms
machine learning
networks
Figure 1

Coalition game and collective action problems, played between people and algorithms.

1 Collective action against a model

A collective of users pools their data and coordinates how each member modifies it, to push a platform’s deployed learning algorithm where the collective wants it (Hardt et al. 2024). Even a tiny fractional collective can exert outsized control over what the model learns, because the firm keeps retraining on data the collective has shaped.
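
A toy sketch of why a small collective can matter, loosely in the spirit of planting a signal in the training data. Everything here is a hypothetical illustration rather than the construction in Hardt et al. (2024): the platform is assumed to retrain a simple majority-vote classifier per feature value, and the names `fit_majority_classifier` and `plant_signal` are invented for this example.

```python
from collections import Counter, defaultdict

def fit_majority_classifier(data):
    """Platform side: map each feature value to its most common label."""
    counts = defaultdict(Counter)
    for x, y in data:
        counts[x][y] += 1
    return {x: c.most_common(1)[0][0] for x, c in counts.items()}

def plant_signal(data, n_collective, signal_x, target_y):
    """Collective side: the first n_collective users replace their record
    with the agreed (signal feature, target label) pair."""
    return [(signal_x, target_y)] * n_collective + data[n_collective:]

# Hypothetical population of 1000 users; the "rare" feature value occurs 10 times.
base = [("common", 0)] * 600 + [("common", 1)] * 390 + [("rare", 0)] * 10

before = fit_majority_classifier(base)
after_2pct = fit_majority_classifier(plant_signal(base, 20, "rare", 1))
after_half_pct = fit_majority_classifier(plant_signal(base, 5, "rare", 1))

print(before["rare"], after_2pct["rare"], after_half_pct["rare"])
```

In this toy setup a 2% collective flips the prediction on the rare signal while leaving the prediction on the common feature value untouched, and a 0.5% collective is too small to outvote the 10 existing records. The threshold depends on how rare the chosen signal already is in the base data.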

The reciprocal quantity is performative power (Hardt, Jagadeesan, and Mendler-Dünner 2022): how far the platform can steer the population back, rather than settling for the best fit to whatever data it currently sees. A monopolist has the most of it; competition and outside options erode it. The collective’s leverage and the platform’s performative power jointly determine who steers whom. For more on this see strategic classification and performative prediction.

2 Coalitional MARL

A coalition game asks who teams up with whom and how the joint surplus is split. If we could teach algorithms to do collective formation with one another, that would be interesting, don’t you think?

The classical answer to “what is the worth of this coalition?” comes from Shapley fairness, which is defined in terms of the characteristic function \(v(S)\): the worth of each coalition \(S\), evaluated in isolation. Beyond about 20 agents that is hopeless: there are \(2^n\) coalitions to evaluate, each of which might be hard in itself (e.g. each \(v(S)\) is its own counterfactual estimation problem). Scaling to a thousand agents would mean evaluating roughly \(10^{301}\) such problems. Can we learn approximations to optimal coalitions using the RL formalism?
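
To make the blow-up concrete, here is a minimal sketch of exact Shapley values computed by brute-force enumeration of coalitions. The three-player "glove game" used as \(v\) is a standard textbook example chosen for illustration, not something from the papers cited here, and the function names are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values; calls v on every coalition, so cost grows as 2^n."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Probability that exactly k of the others precede i in a random ordering,
            # spread evenly over the size-k subsets.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                S = frozenset(S)
                total += weight * (v(S | {i}) - v(S))  # marginal contribution of i
        phi[i] = total
    return phi

def glove_game(S):
    """Player a holds a left glove, b and c hold right gloves; a matched pair is worth 1."""
    return 1.0 if "a" in S and ("b" in S or "c" in S) else 0.0

phi = shapley_values(["a", "b", "c"], glove_game)
print(phi)

# Number of decimal digits in the coalition count for 1000 agents.
print(len(str(2 ** 1000)))
```

The scarce player `a` ends up with two thirds of the surplus and `b` and `c` with one sixth each. The last line confirms that \(2^{1000}\) has 302 digits, i.e. it is on the order of \(10^{301}\).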

2.1 Decentralised partner selection and gain-sharing

Multi-Agent Gain Sharing (MAGS) pushes the feasible agent count up by learning an approximation to the characteristic function. It trains (transformer-based) agents to negotiate fair gain shares directly in mixed-motive logistics environments, notionally scaling to >1000 concurrent agents (Mak et al. 2023).

(Chalkiadakis2003Coordination?) treats unknown opponent capabilities as a POMDP and has agents do Bayesian belief updates over opponent types during coalition formation, which lets them anticipate (and sometimes exploit) irrational or static partners.

Both approaches suffer from a failure mode called transductivity: agents learn to coordinate with the opponents they trained against, not with arbitrary new ones (which would be inductivity).

2.2 Credit assignment via the nucleolus

A naïve way of doing MARL is to force a grand coalition, in which all agents share one global reward, and rely on credit-assignment heuristics like COMA (Foerster2018Counterfactual?) to back out individual contributions. In asymmetric environments (e.g. few cooperators vs. many adversaries in SMAC) the grand coalition might be suboptimal. Li et al. (2025) import the concept of the nucleolus from cooperative game theory into Q-learning and learn an approximation to it. For an allocation \(x\), a coalition \(S\)’s excess is \(e(S, x) = v(S) - \sum_{i \in S} x_i\): the gap between what \(S\) could earn alone and what its members get under \(x\). Positive excess means there are grounds to defect. The nucleolus is the allocation that makes the most-aggrieved coalition as un-aggrieved as possible, then the next most-aggrieved, and so on (a lexicographic minimization of the sorted excesses). A nucleolus-Q operator with convergence guarantees lets agents autonomously fracture the grand coalition into smaller, task-specific sub-teams, each stable in the sense that no member wants to defect.
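
The excess and the "most-aggrieved first" ordering can be sketched in a few lines. This is only the classical definition applied to a small hand-made game, not the nucleolus-Q operator of Li et al. (2025). The game `toy_game` and the function names are hypothetical, and the code compares two candidate allocations rather than solving for the nucleolus.

```python
from itertools import combinations

def excesses(players, v, x):
    """e(S, x) = v(S) - sum_{i in S} x_i for every proper, non-empty coalition S."""
    out = {}
    for k in range(1, len(players)):
        for S in combinations(players, k):
            out[S] = v(frozenset(S)) - sum(x[i] for i in S)
    return out

def complaint_vector(players, v, x):
    """Excesses sorted from most to least aggrieved coalition."""
    return sorted(excesses(players, v, x).values(), reverse=True)

# Hypothetical 3-player game: singletons earn 0, pairs earn 4, 3 or 2, everyone together earns 6.
PAIR_WORTH = {frozenset("ab"): 4, frozenset("ac"): 3, frozenset("bc"): 2}

def toy_game(S):
    if len(S) == 3:
        return 6
    return PAIR_WORTH.get(S, 0)

players = ["a", "b", "c"]
equal_split = {"a": 2, "b": 2, "c": 2}
skewed_split = {"a": 3, "b": 2, "c": 1}

c_equal = complaint_vector(players, toy_game, equal_split)
c_skewed = complaint_vector(players, toy_game, skewed_split)
print(c_equal)
print(c_skewed)
# Python compares lists lexicographically, which is the ordering the nucleolus minimizes.
print(c_skewed < c_equal)
```

Under the equal split the pair `{a, b}` has excess 0, so it is indifferent to defecting, whereas under the skewed split every coalition has excess of at most -1. The skewed split therefore has the lexicographically smaller complaint vector and sits closer to the nucleolus in this toy game.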

2.3 Model-based MARL for stable partitions

Traditionally, MARL is model-free. Model-based MARL, where agents learn (or are given) a predictive model of environment dynamics, seems like it might buy us something the model-free version can’t: theoretical stability guarantees.

This does not seem to be commonly used, but there is a specific example in radar-network multitarget tracking (Xiong et al. 2023). Several geographically separated radar stations, each with limited beam-time, jointly track many more moving targets than any one station can cover. Two or three radars pointing at the same target triangulate to a lower-variance estimate than any one alone (i.e. cooperation has benefits). A coalition is the subset of radars agreeing to illuminate one target; there is an implicit global partition that assigns radars to targets. We are in a transferable-utility setting, which means the joint tracking-quality improvement is a scalar reward that may be allocated among coalition members. No central controller dictates assignments; each radar runs a model-based RL policy that learns the value of joining one coalition versus another, and the authors prove convergence to a Nash-stable partition at which no single radar wants to defect.

The model-based part seems important. Nash-stability is a statement about counterfactual defections: for every alternative coalition each radar could switch to, its simulated payoff must be no better than staying put. Model-free MARL only has reliable value estimates along trajectories it actually sampled, so the off-policy counterfactuals are guesses, and the stability claim looks much harder. A learned dynamics model lets each radar roll out counterfactual predictions along the lines of “what if I retargeted at \(k'\)”, without physically doing so, and check the deviation locally. The pattern (model-constrained dynamics + transferable-utility coalitions + model-based RL → stability proof) probably generalizes beyond radar.
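
The local deviation check can be sketched as follows. This is a hypothetical illustration, not the algorithm of Xiong et al. (2023). The function `payoff` stands in for whatever a learned dynamics model would predict, and the diminishing-returns formula, the equal split among coalition members, and the target values are all assumptions made up for the example.

```python
def is_nash_stable(assignment, targets, payoff):
    """True if no single agent gains by unilaterally switching to another target."""
    for agent, current in assignment.items():
        stay = payoff(agent, assignment)
        for k in targets:
            if k == current:
                continue
            deviated = {**assignment, agent: k}  # counterfactual: only this agent moves
            if payoff(agent, deviated) > stay:
                return False
    return True

# Hypothetical tracking value of each target.
TARGET_VALUE = {"t1": 10.0, "t2": 6.0}

def payoff(agent, assignment):
    """Assumed model: m radars on a target capture value * (1 - 0.5**m), split equally."""
    k = assignment[agent]
    m = sum(1 for t in assignment.values() if t == k)
    return TARGET_VALUE[k] * (1 - 0.5 ** m) / m

split = {"r1": "t1", "r2": "t1", "r3": "t2"}
pile_on = {"r1": "t1", "r2": "t1", "r3": "t1"}

print(is_nash_stable(split, TARGET_VALUE, payoff))
print(is_nash_stable(pile_on, TARGET_VALUE, payoff))
```

Under these assumed payoffs the split assignment is stable, while piling all three radars onto `t1` is not, because any one of them would do slightly better alone on `t2`. Each call to `payoff` on a `deviated` assignment is exactly the kind of counterfactual query that a model-free learner would have to guess at.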

2.4 Noisy observation of partner intent

In a mixed-motive game where every agent is purely selfish, MARL notoriously finds a mutually-defecting equilibrium and stays there. One fix is to hard-code prosocial weights — give each agent a reward of the form \(r_i + \alpha \sum_{j \neq i} r_j\), with \(\alpha\) tuned so cooperation pays. That works, apparently, but is not satisfying. It bakes the cooperation in rather than letting it be discovered; it tends to be fragile to defectors, doesn’t generalize off-distribution, and gives no principled story for who should cooperate with whom.

The Randomized Uncertain Social Preferences (RUSP) framework (Baker2020Emergent?) solves this by training agents across a distribution of prosocial weights, where each agent only gets a noisy observation of its own weights and no information about others’. At each episode start, we sample a reward-transformation matrix \(W\), where \(W_{ij}\) is the weight agent \(i\) puts on agent \(j\)’s reward, then have agent \(i\)’s shaped reward be \(\sum_j W_{ij} r_j\). Each agent observes only a noisy version of its own row (so it isn’t sure how much it should care about each partner), and nothing about other agents’ rows (so it can’t tell which partners care about it). Training is across the whole distribution of \(W\)s, not a single fixed one.
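
A minimal sketch of the per-episode reward transformation described above, using NumPy. The uniform distribution over \(W\), the unit diagonal, the Gaussian observation noise and the function name `sample_rusp_episode` are placeholder assumptions for illustration; the sampling scheme actually used in the RUSP paper may differ.

```python
import numpy as np

def sample_rusp_episode(rewards, rng, noise_scale=0.1):
    """Sample a weight matrix W, shape the rewards, and build noisy own-row observations."""
    n = len(rewards)
    # Assumed placeholder distribution: W[i, j] is the weight agent i puts on agent j's reward.
    W = rng.uniform(0.0, 1.0, size=(n, n))
    np.fill_diagonal(W, 1.0)  # assumption: each agent fully values its own reward
    shaped = W @ rewards      # shaped[i] = sum_j W[i, j] * rewards[j]
    # Row i is what agent i observes: its own weights plus noise, nothing about other rows.
    own_row_obs = W + rng.normal(0.0, noise_scale, size=(n, n))
    return W, shaped, own_row_obs

rng = np.random.default_rng(0)
raw_rewards = np.array([1.0, 0.0, -1.0])
W, shaped, own_row_obs = sample_rusp_episode(raw_rewards, rng)

for i in range(len(raw_rewards)):
    print(f"agent {i}: shaped reward {shaped[i]:+.3f}, observes {np.round(own_row_obs[i], 2)}")
```

During training, agent \(i\) would receive `shaped[i]` as its reward and `own_row_obs[i]` as part of its observation, with a fresh \(W\) drawn at the start of every episode.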

Because partner intent isn’t observable, the only way an agent can do well in expectation is to read intent from behaviour — cooperate cautiously, escalate cooperation when reciprocated, punish when defected against. The reported emergent phenomena are: direct reciprocity (tit-for-tat-ish responses to recent behaviour), indirect reputation tracking (treating an agent based on how it treated third parties), and stable in-episode team formation (subsets of agents settling into cooperative pairs while the rest go it alone). None of these are programmed in; they are policies that generalize across the \(W\)-distribution. Unlike fixed-prosocial-weights training, the policies reportedly transfer to held-out partner mixes without retraining.

3 Open-source game theory

MARL is one possible operationalization of open-source game theory, in which agents exchange policy source code (or some interpretable proxy) before acting and cooperate by mutual verification.

4 References

Chalkiadakis. 2007. “A Bayesian Approach to Multiagent Reinforcement Learning and Coalition Formation Under Uncertainty.”
Dong, Roth, Schutzman, et al. 2018. “Strategic Classification from Revealed Preferences.” In Proceedings of the 2018 ACM Conference on Economics and Computation.
Hardt, Jagadeesan, and Mendler-Dünner. 2022. “Performative Power.”
Hardt, Mazumdar, Mendler-Dünner, et al. 2024. “Algorithmic Collective Action in Machine Learning.”
Hardt, Megiddo, Papadimitriou, et al. 2016. “Strategic Classification.” In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science.
Levanon and Rosenfeld. 2021. “Strategic Classification Made Practical.” In Proceedings of the 38th International Conference on Machine Learning.
Li, Cao, Qiao, et al. 2025. “Nucleolus Credit Assignment for Effective Coalitions in Multi-Agent Reinforcement Learning.” In.
Mak, Xu, Pearce, et al. 2023. “Fair Collaborative Vehicle Routing: A Deep Multi-Agent Reinforcement Learning Approach.”
Miller, Milli, and Hardt. 2019. “Strategic Classification Is Causal Modeling in Disguise.” In International Conference on Machine Learning.
Pant and Yu. 2026. “Coopetition-Gym V1: A Formally Grounded Platform for Mixed-Motive Multi-Agent Reinforcement Learning Under Strategic Coopetition.”
Sharma, Fernandez, Zaroukian, et al. 2021. “Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing Centralized Training.” In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III.
Weis, Wołczyk, Nasser, et al. 2026. “Multi-Agent Cooperation Through in-Context Co-Player Inference.”
Wolpert and Lawson. 2002. “Designing Agent Collectives for Systems with Markovian Dynamics.” In.
Xiong, Zhang, Cui, et al. 2023. “Coalition Game of Radar Network for Multitarget Tracking via Model-Based Multiagent Reinforcement Learning.” IEEE Transactions on Aerospace and Electronic Systems.