Design of multi-agent systems

Distributed sensing, swarm sensing, adaptive social learning, multi-agent adaptation, iterated game theory with learning, etc.

2014-10-13 — 2025-05-05

Wherein the design of multi-agent systems is considered, with emphasis on the crafting of private utility functions to induce cooperative coalitions, and constraints from local information are examined.

agents
AI safety
bounded compute
collective knowledge
computers are awful together
distributed
economics
edge computing
extended self
game theory
incentive mechanisms
machine learning
networks
Figure 1

This is a hub page for work on designing agents to get things done collectively. Several specialisations have their own pages; the connective tissue and cross-cutting concerns live here.

1 Cooperation and opponent shaping

How do learning agents figure each other out and (sometimes) learn to cooperate? See opponent shaping for the formalism of agents that model and influence each other’s learning, and learning with theory of mind for the broader framing. For the commitment and contracting angle, see commitment, contracts, cooperation.

2 Collective utility and mechanism design

If you have autonomous agents that need to cooperate, how do you design their private utility? This looks a bit like collective decisions, but I am thinking of incentive design where we get to choose the utility function, not just the mechanism — an inverse collective action problem. See also the suggestive but indirect mapping between utility and fitness.
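A minimal sketch of one classic answer from the collective-intelligence literature of Wolpert and Tumer, the difference (or "wonderful life") utility: pay each agent the global utility minus what the global utility would have been without that agent. Everything specific below is an assumption for illustration. The congestion-style welfare function, the `capacity` parameter, and all function names are hypothetical.

```python
import math
from collections import Counter


def global_utility(actions, capacity=2.0):
    """Toy welfare G: a resource used by n agents yields n * exp(-n / capacity)."""
    counts = Counter(actions)
    return sum(n * math.exp(-n / capacity) for n in counts.values())


def selfish_utility(actions, i, capacity=2.0):
    """Agent i's own share of the resource it chose (ignores everyone else)."""
    n = Counter(actions)[actions[i]]
    return math.exp(-n / capacity)


def difference_utility(actions, i, capacity=2.0):
    """G with all agents present, minus G with agent i removed."""
    without_i = actions[:i] + actions[i + 1:]
    return global_utility(actions, capacity) - global_utility(without_i, capacity)


if __name__ == "__main__":
    # Seven agents crowd resource 0, four use resource 1.
    before = [0] * 7 + [1] * 4
    # Agent 0 considers a unilateral move to resource 1.
    after = [1] + before[1:]

    d_global = global_utility(after) - global_utility(before)
    d_selfish = selfish_utility(after, 0) - selfish_utility(before, 0)
    d_diff = difference_utility(after, 0) - difference_utility(before, 0)

    print(f"change in G:                  {d_global:+.4f}")
    print(f"change in selfish utility:    {d_selfish:+.4f}")
    print(f"change in difference utility: {d_diff:+.4f}")
```

Because the "agent removed" term does not depend on agent 0's own action, a unilateral move changes the difference utility by exactly the same amount as it changes G. In this toy case the selfish share rewards a move that lowers G, while the difference utility does not.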

3 Local versus global design

What global organising can be done using only local information? There are formal approaches to this, e.g. H. Wang and Rubenstein (2020), and more playful ones like Mordvintsev et al. (2020), who use cellular automata as a case study. See also differentiable collective automata for the gradient-based version.
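One of the simplest formal instances is linear average consensus, in the spirit of DeGroot (1974) and Olfati-Saber, Fax, and Murray (2007) from the references: each agent only ever sees its neighbours' values, yet the network as a whole converges to the global mean. This is a sketch under stated assumptions, not the method of Wang and Rubenstein or of Mordvintsev et al. The ring topology, the step size, and the function names are illustrative choices.

```python
def ring(n):
    """Hypothetical topology: agent i talks only to its two ring neighbours."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def consensus_step(values, neighbours, step=0.3):
    """One synchronous round: each agent nudges its value toward its neighbours'."""
    return [
        x + step * sum(values[j] - x for j in neighbours[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbours, rounds=200, step=0.3):
    """Iterate the local update; no agent ever reads a non-neighbour's value."""
    for _ in range(rounds):
        values = consensus_step(values, neighbours, step)
    return values


if __name__ == "__main__":
    readings = [0.0, 6.0, 12.0, 18.0, 24.0, 30.0]  # e.g. local sensor readings
    final = run_consensus(readings, ring(len(readings)))
    print("global mean:", sum(readings) / len(readings))
    print("agent values after consensus:", [round(v, 6) for v in final])
```

On a connected undirected graph the update preserves the sum of the values, and a step size below one over the maximum degree (here, below 0.5) is sufficient for every agent to converge to the mean.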

4 Value learning and assistance games

If you’re watching an agent and want to infer its reward function, see value/reward learning. The cooperative variant — where one agent actively helps another whose goals are unknown — is the assistance games formalism.

5 Multi-agent causal models

Extending causal DAGs to include agents and decisions: see multi-agent causality.

6 Nature-inspired collectives

Biomimetic approaches: ant colonies, particle swarms, probability collectives, and other nature-inspired algorithms. Agents which can form coalitions might need distributed consistency.

7 Human systems as multi-agent systems

An important specialisation: groups of humans, where the agents are a particular type of great plains ape. See wisdom of crowds versus groupthink, or weaponised social media.

8 Tooling

For simulation environments and frameworks, see MARL tooling.

9 Incoming

10 References

Acemoglu, and Ozdaglar. 2011. “Opinion Dynamics and Learning in Social Networks.” Dynamic Games and Applications.
Akbarpour, and Jackson. 2018. “Diffusion in Networks and the Virtue of Burstiness.” Proceedings of the National Academy of Sciences.
Akyildiz, Su, Sankarasubramaniam, et al. 2002. “A Survey on Sensor Networks.” IEEE Communications Magazine.
Albrecht, Christianos, and Schäfer. 2024. Multi-Agent Reinforcement Learning: Foundations and Modern Approaches.
Amato. 2024. “An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning.”
Barfoot. 2020. “Fundamental Linear Algebra Problem of Gaussian Inference.”
Bianchi, and Jakubowicz. 2013. “Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization.” IEEE Transactions on Automatic Control.
Bieniawski, and Wolpert. 2004. “Adaptive, Distributed Control of Constrained Multi-Agent Systems.” In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3.
Bishop, and Doucet. 2014. “Distributed Nonlinear Consensus in the Space of Probability Measures.” IFAC Proceedings Volumes, 19th IFAC World Congress.
Brunet, and Doolittle. 2015. “Multilevel Selection Theory and the Evolutionary Functions of Transposable Elements.” Genome Biology and Evolution.
Cao, Lazaridou, Lanctot, et al. 2018. “Emergent Communication Through Negotiation.”
Cattivelli, Lopes, and Sayed. 2008. “Diffusion Recursive Least-Squares for Distributed Estimation over Adaptive Networks.” IEEE Transactions on Signal Processing.
Cattivelli, and Sayed. 2009. “Diffusion LMS Strategies for Distributed Estimation.” IEEE Transactions on Signal Processing.
———. 2010. “Diffusion Strategies for Distributed Kalman Filtering and Smoothing.” IEEE Transactions on Automatic Control.
Chen, and Sayed. 2012. “Diffusion Adaptation Strategies for Distributed Optimization and Learning over Networks.” IEEE Transactions on Signal Processing.
Codenotti, and Varadarajan. 2004. “Efficient Computation of Equilibrium Prices for Markets with Leontief Utilities.” In ICALP.
Conitzer, and Oesterheld. 2023. “Foundations of Cooperative AI.” Proceedings of the AAAI Conference on Artificial Intelligence.
Critch. 2017. “Toward Negotiable Reinforcement Learning: Shifting Priorities in Pareto Optimal Sequential Decision-Making.”
Critch, Dennis, and Russell. 2022. “Cooperative and Uncooperative Institution Designs: Surprises and Problems in Open-Source Game Theory.”
Dafoe, Bachrach, Hadfield, et al. 2021. “Cooperative AI: machines must learn to find common ground.” Nature.
Dafoe, Hughes, Bachrach, et al. 2020. “Open Problems in Cooperative AI.”
Degroot. 1974. “Reaching a Consensus.” Journal of the American Statistical Association.
Deng, Papadimitriou, and Safra. 2002. “On the Complexity of Equilibria.” In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing. STOC ’02.
Di Lorenzo, and Sayed. 2013. “Sparse Distributed Learning Based on Diffusion Adaptation.” IEEE Transactions on Signal Processing.
Dong, Li, Yang, et al. 2024. “Egoism, Utilitarianism and Egalitarianism in Multi-Agent Reinforcement Learning.” Neural Networks.
Du, and Ding. 2021. “A Survey on Multi-Agent Deep Reinforcement Learning: From the Perspective of Challenges and Applications.” Artificial Intelligence Review.
Duque, Aghajohari, Cooijmans, et al. 2025. “Advantage Alignment Algorithms.” In.
Fickinger, Zhuang, Hadfield-Menell, et al. 2020. “Multi-Principal Assistance Games.”
Foerster, J. 2018. “Deep Multi-Agent Reinforcement Learning.”
Foerster, Jakob, Chen, Al-Shedivat, et al. 2018. “Learning with Opponent-Learning Awareness.” In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems. AAMAS ’18.
Foerster, Jakob, Farquhar, Afouras, et al. 2018. “Counterfactual Multi-Agent Policy Gradients.” Proceedings of the AAAI Conference on Artificial Intelligence.
Franzmeyer, Malinowski, and Henriques. 2021. “Learning Altruistic Behaviours in Reinforcement Learning Without External Rewards.” In.
Freeman, Yang, and Lynch. 2006. “Stability and Convergence Properties of Dynamic Average Consensus Estimators.” In 2006 45th IEEE Conference on Decision and Control.
Galesic, Barkoczi, Berdahl, et al. 2022. “Beyond Collective Intelligence: Collective Adaptation.”
Galesic, Barkoczi, and Katsikopoulos. 2018. “Smaller Crowds Outperform Larger Crowds and Individuals in Realistic Task Conditions.” Decision.
Gronauer, and Diepold. 2022. “Multi-Agent Deep Reinforcement Learning: A Survey.” Artificial Intelligence Review.
Hadfield-Menell, Dragan, Abbeel, et al. 2016. “Cooperative Inverse Reinforcement Learning.” In Proceedings of the 30th International Conference on Neural Information Processing Systems. NIPS’16.
Hadfield-Menell, and Hadfield. 2018. “Incomplete Contracting and AI Alignment.”
Hammond, and Adam-Day. 2025. “Neural Interactive Proofs.” In.
Hammond, Chan, Clifton, et al. 2025. “Multi-Agent Risks from Advanced AI.”
Ha, and Tang. 2022. “Collective Intelligence for Deep Learning: A Survey of Recent Developments.” Collective Intelligence.
Havrylov, and Titov. 2017. “Emergence of Language with Multi-Agent Games: Learning to Communicate with Sequences of Symbols.”
Hernandez-Leal, Kartal, and Taylor. 2019. “A Survey and Critique of Multiagent Deep Reinforcement Learning.” Autonomous Agents and Multi-Agent Systems.
Ho, Kastner, and Wong. 1978. “Teams, Signaling, and Information Theory.” IEEE Transactions on Automatic Control.
Hong, and Page. 2004. “Groups of Diverse Problem Solvers Can Outperform Groups of High-Ability Problem Solvers.” Proceedings of the National Academy of Sciences.
Hu, Com, and Wellman. n.d. “Nash Q-Learning for General-Sum Stochastic Games.”
Hyland, Gavenčiak, Costa, et al. 2024. “Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents.” In.
Ikeda. 1989. “Decentralized Control of Large Scale Systems.” In Three Decades of Mathematical System Theory: A Collection of Surveys at the Occasion of the 50th Birthday of Jan C. Willems. Lecture Notes in Control and Information Sciences.
Jaques, Lazaridou, Hughes, et al. 2019. “Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning.” In Proceedings of the 36th International Conference on Machine Learning.
Jiang, and Lu. 2018. “Learning Attentional Communication for Multi-Agent Cooperation.” In Advances in Neural Information Processing Systems.
Jiang, Su, and Lu. 2024. “Fully Decentralized Cooperative Multi-Agent Reinforcement Learning: A Survey.”
Kalai, and Lehrer. 1993. “Rational Learning Leads to Nash Equilibrium.” Econometrica.
Kayaalp, Bordignon, Vlaski, et al. 2021. “Hidden Markov Modeling over Graphs.” arXiv:2111.13626 [Cs, Eess].
Kirilenko, Kyle, Samadi, et al. 2011. “The Flash Crash: The Impact of High Frequency Trading on an Electronic Market.” SSRN Electronic Journal.
Laidlaw, Bronstein, Guo, et al. 2025. “AssistanceZero: Scalably Solving Assistance Games.” In Workshop on Bidirectional Human↔︎AI Alignment.
Lalitha, Javidi, and Sarwate. 2014. “Social Learning and Distributed Hypothesis Testing.” arXiv:1410.4307 [Cs, Math, Stat].
Lee, Leibo, An, et al. 2022. “Importance of prefrontal meta control in human-like reinforcement learning.” Frontiers in Computational Neuroscience.
Levin. 2019. “The Computational Boundary of a ‘Self’: Developmental Bioelectricity Drives Multicellularity and Scale-Free Cognition.” Frontiers in Psychology.
Lian, Bisazza, and Verhoef. 2021. “The Effect of Efficient Messaging and Input Variability on Neural-Agent Iterated Language Learning.”
Linial. 1994. “Game-Theoretic Aspects of Computing.” In Handbook of Game Theory with Economic Applications.
Lopes, and Sayed. 2007. “Incremental Adaptive Strategies over Distributed Networks.” IEEE Transactions on Signal Processing.
———. 2008. “Diffusion Least-Mean Squares over Adaptive Networks: Formulation and Performance Analysis.” IEEE Transactions on Signal Processing.
Lowe, Foerster, Boureau, et al. 2019. “On the Pitfalls of Measuring Emergent Communication.”
Lowe, Wu, Tamar, et al. 2020. “Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.”
Lyons, and Levin. 2024. “Cognitive Glues Are Shared Models of Relative Scarcities: The Economics of Collective Intelligence.”
Mann, and Helbing. 2016. “Minorities Report: Optimal Incentives for Collective Intelligence.” arXiv:1611.03899 [Cs, Math, Stat].
Mateo, Horsevad, Hassani, et al. 2019. “Optimal Network Topology for Responsive Collective Behavior.” Science Advances.
Meulemans, Kobayashi, Oswald, et al. 2024. “Multi-Agent Cooperation Through Learning-Aware Policy Gradients.” In.
Meulemans, Nasser, Wołczyk, et al. 2025. “Embedded Universal Predictive Intelligence: A Coherent Framework for Multi-Agent Learning.”
Mordvintsev, Randazzo, Niklasson, et al. 2020. “Growing Neural Cellular Automata.” Distill.
Mu, Guo, Chen, et al. 2024. “Multi-Agent, Human-Agent and Beyond: A Survey on Cooperation in Social Dilemmas.”
Navlakha, and Bar-Joseph. 2014. “Distributed Information Processing in Biological and Computational Systems.” Communications of the ACM.
Ohsawa. 2021. “Unbiased Self-Play.” arXiv:2106.03007 [Cs, Econ, Stat].
Olfati-Saber, R. 2005. “Distributed Kalman Filter with Embedded Consensus Filters.” In 44th IEEE Conference on Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC ’05.
Olfati-Saber, R. 2006. “Flocking for Multi-Agent Dynamic Systems: Algorithms and Theory.” IEEE Transactions on Automatic Control.
Olfati-Saber, R., Fax, and Murray. 2007. “Consensus and Cooperation in Networked Multi-Agent Systems.” Proceedings of the IEEE.
Olfati-Saber, Reza, Franco, Frazzoli, et al. 2006. “Belief Consensus and Distributed Hypothesis Testing in Sensor Networks.” In Networked Embedded Sensing and Control. Lecture Notes in Control and Information Science.
Oroojlooy, and Hajinezhad. 2023. “A Review of Cooperative Multi-Agent Deep Reinforcement Learning.” Applied Intelligence.
Pan, Gao, Xie, et al. 2024. “Very Large-Scale Multi-Agent Simulation in AgentScope.”
Peysakhovich, and Lerer. 2017. “Prosocial Learning Agents Solve Generalized Stag Hunts Better Than Selfish Ones.”
Qi, Ban, and He. 2023. “Graph Neural Bandits.” In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
Reid, O’Callaghan, Carroll, et al. n.d. “Risk Analysis Techniques for Governed LLM-Based Multi-Agent Systems.”
Ren, and Beard. 2005. “Consensus Seeking in Multiagent Systems Under Dynamically Changing Interaction Topologies.” IEEE Transactions on Automatic Control.
Resnick, Gupta, Foerster, et al. 2020. “Capacity, Bandwidth, and Compositionality in Emergent Language Learning.”
Rosser, and Foerster. 2025. “AgentBreeder: Mitigating the AI Safety Impact of Multi-Agent Scaffolds via Self-Improvement.”
Samuelson. 2001. “Analogies, Adaptation, and Anomalies.” Journal of Economic Theory.
Sayed, Ali. 2014. “Adaptation, Learning, and Optimization over Networks.” Foundations and Trends® in Machine Learning.
Sayed, Ali H. 2014. “Adaptive Networks.” Proceedings of the IEEE.
Sharma, Sugandha, Davidson, Khetarpal, et al. 2024. “Toward Human-AI Alignment in Large-Scale Multi-Player Games.”
Sharma, Piyush K., Fernandez, Zaroukian, et al. 2021. “Survey of Recent Multi-Agent Reinforcement Learning Algorithms Utilizing Centralized Training.” In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III.
Smith, and Price. 1973. “The Logic of Animal Conflict.” Nature.
Spanos, Olfati-Saber, and Murray. 2005. “Dynamic Consensus on Mobile Networks.” In IFAC World Congress.
Stiglitz. 2006. “The Contributions of the Economics of Information to Twentieth Century Economics.” The Quarterly Journal of Economics.
Suarez. 2024. “Neural MMO: Massively Multiagent Simulation and Learning.”
Suárez, Du, Isola, et al. 2019. “Neural MMO: A Massively Multiagent Game Environment for Training and Evaluating Intelligent Agents.”
Suárez, Isola, Choe, et al. 2023. “Neural MMO 2.0: A Massively Multi-Task Addition to Massively Multi-Agent Learning.”
Tan, and Abramsky. 2022. “Institutions under composition.”
Tarai, and Bit, eds. 2021. Neurocognitive Perspectives of Prosocial and Positive Emotional Behaviours: Theory to Application.
Tennant, Hailes, and Musolesi. 2023. “Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning.” In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence.
Tumer, and Wolpert. 2004. “Coordination in Large Collectives- Chapter 1.” In.
Walters, Kaufmann, Sefas, et al. 2025. “Free Energy Risk Metrics for Systemically Safe AI: Gatekeeping Multi-Agent Study.”
Wang, Shengdi, and Dekorsy. 2020. “A Factor Graph-Based Distributed Consensus Kalman Filter.” IEEE Signal Processing Letters.
Wang, Hanlin, and Rubenstein. 2020. “Shape Formation in Homogeneous Swarms Using Local Task Swapping.” IEEE Transactions on Robotics.
Wolpert, David H. 2006a. “Advances in Distributed Optimization Using Probability Collectives.” Advances in Complex Systems.
———. 2006b. “Information Theory — The Bridge Connecting Bounded Rational Game Theory and Statistical Physics.” In Complex Engineered Systems. Understanding Complex Systems.
Wolpert, David H, Bieniawski, and Rajnarayan. 2011. “Probability Collectives in Optimization.”
Wolpert, David H, and Lawson. 2002. “Designing Agent Collectives for Systems with Markovian Dynamics.” In.
Wolpert, David H., and Tumer. 1999. “An Introduction to Collective Intelligence.” arXiv:cs/9908014.
Wolpert, David H, Wheeler, and Tumer. 1999. “General Principles of Learning-Based Multi-Agent Systems.” In.
———. 2000. “Collective Intelligence for Control of Distributed Dynamical Systems.” EPL (Europhysics Letters).
Wulfmeier, Ondruska, and Posner. 2016. “Maximum Entropy Deep Inverse Reinforcement Learning.”
Yang, Luo, Li, et al. 2018. “Mean Field Multi-Agent Reinforcement Learning.” In Proceedings of the 35th International Conference on Machine Learning.
Ye. 2008. “A Path to the Arrow–Debreu Competitive Market Equilibrium.” Mathematical Programming.
Zhang, and Zhu. 2017. “Game-Theoretic Design of Secure and Resilient Distributed Support Vector Machines with Adversaries.” arXiv:1710.04677 [Cs, Stat].
Zhao, and Sayed. 2014. “Asynchronous Adaptation and Learning over Networks — Part I: Modeling and Stability Analysis.” arXiv:1312.5434 [Cs, Math].
Zivojevic, Delalic, Raca, et al. 2021. “Distributed Weighted Least-Squares and Gaussian Belief Propagation: An Integrated Approach.” Preprint.