Understanding the system-level properties and dynamics of populations of advanced AI agents is a foundational scientific challenge. It requires characterising how the properties of individual agents – their capabilities, objectives, and behavioural dispositions – contribute to population-level outcomes, as well as how the structure and dynamics of agent networks give rise to emergent vulnerabilities, failures, and collective behaviours. Of particular importance are cases where groups of agents form a ‘collective agent’, exhibiting coherent collective ‘goals’, strategies, or capabilities that cannot be predicted from the individual systems in isolation. Without this scientific foundation, individual-level safeguards may fail to account for system-level risks, which reduces our ability to pre-empt, forecast, or diagnose the impacts of larger-scale deployments. We are especially interested in work that combines theoretical insight with empirical evaluation in realistic multi-agent settings.
Please see the guidelines on research areas and out-of-scope topics.
Agrawal, Akash, Soroush Ebadian, and Lewis Hammond (2026). "The Multi-Agent Off-Switch Game". In Proceedings of the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026), Paphos, Cyprus, pp. 4–12.
Jones, Erik, Anca Dragan, and Jacob Steinhardt (2025). "Adversaries Can Misuse Combinations of Safe Models". In Proceedings of the 42nd International Conference on Machine Learning, pp. 28327–28349.
Jørgensen, Frederik Hytting, Sebastian Weichwald, and Lewis Hammond (2025). "Causal Foundations of Collective Agency". arXiv:2605.00248 (CLeaR 2026).
Lee, Donghyun, and Mo Tiwari (2024). "Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems". arXiv:2410.07283.
Motwani, Sumeet Ramesh, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, and Christian Schroeder de Witt (2024). "Secret Collusion among AI Agents: Multi-Agent Deception via Steganography". In Advances in Neural Information Processing Systems 37 (NeurIPS 2024).
Szabo, Claudia, and Yong Meng Teo (2015). "Formalization of Weak Emergence in Multiagent Systems". ACM Transactions on Modeling and Computer Simulation (26:1), Article 6, pp. 1–25.
Tilli, Cecilia Elena (2026). "Agent Properties for Multi-Agent Safety". ICLR 2026 Workshop on AI Agents in the Wild (AIWILD).