We organise this call into four sections, corresponding to the following research clusters:
- Sandboxes and Testbeds address the first major bottleneck: without realistic, reproducible multi-agent environments, progress on the remaining sections is hard to evaluate or compare.
- The Science of Agent Networks focuses on the safety-relevant properties of interacting agent populations: how collective capabilities emerge and scale, how networks of agents fail or become volatile, and how dangerous population-level properties can be detected.
- Strengthening Agent Infrastructure concerns the evaluation and stress-testing of the technical primitives – identity, verifiability, reputation, communication, commitment – on which trustworthy multi-agent interactions will depend.
- Multi-Agent Oversight and Control covers the detection, attribution, security, and intervention methods needed to keep deployed agent populations safe at scale.
We expect work in the later clusters to build on work in the earlier ones. Proposals may therefore target one cluster or span several, but we will prioritise those that favour depth over breadth. In particular, we stress the importance of realistic sandboxes and testbeds (Section 1) for enabling scientific progress across the broader agenda. Where appropriate, we also encourage collaborations between teams addressing Section 1 and those addressing other sections, and welcome suggestions or requests for sandboxes and testbeds from those submitting proposals under other sections.