Fellows' Spotlight: Niklas Lauffer

Tell us about yourself.

I got my research start as an undergraduate at UT Austin, where I was fortunate to work with Ufuk Topcu on topics surrounding safety in autonomous systems. That experience introduced me to the technical challenges around building AI systems that can act reliably in complex, uncertain environments, and also gave me my first exposure to multi-agent systems. I then came to UC Berkeley for my PhD, where I worked with Stuart Russell and Sanjit Seshia at the Center for Human-Compatible AI. Throughout my PhD, my research has focused on how to train AI agents that are safe, robust, and cooperative in open-ended interactions with humans, other AI systems, and unknown actors. Along the way, I also had the opportunity to intern at NASA Ames Research Center and Scale AI. After finishing my PhD, I’ll be joining Google DeepMind as a Research Scientist, where I’m excited to continue working on agentic safety.

What problem are you trying to solve in your research, and what impact do you think it’ll have?

My research focuses on AI safety, especially in open-ended, multi-agent interactions. I think a lot about robustness in multi-agent systems: how agents fail when other agents’ behaviour falls out of distribution, and how we can better detect these failures before they happen. As AI systems become more agentic, they will increasingly need to coordinate, negotiate, and interact with humans and other AI systems over long horizons. I hope my work can help make these interactions safer and more reliable, particularly in high-stakes settings where failures could have a cascading effect within a multi-agent system.

What have you enjoyed most about the Cooperative AI PhD Fellowship?

The best part of the Cooperative AI PhD Fellowship has been the community. Cooperative AI is still a relatively small subfield, so it can be challenging to find people who are thinking about similar problems and working on similar technical approaches. The fellowship cohort has been amazing: we would meet up at conferences, stay in touch over Zoom, and share ideas and updates about the projects we’re working on. Having that intellectual community supported my research and gave me a source of feedback and new ideas.

What would you say to someone considering a career in cooperative AI but unsure where to start?

As with any research area, finding mentors and collaborators is incredibly important. This can be challenging in cooperative AI because the subfield is still relatively small, so you often need to make an active effort to find the right people. Established communities like CAIF are a great place to start, but a thoughtfully written email to a graduate student or early-career researcher working in the area can also go a long way. I would encourage people to reach out generously to build their network and focus on concrete projects where they can start contributing.

What are your plans for the future after you finish your PhD?

After finishing my PhD, I’ll be joining Google DeepMind as a Research Scientist working on agentic safety: enforcing safety guarantees on LM agents that interact with and take actions in the real world. I see multi-agent interaction as a critical component of agentic safety, since safety in the real world necessitates interaction with humans and other agents. I’m especially excited about the opportunity to apply my AI safety research to systems used by billions of people around the world.

What do you think are some of the most important things to work on in cooperative AI in the next few years?

I think cooperative AI could have a huge impact by helping move the broader field beyond the static benchmarks that currently dominate AI evaluation, including safety evaluation. Static evaluations are too easily gamed and optimised towards, and they are disconnected from the unpredictability of real-world conditions. Cooperative AI has always emphasised that to understand multi-agent risks, we need to study dynamic, evolving behaviour in systems of interacting agents. Expanding this perspective and bringing its lessons to the rest of AI safety to build dynamic evaluations could be one of the field’s highest-impact contributions in the next few years.

July 2, 2026

Natàlia Fernández Ashman