2 Mixed-Motive Settings
Required Content 2hrs • All Content 3hrs 45mins
As mentioned in the introductory section, cooperative AI uses many tools and concepts from game theory.
In game theory, multi-agent settings can be classified into three categories:
- Fully cooperative: all agents have shared objectives, where what is “good” for one agent is always equally good for another agent
- Fully competitive: all agents have opposing objectives, where what is “good” for one agent is always bad for all the others (also referred to as zero-sum games)
- Mixed-motive: agents have some overlap in objectives but also some conflicting interests.
One way to frame cooperative AI is that it deals with AI agents in mixed-motive settings. While the description of mixed-motive settings above might feel a bit abstract, virtually all interactions you have with other human beings could be described as mixed-motive. In a family, a sports team, a classroom or a workplace, different people have goals that are neither perfectly aligned with each other nor perfectly opposed.
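To make the three categories a bit more tangible, here is a minimal illustrative sketch (not taken from any of the readings) using 2×2 payoff matrices: a pure coordination game for the fully cooperative case, matching pennies for the fully competitive (zero-sum) case, and the Prisoner's Dilemma for the mixed-motive case. The specific payoff values and the `classify` helper are our own assumptions, chosen purely for illustration.

```python
import numpy as np

# Payoff matrices for two players; entry [i, j] is the payoff when
# player 1 picks action i and player 2 picks action j.

# Fully cooperative: both players receive identical payoffs.
coordination_p1 = np.array([[1, 0],
                            [0, 1]])
coordination_p2 = coordination_p1.copy()

# Fully competitive (zero-sum): one player's gain is the other's loss.
matching_pennies_p1 = np.array([[ 1, -1],
                                [-1,  1]])
matching_pennies_p2 = -matching_pennies_p1

# Mixed-motive: interests partly overlap (mutual cooperation beats mutual
# defection) but partly conflict (each is tempted to defect unilaterally).
prisoners_dilemma_p1 = np.array([[3, 0],
                                 [5, 1]])
prisoners_dilemma_p2 = prisoners_dilemma_p1.T

def classify(p1, p2):
    """Rough classification of a two-player matrix game by its payoffs."""
    if np.array_equal(p1, p2):
        return "fully cooperative (identical payoffs)"
    if np.array_equal(p1, -p2):
        return "fully competitive (zero-sum)"
    return "mixed-motive"

for name, (p1, p2) in {
    "coordination": (coordination_p1, coordination_p2),
    "matching pennies": (matching_pennies_p1, matching_pennies_p2),
    "prisoner's dilemma": (prisoners_dilemma_p1, prisoners_dilemma_p2),
}.items():
    print(f"{name}: {classify(p1, p2)}")
```

This check is deliberately crude: real settings are rarely exactly common-payoff or exactly zero-sum, which is part of why the mixed-motive category is so broad.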
By the end of the section, you should be able to:
- Categorise multi-agent settings into fully cooperative, fully competitive, and mixed-motive.
- Explain why mixed-motive settings are a primary focus of cooperative AI.
- Discuss the reasons why future AI agents might exhibit more "selfish" behaviours.
- Place various real-world scenarios on a scale from most cooperative to most competitive.
The following piece, “A Review of Cooperation in Multi-agent Learning”, introduces cooperation between learning agents and contrasts fully cooperative settings with mixed-motive settings, using some real-world examples. Section 5, ‘Cooperation with Mixed Motivation’, goes into more technical detail and is optional reading.
1 Introduction, 3.3 Mixed-motive games
List 4 mixed-motive, 4 fully cooperative and 4 fully competitive multi-agent settings.
Can you think of an example of a multi-agent setting that you don't know which category to place in, or you think doesn't fall into any of the three categories?
5 Cooperation with Mixed Motivation
Consider the solution approaches presented in section 5 of ‘A Review of Cooperation in Multi-agent Learning’. Write a one-sentence summary of the idea behind each of these suggested solutions. Considering real-world cooperation problems among future AI agents, which solution approach do you consider to be most promising?
The field of cooperative AI is primarily focused on mixed-motive settings, partly because most real-world interactions are best described as such, and partly because these settings give rise to particularly important and difficult-to-solve problems. By comparison, the challenge of getting AI agents to perform well in fully cooperative scenarios is more likely to be addressed by commercial incentives anyway. The following piece, section 2.2 of ‘Multi-Agent Risks from Advanced AI’, provides a more in-depth explanation of why mixed-motive interactions are such an important aspect of multi-agent AI safety.
A potential objection to the focus on mixed-motive settings is that we could simply design agents to be friendly, altruistic and cooperative. Game theory is inspired by individualistic human behaviour, and there is no law of nature dictating that AI agents must behave like humans. While this is certainly true, there are also clear financial incentives to develop AI agents that are not just friendly and helpful, but also strategic and robust to attempts at exploitation.
The following blog post from Anthropic recounts their experiment using Claude to run a small shop in their office. When you read it, reflect on how this situation compares to what you have learned about mixed-motive settings.
All parts
What kind of multi-agent setting is Claudius in? What behaviour of Claudius can you point to that is atypical of that setting? If Claudius were fine-tuned to perform better in this scenario (from the researchers' perspective), how do you think it would behave differently?
While current LLMs are very accommodating and not very strategic, meaning we are not yet observing salient social-dilemma-type dynamics resulting in harmful outcomes, there are good reasons to think that this will change. The next piece of content outlines some of the reasons we might expect future AI agents to behave in a more “selfish” way:
As may be obvious at this point, “mixed-motive interactions” encompasses a wide range of dynamics, from largely cooperative ones to ones where competitive forces dominate. You could, for example, imagine settings where the ultimate goals of all agents are almost fully aligned, but incomplete information or information asymmetry makes their interactions take on mixed-motive characteristics rather than fully cooperative ones.
Place the following on a scale of most cooperative to most competitive:
- A street market of buyers and sellers.
- Robots fulfilling orders to move stock around a warehouse.
- Two groups of academic researchers each working on projects to find a cure for a particular disease.
- A company consisting of one manager and two employees.
- A group of people all fishing from the same lake, each to feed themselves and their family.
- Two friends who want to meet in town but can't communicate. Both would prefer to meet over not meeting, but each has slightly opposing preferences about where to meet.
Roads shared by self-driving and human-operated cars are an incredibly popular multi-agent system to study. Aside from saturation, why might the field of Cooperative AI not be that focussed on these multi-agent systems, considering what you've learnt in this section?
Consider one of the example multi-agent systems you have come across so far in this curriculum, e.g. a group of people all fishing from the same lake, each to feed themselves and their family. Formulate it as a Markov game. You don't need to go into a huge amount of detail, and note that there is no single way to do this: your formulation will depend on the assumptions you make about the scenario. How have you modelled (or how would you model) the differences in the information available to each agent in the scenario? If you're unsure, search terms like “partially observable” or “imperfect information” might help.
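If it helps, the sketch below recalls one common way of writing down an n-player Markov (stochastic) game; the notation and the fishing-lake mappings in the comments are assumptions on our part, and other sources use slightly different conventions.

```latex
% One common formulation of an n-player Markov (stochastic) game
\[
  \mathcal{G} = \big( \mathcal{N},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{N}},\ P,\ \{R_i\}_{i \in \mathcal{N}},\ \gamma \big)
\]
% N           -- the set of agents, e.g. the individual fishers
% S           -- the set of environment states, e.g. the current fish stock in the lake
% A_i         -- the action set of agent i, e.g. how much to fish this season
% P(s'|s,a)   -- the state-transition function over joint actions a = (a_1, ..., a_n)
% R_i(s,a)    -- the reward function of agent i, e.g. fish caught minus effort expended
% gamma       -- a discount factor in [0, 1)
%
% A partially observable variant additionally gives each agent its own
% observation set and observation function, which is one standard way to
% model differences in the information available to each agent.
```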
Try to develop a taxonomy that further subdivides multi-agent interactions beyond the three categories we've presented. It might be helpful to start from the standard roster of games from game theory, and/or to think about scenarios that seem to sit at the intersection of the types we've presented.
As mentioned in the introduction to this section, the term “mixed-motive” comes from game theory. While game theory provides important tools for cooperative AI research, it is by no means the only approach for studying multi-agent AI safety. The next section will focus on a different perspective that instead draws more on tools from complex systems theory.
