5 Cooperative Infrastructure

Required Content 3hrs 45mins • All Content 6hrs 55mins

Humans excel at cooperation in many ways. In addition to our biological, “built-in” prosociality, we have also evolved complex and sophisticated mechanisms that underpin our cooperative abilities, such as:

  • Reward systems (recognising and incentivising good cooperators)
  • Social norms and rules (both formal laws and informal expectations)
  • Reputation systems (we remember and share information about who can be trusted)
  • Punishment mechanisms (from social shaming to legal consequences)

These mechanisms are not properties of individuals, but can instead be thought of as different kinds of cooperative infrastructure or institutions.

By the end of this section, you should be able to:

  • Explain what is meant by centralised and decentralised cooperative infrastructure.
  • Identify examples of cooperative infrastructure in human societies and discuss how similar infrastructure could be applied to multi-agent AI systems.
  • Explain and provide examples of mechanisms which help enforce norms in a multi-agent system.
  • Describe how systems for agent identification and reputation could support cooperation in multi-agent systems made up of both humans and AI agents.

The next resource, section 4.4 of Open Problems in Cooperative AI, introduces this important sub-area of cooperative AI research, which deals with cooperative infrastructure for AI agents. Note that the words “infrastructure” and “institutions” are used somewhat interchangeably in this context.

This kind of work is often inspired by human cooperation, so if you want a bit more background on cooperative infrastructure in human societies, you might also want to read the optional resource, which compares human cooperation with that of other species.

Open Problems in Cooperative AI

4.4. Institutions

Required • 30mins
How is human cooperation different?

All parts

Optional • 1hr 20mins

The distinction between centralised and decentralised infrastructure is important. Centralised infrastructure is often more tangible and easier to grasp; in human societies it is exemplified by a system of laws and law enforcement, or by a religious organisation with clear leadership and an organisational hierarchy. These are typically the things people associate with the word “institutions”.

Exercise

List some examples of centralised and decentralised infrastructure used to improve cooperation among humans. Try to explain what each piece of infrastructure enables, e.g. coordination, compromise, or credible commitment.

Required • 15mins

This takes us to a concept from economics research: mechanism design. The next resource explains how mechanism design theory deals with the engineering of mechanisms - or infrastructure - that make (strategic, economic) agents behave in a way that generates desirable outcomes.

What is mechanism design

All parts

Prerequisites
Required • 20mins

While the video focuses on mechanism design in the context of economic theory for human societies, mechanism design can also be applied to a system of AI agents to make them behave in a desired way - for example, to achieve sustainable cooperative behaviour. An important feature here is that mechanism design is top-down: it assumes that the mechanism designer has sufficient control over the agents’ environment to “set the rules of the game”.
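
To make this concrete, below is a minimal sketch (not taken from any of the resources) of a designer changing the rules of a prisoner’s dilemma: by adding a fine for defection, the designer makes cooperation each agent’s dominant strategy. The payoff numbers and the size of the fine are arbitrary illustrative choices.

```python
# Minimal sketch: a mechanism designer with control over the environment adds a
# fine for defection to a prisoner's dilemma, so that cooperation becomes each
# agent's dominant strategy. Payoffs are (row player, column player); the
# numbers and the size of the fine are arbitrary illustrative choices.

base_game = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def with_defection_fine(game, fine):
    """Return a modified game in which the designer fines every defection."""
    return {
        (a1, a2): (u1 - (fine if a1 == "D" else 0), u2 - (fine if a2 == "D" else 0))
        for (a1, a2), (u1, u2) in game.items()
    }

def row_best_responses(game):
    """The row player's best response to each of the column player's actions."""
    return {a2: max(("C", "D"), key=lambda a1: game[(a1, a2)][0]) for a2 in ("C", "D")}

print(row_best_responses(base_game))                          # defection dominates
print(row_best_responses(with_defection_fine(base_game, 3)))  # cooperation dominates
```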

Exercise

There are different ways to measure aggregate welfare in a system of agents, which you might want to use to compare different pieces of centralised infrastructure or mechanisms against each other. For example, utilitarian welfare equals the sum of all the individuals’ welfare, and egalitarian welfare equals the minimum individual welfare in the system. Assuming we can approximate the welfare of each individual, what are some other measures of welfare you can think of? Do you have a preferred measure, and why?
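
For reference, here is a minimal sketch of the two measures mentioned above, computed from a list of (approximated) individual welfare values; the numbers are just placeholders.

```python
# The two welfare measures mentioned above, computed from a list of
# (approximated) per-agent welfare values. The numbers are placeholders.
welfares = [4.0, 1.5, 7.0, 2.5]

utilitarian_welfare = sum(welfares)  # total welfare across all agents
egalitarian_welfare = min(welfares)  # welfare of the worst-off agent

print(utilitarian_welfare, egalitarian_welfare)  # 15.0 1.5
```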

Required • 15mins
Exercise

Imagine there are 3 candidates, A, B and C, and 3 voters, 1, 2 and 3. Voters cast their votes by ranking the candidates, e.g. A > B > C, and the winner is decided as follows: for each candidate, count the total number of candidates they are preferred to across all the votes (e.g. given the votes A > B > C and C > A > B, A is preferred to 3 candidates in total); the winning candidate is the one with the highest count; in the event of a tie, the candidate earliest in alphabetical order is selected (e.g. if A and C tie, then A wins). A voting rule is strategyproof if no voter is ever strictly incentivised to cast a vote that misrepresents their true preferences, given knowledge of the votes the others will cast (e.g. if voter 3 knows which orderings voters 1 and 2 will cast, then casting their true preference order is among voter 3’s best options). By way of a counterexample, show that the voting rule presented here is not strategyproof.
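
If it helps, here is a minimal sketch of the voting rule described above, which you can use to test candidate counterexamples; the ballots shown are just an illustrative input.

```python
def winner(ballots, candidates=("A", "B", "C")):
    """The voting rule from the exercise: each candidate scores one point for
    every candidate ranked below them on every ballot; the highest total wins,
    with ties broken in favour of the alphabetically first candidate."""
    scores = {c: 0 for c in candidates}
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += len(ballot) - 1 - position
    best = max(scores.values())
    return min(c for c in candidates if scores[c] == best)

# An illustrative election; try varying one voter's ballot while holding the
# other two fixed to search for a profitable misreport.
print(winner([("A", "B", "C"), ("C", "A", "B"), ("B", "C", "A")]))
```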

Optional • 10mins

Notice the parallel between mechanism design and opponent shaping, which was introduced in the previous section. Both are approaches for shaping the policies of other agents, but they rely on different levels of control: in opponent shaping, influence is exerted only through the action space of the shaping agent, while mechanism design relies on a significant level of control over the environment.

The next resource introduces adaptive mechanism design as an approach for promoting cooperation among AI agents.

Adaptive Mechanism Design: Learning to Promote Cooperation

'Abstract', 'Introduction' and 'Related Work'

Required • 15mins

If adaptive mechanism design could be scaled up to important real-world settings - large numbers of complex agents in complex environments - it could be a way to avoid cooperation failures. A problem with this kind of approach, however, is that it requires an entity with some kind of centralised control over the environment. Furthermore, even when such an entity exists, it might not always be motivated to implement cooperation-promoting mechanisms.

Exercise

Social media platforms are environments where there is an entity in control, i.e. the company that owns the platform. How do they typically approach mechanism design? Do they implement mechanisms that promote cooperation? Why, or why not?

Required • 10mins

The next couple of readings demonstrate examples of centralised cooperative infrastructure. The first proposes a mediator that can optimise social welfare, and the second builds on the idea of formal contracting from economics to overcome diverging incentives.

Mediated Multi-Agent Reinforcement Learning

'Abstract' and 'Introduction'

Required • 15mins

If you are interested in seeing how mediators are formalised, we recommend the ‘Problem setup’ section of the above resource.

Mediated Multi-Agent Reinforcement Learning

'Problem setup'

Prerequisites
Optional • 15mins
Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

Introduction

Prerequisites
Required • 10mins

The following optional piece experimentally tests ideas from ‘Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL’ on LLM agents.

Mitigating Generative Agent Social Dilemmas

'Introduction' and 'Discussion'

Prerequisites
Optional • 10mins
Exercise

In which real-world settings could contracts or mediators work and in which might they fail or be impractical to implement?

Required • 10mins

As noted, the ability to actually implement centralised cooperative infrastructure is a key crux for the impact of such solutions, and this is an important reason to also explore decentralised solutions. A key concept here - again inspired by studies of human cooperation - is that of social norms and normative systems. The next piece discusses definitions of normative multi-agent systems and introduces relevant research questions.

Introduction to Normative Multiagent Systems

All parts

Required • 30mins
Exercise

It is well established that the rational choice in a single round of the prisoner’s dilemma is to defect and confess the crime, but imagine now a version where the two prisoners are members of a criminal community with a strong social norm against such behaviour. This changes things: the prisoners are much more likely to cooperate (stay silent) in this scenario. How can this be understood as rational behaviour if both prisoners are egotistical agents optimising only for their own benefit? What do you think would happen if one of them chose to defect? Write out a more representative payoff matrix for this situation given the context.
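
As a starting point, here is the standard one-shot payoff matrix in the confess / stay-silent framing (the exact numbers are just one common convention); the exercise asks how these entries should change once the community’s norm and its enforcement are taken into account.

```python
# Standard one-shot prisoner's dilemma in the confess / stay-silent framing.
# Payoffs are (prisoner 1, prisoner 2), expressed as minus the years in prison;
# the exact numbers are just one common convention.
payoffs = {
    ("silent", "silent"): (-1, -1),
    ("silent", "confess"): (-10, 0),
    ("confess", "silent"): (0, -10),
    ("confess", "confess"): (-5, -5),
}
# For the exercise: adjust these entries to reflect the expected cost of breaking
# the community's norm, e.g. a penalty added to every outcome in which a
# prisoner confesses.
```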

Required • 10mins

As the previous resource noted, “Norms are not necessarily created by a single legislator, they can also emerge spontaneously, or be negotiated among the agents” - this is another way of saying that norms can be established in a decentralised way. While this makes such systems an interesting alternative to centralised approaches, one might wonder how such norms could actually emerge or be established among AI agents. In the remainder of this section we’ll look at the enforcement of decentralised norms and the infrastructure that could support it, and see an example of LLM agents learning to utilise indirect reciprocity to enforce cooperative norms.

Norms always rely on enforcement, typically through different kinds of decentralised sanctioning of norm-breaking behaviour. Remember how cooperation can become sustainable in a prisoner’s dilemma if the game is repeated, as it is possible to punish the opponent for defections. Sanctioning can be thought of as a crowd-sourcing of such punishment, where the interaction might not be repeated with the same specific individual but rather with different individuals of the same group.
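
As a toy illustration of this crowd-sourcing idea, the sketch below (entirely hypothetical, not taken from the paper that follows) simulates agents who consult a shared record of observed defections and sanction anyone listed in it, even if they have never met that agent before.

```python
import random

# Hypothetical sketch of crowd-sourced sanctioning: agents are randomly paired
# for one-shot prisoner's dilemmas, and anyone listed in a shared record of
# observed defections gets defected against (sanctioned) by the whole group.

NUM_AGENTS = 10
always_defect = {3, 7}   # hypothetical norm-breakers
sanction_record = set()  # publicly visible record of reported defections

def choose_action(me, partner):
    if me in always_defect:
        return "D"
    # Norm-followers cooperate, except to sanction recorded norm-breakers.
    return "D" if partner in sanction_record else "C"

for _ in range(200):
    a, b = random.sample(range(NUM_AGENTS), 2)
    act_a, act_b = choose_action(a, b), choose_action(b, a)
    # Defections against a cooperator are reported to the shared record.
    if act_a == "D" and act_b == "C":
        sanction_record.add(a)
    if act_b == "D" and act_a == "C":
        sanction_record.add(b)

print(sanction_record)  # the norm-breakers soon end up being sanctioned by everyone
```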

In the following paper, agents can see a history of all sanctioning events, and each agent learns its own classifier to judge behaviours as “approved” or “disapproved”.

A learning agent that acquires social norms from public sanctions in decentralized multi-agent settings

Introduction

Required • 15mins

We have previously noted the parallels between opponent shaping, where one agent shapes the learning of another agent, and mechanism design, where the mechanism designer shapes the learning and behaviour of one or many agents in an environment. Decentralised norm enforcement can be thought of as a third variation: here, the group collectively shapes the learning of its members.

The next resource is a paper on the emergence of cooperative infrastructure among LLM agents, connecting normative systems to evolutionary game theory. In this work, the authors show how Claude-based agents successfully utilise information given to them about the past behaviour of others and eventually start to exhibit indirect reciprocity.

Cultural Evolution of Cooperation among LLM Agents

'Introduction' and 'Discussion'

Prerequisites
Required • 20mins
Exercise

Think of an example of a norm among humans and answer the following questions: where did the norm come from; how is it represented (formally or otherwise) in the society in which it is active; how is people’s behaviour evaluated for adherence to the norm; and what are the consequences of not adhering to the norm?

Required • 15mins
Exercise

Norms are not fixed in human societies. Describe, with an example, a process by which a norm might be revised or removed. This can be a high-stakes example among thousands of people, or a low-stakes example among friends.

Required • 10mins

Most studies on norms, especially in AI, focus on norms that are very clearly beneficial for cooperation or coordination. The following optional resource makes a case for the functionality of norms that are ostensibly redundant to the overall fitness of a group.

Legible Normativity for AI Alignment: The Value of Silly Rules

"Abstract', 'Introduction', 'What are Silly Rules? A Thought Experiment from Ethnography' and 'Discussion'"

Optional • 30mins

Effective norm enforcement can be supported by complementary, centralised infrastructure. Reliable systems for agent IDs have, for example, been proposed as a solution that could make multi-agent interactions significantly safer, as they would make reputation systems and norm enforcement more tractable. A system for agent IDs could be facilitated centrally, e.g. by governments, and then used for decentralised enforcement of norms, e.g. by agents avoiding interactions with those that don’t have a verifiable ID, or with those who have a history of bad behaviour.
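
As a purely hypothetical sketch of how this might look from a single agent’s perspective, the example below invents a simple registry with verified IDs and publicly reported violations, and an agent policy that refuses to interact with unidentifiable or badly behaved counterparties.

```python
from dataclasses import dataclass

# Purely hypothetical sketch: before interacting, an agent checks a (centrally
# facilitated) registry for a verified ID and a clean public track record.
# The registry, its fields and the threshold are invented for illustration.

@dataclass
class AgentRecord:
    agent_id: str
    id_verified: bool         # e.g. attested via a government-backed ID system
    reported_violations: int  # norm violations reported by other agents

registry = {
    "agent-001": AgentRecord("agent-001", id_verified=True, reported_violations=0),
    "agent-002": AgentRecord("agent-002", id_verified=True, reported_violations=4),
    "agent-003": AgentRecord("agent-003", id_verified=False, reported_violations=0),
}

def willing_to_interact(counterparty_id: str, max_violations: int = 1) -> bool:
    record = registry.get(counterparty_id)
    if record is None or not record.id_verified:
        return False  # decentralised norm: avoid unidentifiable agents
    return record.reported_violations <= max_violations

print([aid for aid in registry if willing_to_interact(aid)])  # ['agent-001']
```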

The final resource of this section is a short talk on why agent infrastructure in general, and agent IDs in particular, could be important for AI safety. The talk takes a broader safety perspective, not focusing only on multi-agent safety. If you want to go more in depth, we also include a paper by the same author as an optional resource.

You Should Work on Agent Infrastructure

All parts

Required • 5mins
IDs for AI Systems

All parts

Optional • 5mins
Exercise

There are three main types of approaches to agent training and deployment, the terminology for which stems from multi-agent reinforcement learning: centralised training and execution, centralised training for decentralised execution, and decentralised training and execution. (For an explanation of these categories, we would recommend page 2 of An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning.) You were introduced to various training schemes in the last chapter, which fall predominantly into the latter two categories. What are the key technical challenges in setting up truly centralised training programs among AI agents produced by multiple different companies?

Required • 20mins