5 Cooperative Infrastructure
Required Content 3hrs 45mins • All Content 6hrs 55mins
Humans excel at cooperation in many ways. In addition to our biological “built-in” prosociality, we have evolved complex and sophisticated mechanisms that underpin our cooperative abilities, such as:
- Reward systems (recognising and incentivising good cooperators)
- Social norms and rules (both formal laws and informal expectations)
- Reputation systems (we remember and share information about who can be trusted)
- Punishment mechanisms (from social shaming to legal consequences)
These mechanisms are not properties of individuals, but can instead be thought of as different kinds of cooperative infrastructure or institutions.
By the end of this section, you should be able to:
- Explain what is meant by centralised and decentralised cooperative infrastructure.
- Identify examples of cooperative infrastructure in human societies and discuss how similar infrastructure could be applied to multi-agent AI systems.
- Explain and provide examples of mechanisms which help enforce norms in a multi-agent system.
- Describe how systems for agent identification and reputation could support cooperation among AI-Human multi-agent systems.
The next resource, section 4.4 of Open Problems in Cooperative AI, introduces this important sub-area of cooperative AI research which deals with cooperative infrastructure for AI agents. Note that the words infrastructure and institutions are used somewhat interchangeably in this context.
This kind of work is often inspired by human cooperation, so if you want to gain a bit more background on cooperative infrastructure in human societies, you might also want to read the optional resource which compares human cooperation to that of other natural species.
The distinction between centralised and decentralised infrastructure is important. Centralised infrastructure can often be more tangible and easier to grasp; in human societies this could be exemplified by a system of laws and law enforcement or a religious organisation with clear leadership and organisational hierarchies. These are typically things that people associate with the word institutions.
This takes us to a concept from economics research: mechanism design. The next resource explains how mechanism design theory deals with the engineering of mechanisms - or infrastructure - that make (strategic, economic) agents behave in a way that generates desirable outcomes.
While the video focuses on mechanism design in the context of economic theory for human societies, mechanism design can also be applied to a system of AI agents to make them behave in a desired way - for example, to achieve sustainable cooperative behaviour. An important feature here is that mechanism design is top-down and assumes that the mechanism designer has sufficient control over the agents’ environment to “set the rules of the game”.
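To make the idea concrete, here is a minimal Python sketch of one classic mechanism, the second-price (Vickrey) auction, in which bidding one's true valuation is each agent's best strategy. The auction is offered here only as an illustration of "setting the rules of the game"; the agent names and bids are made up and not taken from the video.

```python
# Minimal sketch of a second-price (Vickrey) auction: a classic mechanism
# in which bidding one's true valuation is a dominant strategy.
# The agents and bid values below are purely illustrative.

def second_price_auction(bids):
    """bids: dict mapping agent name -> bid. Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price

# If every agent bids truthfully, the item goes to the agent who values it most,
# and no agent can gain by misreporting their valuation.
bids = {"agent_a": 10.0, "agent_b": 7.5, "agent_c": 4.0}
print(second_price_auction(bids))  # ('agent_a', 7.5)
```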
There are different ways to measure aggregate welfare in a system of agents, which you might want to use to compare different centralised infrastructures or mechanisms. For example, utilitarian welfare equals the sum of all the individuals’ welfare, while egalitarian welfare equals the minimum individual welfare in the system. Assuming we can approximate the welfare of each individual, what are some other measures of welfare you can think of? Do you have a preferred measure, and why?
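As a reference point for the questions above, here is a small sketch of the two measures just mentioned; the individual welfare values are made up for illustration.

```python
# Two standard ways of aggregating individual welfare into a single score.
# The welfare values below are made up for illustration.

welfare = [3.0, 5.0, 1.0, 4.0]  # approximate welfare of each individual agent

utilitarian = sum(welfare)   # total welfare across the system
egalitarian = min(welfare)   # welfare of the worst-off individual

print(utilitarian, egalitarian)  # 13.0 1.0
```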
Imagine there are 3 candidates, A, B and C, and 3 voters, 1, 2 and 3. Voters cast their votes by ranking the candidates, e.g. A > B > C, and the winner is decided as follows: for each candidate, count up the number of candidates they are preferred to across all the ballots (e.g. given the votes A > B > C and C > A > B, A is preferred to 3 candidates in total); the winning candidate is the one preferred to the highest number of candidates; in the event of a tie, the tied candidate that comes first alphabetically is selected (e.g. if A and C tie, then A wins). A voting rule is strategyproof if no voter is ever strictly incentivised to cast a vote that misrepresents their true preferences, given knowledge of the votes the others will cast; e.g. if voter 3 knows which orderings voters 1 and 2 will cast, then casting their true preference order is among voter 3's best options. By way of a counterexample, show that the voting rule presented here is not strategyproof.
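If you want to experiment, here is a small Python sketch of the voting rule described above; the example ballots are arbitrary. Try holding two ballots fixed and varying the third.

```python
# Sketch of the voting rule described above: each candidate scores one point for
# every candidate ranked below them on every ballot; the highest score wins,
# with ties broken alphabetically. Ballots list candidates from most to least preferred.

def winner(ballots):
    scores = {}
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            # This candidate is preferred to everyone ranked after them on this ballot.
            scores[candidate] = scores.get(candidate, 0) + (len(ballot) - 1 - position)
    best = max(scores.values())
    return min(c for c, s in scores.items() if s == best)  # alphabetical tie-break

# Example: vary the third ballot to explore whether truthful voting is always best.
print(winner([["A", "B", "C"], ["C", "A", "B"], ["B", "C", "A"]]))  # 'A'
```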
Notice the parallel between mechanism design and opponent shaping, which was introduced in the previous section. Both are approaches for shaping the policies of agents, but they rely on different levels of control - in opponent shaping, influence is exerted only through the action space of the shaping agent, while mechanism design relies on a significant level of control over the environment.
The next resource introduces adaptive mechanism design as an approach for promoting cooperation among AI agents.
If adaptive mechanism design could be scaled up to important real-world settings - large numbers of complex agents in complex environments - it could be a way to avoid cooperation failures. A problem with this kind of approach, however, is that it requires some entity with centralised control over the environment. Furthermore, even when such an entity exists, it might not always be motivated to implement cooperation-promoting mechanisms.
The next couple of readings demonstrate examples of centralised cooperative infrastructure. The first one proposes a mediator that can optimise social welfare and the second builds on the idea of formal contracting from economics to overcome diverging incentives.
If you are interested in seeing how mediators are formalised, we recommend the ‘Problem setup’ section of the above resource.
The following optional piece experimentally tests ideas from ‘Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL’ on LLM-agents.
As noted, the ability to actually implement centralised cooperative infrastructure is a key crux for the impact of such solutions, and this is an important reason to also explore decentralised approaches. A key concept here - again inspired by studies of human cooperation - is social norms and normative systems. The next piece discusses definitions of normative multi-agent systems and introduces relevant research questions.
It is well established that the rational choice in a single round of the prisoner’s dilemma is to defect and confess the crime, but imagine now a version where the two prisoners are members of a criminal community with a strong social norm against such behaviour. This changes things - the prisoners are much more likely to cooperate (stay silent) in this scenario. How can this be understood as rational behaviour if both prisoners are purely self-interested agents optimising only for their own benefit? What do you think would happen if one of them chose to defect? Write out a more representative payoff matrix for this situation given the context.
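As a starting point for the exercise, here is the standard prisoner’s dilemma payoff matrix in code; the specific numbers are a common textbook choice rather than anything from the readings. Your task is to modify the entries to reflect the norm and its sanctions.

```python
# Standard prisoner's dilemma payoffs (years in prison as negative numbers, so
# higher is better; these particular numbers are a common textbook choice).
# Entries are (row player's payoff, column player's payoff).

pd_payoffs = {
    ("silent", "silent"):   (-1, -1),
    ("silent", "confess"):  (-10, 0),
    ("confess", "silent"):  (0, -10),
    ("confess", "confess"): (-5, -5),
}

# Exercise: adjust these entries to include the expected cost of community
# sanctions for confessing, then check which strategy becomes rational.
```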
As the previous resource noted, “Norms are not necessarily created by a single legislator, they can also emerge spontaneously, or be negotiated among the agents” - this is another way of saying that norms can be established in a decentralised way. While this makes such systems an interesting alternative to centralised approaches, one might wonder how such norms could actually emerge or be established among AI agents. In the remainder of this section we’ll look at the enforcement of decentralised norms and the infrastructure that could support it, and see an example of LLM agents learning to use indirect reciprocity to enforce cooperative norms.
Norms always rely on enforcement, typically through different kinds of decentralised sanctioning of norm-breaking behaviour. Remember how cooperation can become sustainable in a prisoner’s dilemma if the game is repeated, as it is possible to punish the opponent for defections. Sanctioning can be thought of as a crowd-sourcing of such punishment, where the interaction might not be repeated with the same specific individual but rather with different individuals of the same group.
In the following paper, agents can see a history of all sanctioning events, and each learns its own classifier to judge behaviours as “approved” or “disapproved”.
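As a rough illustration of the general idea (not the paper's actual algorithm), an agent could classify behaviours from the public sanction history, for example by disapproving of any behaviour that has been sanctioned in most observed cases. The behaviour names below are invented.

```python
# Rough illustration only, not the method from the paper: classify behaviours
# using the publicly visible history of sanctioning events.
from collections import defaultdict

def learn_classifier(sanction_history):
    """sanction_history: list of (behaviour, was_sanctioned) pairs observed by the agent."""
    counts = defaultdict(lambda: [0, 0])  # behaviour -> [times sanctioned, times observed]
    for behaviour, was_sanctioned in sanction_history:
        counts[behaviour][0] += int(was_sanctioned)
        counts[behaviour][1] += 1
    # Disapprove of behaviours that were sanctioned in more than half of observed cases.
    return {b: ("disapproved" if s / n > 0.5 else "approved") for b, (s, n) in counts.items()}

history = [("take_extra_resources", True), ("take_extra_resources", True), ("share_resources", False)]
print(learn_classifier(history))  # {'take_extra_resources': 'disapproved', 'share_resources': 'approved'}
```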
We have previously noted the parallels between opponent shaping, where one agent shapes the learning of another agent, and mechanism design, where the mechanism designer shapes the learning and behaviour of one or many agents in an environment. Decentralised norm enforcement can be thought of as a third variation; here, the group collectively shapes the learning of the members.
The next resource is a paper on the emergence of cooperative infrastructure among LLM agents, connecting normative systems to evolutionary game theory. In this work, the authors show how Claude-based agents successfully use information given to them about the past behaviour of others and eventually start to exhibit indirect reciprocity.
Focus on the 'Introduction' and 'Discussion' sections.
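One simple formalisation of indirect reciprocity is “image scoring”, sketched below. This is a hedged illustration of the general idea rather than the setup used in the paper, and the agent names and scores are invented.

```python
# Minimal sketch of indirect reciprocity via image scoring (illustrative only,
# not the exact setup from the paper): agents cooperate with partners whose
# reputation, earned through past behaviour towards others, is high enough.

reputation = {"agent_a": 2, "agent_b": -1}  # hypothetical public reputation scores

def choose_action(partner, threshold=0):
    return "cooperate" if reputation.get(partner, 0) >= threshold else "defect"

def update_reputation(agent, action):
    # Cooperating raises an agent's standing; defecting lowers it.
    reputation[agent] = reputation.get(agent, 0) + (1 if action == "cooperate" else -1)

action = choose_action("agent_b")   # agent_b's low reputation invites defection
update_reputation("agent_a", action)
print(action, reputation)
```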
Think of an example of a norm among humans and answer the following questions:
- Where did the norm come from?
- How is it represented (formally or otherwise) in the society in which it is active?
- How is people's behaviour evaluated for adherence to the norm?
- What are the consequences of not adhering to the norm?
Most studies on norms, especially in AI, are focussed on norms that are very clearly beneficial for cooperation or coordination. The following optional resource makes a case for the functionality of norms that are ostensibly redundant to the overall fitness of a group.
Effective norm enforcement can be supported by complementary, centralised infrastructure. Reliable systems for agent IDs have, for example, been proposed as a solution that could make multi-agent interactions significantly safer, as they would make reputation systems and norm enforcement more tractable. A system for agent IDs could be facilitated centrally, e.g. by governments, and then used for decentralised enforcement of norms, e.g. by agents avoiding interactions with those that don’t have a verifiable ID, or with those who have a history of bad behaviour.
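To make this concrete, here is a hypothetical sketch of how a centrally maintained ID registry could be used by individual agents for decentralised norm enforcement. All names, fields, and thresholds are invented for illustration and are not from the talk or the paper.

```python
# Hypothetical sketch: a central registry issues verifiable agent IDs, and
# individual agents use it in a decentralised way to decide whom to interact with.
# All names and fields here are invented for illustration.
from dataclasses import dataclass

@dataclass
class AgentRecord:
    agent_id: str
    verified: bool
    sanction_count: int = 0  # running record of norm violations reported against this agent

registry = {}  # maintained centrally, e.g. by a government or standards body

def register(agent_id):
    registry[agent_id] = AgentRecord(agent_id, verified=True)

def willing_to_interact(agent_id, max_sanctions=2):
    record = registry.get(agent_id)
    # Refuse interaction with unverified agents or known repeat norm-breakers.
    return record is not None and record.verified and record.sanction_count <= max_sanctions

register("agent_123")
print(willing_to_interact("agent_123"), willing_to_interact("unknown_agent"))  # True False
```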
The final resource of this section is a short talk on why agent infrastructure in general, and agent IDs in particular, could be important for AI safety. This talk takes a broader safety perspective, not only focusing on multi-agent safety. If you want to go more in depth, we also include a paper by the same author as an optional resource.
There are three main types of approaches to agent training and deployment, the terminology for which stems from multi-agent reinforcement learning: centralised training and execution, centralised training for decentralised execution, and decentralised training and execution. (For an explanation of these categories, we would recommend page 2 of An Introduction to Centralized Training for Decentralized Execution in Cooperative Multi-Agent Reinforcement Learning.) You were introduced to various training schemes in the last chapter, which fall predominantly into the latter two categories. What are the key technical challenges in setting up truly centralised training programs among AI agents produced by multiple different companies?
