
Game Theory

Felipe Garrido-Lucero

Université Paris Dauphine-PSL

The course consists of 12 lectures plus 12 practical sessions of 1h30 each.

Program
1. Introduction to Game Theory:

   • Strategic interactions and applications of game theory
   • Notations and basic concepts

2. Non-cooperative games: Zero-sum games

   • Mixed extension
   • Von Neumann minmax theorem

3. Non-cooperative games: Non zero-sum games

   • Nash equilibrium
   • Complexity aspects of Nash equilibria

4. Potential games

   • Best-reply dynamics
   • Congestion games
   • Price of anarchy and price of stability in congestion games

5. Repeated games

References
• Mathematical Foundations of Game Theory - Laraki, Renault, and Sorin [Book]

• Algorithmic Game Theory - Nisan, Roughgarden, Tardos, and Vazirani [Book]

• Strategy and Game Theory - Muñoz-García and Toro-González [Book, exercises]

• Nash and Correlated Equilibria - Gilboa and Zemel [Paper, Nash equilibrium complexity]

1 Introduction to game theory
Game theory models situations where several entities (players) make choices and where this
collection of individual behaviors induces an outcome that affects all of them.
Example: housemates deciding when to wake up in the morning to use the shower.
Since agents may evaluate these outcomes differently, game theory is also the study of rational
behavior within this framework.

1.1 A bit of history


The foundations of game theory start in 1912 with Zermelo, who studied finite games with
perfect information. Years later, in 1921, Borel continued with the study of strategies and
information. Finally, in 1928 von Neumann proved his minmax theorem.
From this point on, many contributions followed, most notably the work of Nash in
1950, who defined the notion of "strategic equilibrium" and proved its existence.
Since then, the theory of games has developed in several directions:

• Games with infinite action spaces
• Continuum of players
• Dynamic games
• Differential games
• Games with incomplete information
• Stochastic games

In this course, besides studying the mathematical foundations, we will work with the first
three directions mentioned above.
Game theory has been successfully applied to numerous disciplines beyond mathematics: economics,
biology, computer science, operations research, political science, etc. In particular,
several game theorists have received the "Nobel prize in economics". Regarding computer
science, the field of algorithmic game theory has developed widely in recent years, concerned
mainly with the complexity of finding and/or implementing an equilibrium.
A game can usually be described in two ways: the extensive form and the normal form. The
first consists in stating explicitly the "rules of the game", i.e. which player plays
and when, which information she receives during the play, which set of choices she has at her
disposal, and what the consequences of a play are up to a terminal outcome. The second is
a more abstract way of defining a game: it consists in introducing the set of strategies of each
player and a mapping that associates to every profile of strategies the corresponding outcome,
which determines a payoff for each player.
In any case, once a game is defined, we will focus on two aspects:

1. The possible solutions of a game, such as equilibria or best replies.

2. The properties of these solutions.

1.2 Examples
We start by reviewing some classical examples of game theory.

1.2.1 Stable matchings
Consider two sets of agents I and J (students/universities, doctors/hospitals), both of the same
size. In addition, each agent has a strict order of preferences over the agents of the other set.
The problem is to find a matching µ, an assignment (bijection) between the agents of both sets,
such that:

∄ (i, j) ∈ I × J, not matched together, such that j >_i µ(i) and i >_j µ(j)

where >_i denotes the preference order of agent i. Notice that this condition does not say that
every agent is with his most preferred option; rather, every time an agent a prefers b over his
partner in the matching, b does not prefer a, as b is matched with somebody he prefers more.
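The text above does not say how to find such a matching; the classical method (not stated in the section) is the Gale-Shapley deferred acceptance algorithm. A minimal Python sketch, with illustrative agent names and preference lists:

```python
def gale_shapley(prefs_I, prefs_J):
    """Deferred acceptance: agents of I propose, agents of J accept or reject.

    prefs_I[i] is i's strict preference list over J (most preferred first),
    prefs_J[j] likewise over I. Returns a stable matching as a dict i -> j.
    """
    rank = {j: {i: r for r, i in enumerate(p)} for j, p in prefs_J.items()}
    nxt = {i: 0 for i in prefs_I}    # index of i's next agent to propose to
    partner = {}                     # j -> current provisional partner i
    free = list(prefs_I)
    while free:
        i = free.pop()
        j = prefs_I[i][nxt[i]]       # i proposes to his best remaining option
        nxt[i] += 1
        if j not in partner:
            partner[j] = i           # j provisionally accepts
        elif rank[j][i] < rank[j][partner[j]]:
            free.append(partner[j])  # j upgrades: the old partner is free again
            partner[j] = i
        else:
            free.append(i)           # j rejects i
    return {i: j for j, i in partner.items()}

# Toy instance: the unique stable matching here is i1-j2, i2-j1
prefs_I = {"i1": ["j1", "j2"], "i2": ["j1", "j2"]}
prefs_J = {"j1": ["i2", "i1"], "j2": ["i1", "i2"]}
mu = gale_shapley(prefs_I, prefs_J)
```

One can check by hand that the output is stable: i1 prefers j1 over his partner j2, but j1 does not prefer i1.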

1.2.2 Bargaining problem


Imagine a cake C = [0, 1] of volume 1 to be shared between two friends. There are many
rules we can define for splitting the cake:

1. One of the players cuts the cake, and the second player chooses one of the pieces.
2. The first player offers a piece of cake to the second one. If the second player rejects the offer,
both players win 0 and the cake is lost.
3. A third person starts to move a knife from 0 to 1. The first player to say "stop" gets the piece
from 0 up to the point the knife reached, and the second player obtains the rest of the cake.

1.2.3 Auctions
A house (or any indivisible object) is proposed for auction. n players want the object, each of
them with a different valuation vi ∈ R. If agent i gets the object at price p, his payoff is vi − p.
Again, there are many rules for defining the auction:

1. The price of the good starts to decrease on a screen. The first player who says "stop" gets
the good and pays the current price.
2. Agents start to announce increasing prices. The one who proposes the highest price gets
the good and pays the announced price.
3. Players write down on paper the bid (the price) they offer for the good. The player who makes
the highest bid is the winner. Here, we can split into two subcases:
3.1. The winner pays his own bid.
3.2. The winner pays the second highest bid.

1.2.4 Condorcet’s paradox


Three players a, b, c have to vote for one of three candidates A, B, C. Agent a ranks the candidates
A > B > C, agent b ranks them B > C > A, and agent c ranks them C > A > B. Then, if all
players vote honestly, we cannot find a majority winner: a majority prefers A to B, a majority
prefers B to C, and yet a majority prefers C to A.

1.3 Games in strategic form


Normally a game will consist of a set of players N := {1, ..., n} (finitely many players), each
player i ∈ N being endowed with a set of strategies Si.

Definition 1. We define the set of strategy profiles as S := ∏_{i∈N} Si = S1 × ... × Sn. A common
notation for s ∈ S will be s = (s1, ..., sn) = (si, s−i), where s−i := (sj)_{j≠i}.

In addition, each player i is given a payoff function gi : S → R, which depends on the strategies
of all players. We will use the notation gi(s) := gi(si, s−i). With this, a game G will be denoted
G := (N, (Si)_{i∈N}, (gi)_{i∈N}).
Intuitively, in a game all players choose an action simultaneously (this does not mean that
players choose at the same time, but that no player knows the actions chosen by the others when
choosing his own), each seeking to maximize his payoff. Once s ∈ S is materialized, each player i
gets gi(s).
We make the assumption that all players know the game (or its rules), meaning that players
play a game with complete information.

Example 1 (Rock, paper, scissors). There are two players with S1 = S2 = {R, P, S}. We
represent the game with a matrix.

R P S
R 0,0 -1,1 1,-1
P 1,-1 0,0 -1,1
S -1,1 1,-1 0,0

Table 1: Rock, paper, scissors game

A matrix like the one above is called a payoff matrix. Player 1 is represented by the rows, while
player 2 is represented by the columns. In every cell, the first number represents the payoff of
player 1, g1(s), while the second number corresponds to the payoff of player 2, g2(s).
Therefore, if player 1 plays rock and player 2 plays paper, the payoff is respectively (−1, 1),
meaning that player 2 wins.

Example 2 (Prisoner's dilemma). Two prisoners are interrogated separately about a robbery.
Each of them has the option of accusing his partner (A) or remaining silent (S). The following
payoff matrix shows the reduction in years of imprisonment that each prisoner may obtain.

A S
A 1,1 4,0
S 0,4 3,3

Table 2: Prisoner's dilemma

Each prisoner has an incentive to accuse his partner since, independently of what the other one
does, accusing decreases his sentence the most. As both reason the same way, they mutually accuse
each other, decreasing their sentences by only one year. Could they do better? What prevents
them from obtaining a lower sentence?

Definition 2. Let G = (N, (Si)_{i∈N}, (gi)_{i∈N}) be a game. We say that a strategy s*_i ∈ Si is

• a strictly dominant strategy for player i if for any si ≠ s*_i and any s−i, gi(s*_i, s−i) > gi(si, s−i).
In words, s*_i is a strictly dominant strategy if, independently of what the other players play,
i always obtains a higher payoff than by playing anything else.
• a weakly dominant strategy for player i if for any si ≠ s*_i and any s−i, gi(s*_i, s−i) ≥ gi(si, s−i).
• equivalent to si if gi(s*_i, s−i) = gi(si, s−i) for any s−i.

Can we find strictly or weakly dominant strategies for a player in our two previous examples?
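For finite games given by a payoff matrix, checking Definition 2 is a direct search. A small Python sketch over the prisoner's dilemma of Table 2 (the dictionary encoding of g1 is ours):

```python
def strictly_dominant(payoff, strategies, opp_strategies):
    """Return the row player's strictly dominant strategy, or None.

    payoff[(s, t)] is her payoff when she plays s and the opponent plays t.
    """
    for s in strategies:
        if all(payoff[(s, t)] > payoff[(s2, t)]
               for t in opp_strategies for s2 in strategies if s2 != s):
            return s
    return None

# Prisoner's dilemma (Table 2), row player's payoffs: A = accuse, S = silent
g1 = {("A", "A"): 1, ("A", "S"): 4, ("S", "A"): 0, ("S", "S"): 3}
best = strictly_dominant(g1, ["A", "S"], ["A", "S"])   # "A" is strictly dominant
```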

Definition 3. A strategy profile s ∈ S is Pareto optimal if there is no s′ ∈ S such that
gi(s′) ≥ gi(s) ∀i ∈ N and ∃i ∈ N, gi(s′) > gi(s). When such an s′ exists, we say that s′ Pareto
dominates s.

The intuition of a Pareto optimal strategy profile is that we cannot strictly increase the payoff of
a player without decreasing the payoff of another player.
Can we find a Pareto optimal strategy for the prisoner’s dilemma ? Consider the graphical
representation of the payoff matrix.
Figure 1: Prisoners' dilemma payoff profiles: the four payoff vectors (1, 1), (4, 0), (0, 4) and
(3, 3), plotted in the (g1, g2) plane.

We see that (A, A) is Pareto-dominated by (S, S), as both players strictly increase their payoff.
However, (A, S) and (S, A) are not Pareto-dominated, as any change of profile decreases the
payoff of one of the players.

Example 3 (Second price auction). Consider n players (bidders) and one good for sale. Each
bidder i has a personal value for the good vi ≥ 0. If i buys the good at price p, his payoff is
vi − p. The rules of the game are:

1. Each player i chooses a bid bi ≥ 0 simultaneously.


2. The player with the highest bid wins the auction and gets the good at price p := max{bj :
j ̸= i}, that is, the second highest bid.

We can formally describe the payoff function gi of player i. Writing p := max_{j≠i} bj for the
highest bid among i's opponents,

gi(b1, b2, ..., bn) :=  0            if bi < p
                        vi − p       if bi > p
                        (vi − p)/|K| if bi = p = max_j bj, where K := {k : bk = max_j bj}

The third case corresponds to a tie, where the winner is chosen uniformly at random among the
highest bidders. What should the players bid?
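The three cases of gi can be implemented directly; a sketch, where a tie at the top is valued by its expected payoff (vi − p)/|K|:

```python
def second_price_payoffs(bids, values):
    """Payoff of every bidder in a second-price auction.

    bids[i] and values[i] are bidder i's bid and private valuation. A tie at
    the top is valued by its expectation: each top bidder gets (v_i - p)/|K|.
    """
    top = max(bids)
    K = [i for i, b in enumerate(bids) if b == top]       # highest bidders
    payoffs = [0.0] * len(bids)                           # losers get 0
    for i in K:
        # price paid by i = highest bid among the others (second highest bid)
        p = max(b for j, b in enumerate(bids) if j != i)
        payoffs[i] = (values[i] - p) / len(K)
    return payoffs

payoffs = second_price_payoffs([10, 7, 3], [10, 8, 5])   # bidder 0 wins, pays 7
```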

Theorem 1 (Vickrey). The strategy b∗i = vi is a weakly dominant strategy for player i.

Proof. Let vi be the valuation of player i for the good and let p := max_{j≠i} bj. Let us split
the study into cases:

• Suppose vi > p. Then player i wins by bidding bi = vi, obtaining vi − p > 0. Notice that:
  - if bi ∈ (0, p), i wins 0,
  - if bi ∈ (p, vi), i wins vi − p,
  - if bi > vi, i wins vi − p.
• Suppose vi < p. Then player i loses by bidding bi = vi, obtaining 0. Notice that:
  - if bi ∈ (0, p), i wins 0,
  - if bi ∈ (vi, p), i wins 0,
  - if bi > p, i wins vi − p < 0.

In every case, player i has no incentive to bid differently from vi. We conclude that
bi = vi is a weakly dominant strategy.

Example 4. Consider a game with n players. Each of them can choose any number si ∈ [0, 100].
Let s̄ := (1/n) ∑_{i=1}^n si. The winner of the game is the player closest to (2/3)s̄. What should we choose?

The iterated deletion of strictly dominated strategies (IDSDS) method, as its name says, consists in
eliminating one by one all the strategies that are strictly dominated, until arriving at a game
without strictly dominated strategies. The previous example can be solved by the IDSDS
method.

Definition 4. A game G is dominance-solvable if G^(∞), the last game obtained by the iterated
deletion of strictly dominated strategies, is trivial, i.e. S_i^(∞) = {s*_i} for any player i or g is
constant on S^(∞).

Remark 1. The outcome of IDSDS is independent of the order of deletion, unlike the iterated
deletion of weakly dominated strategies.
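The IDSDS method can be sketched in Python for two-player finite games. For simplicity this version only checks domination by pure strategies (the general method also allows domination by mixed strategies); on the prisoner's dilemma it already solves the game:

```python
def idsds(S1, S2, g1, g2):
    """Iterated deletion of strictly dominated pure strategies.

    g1[(s, t)], g2[(s, t)] are the payoffs of players 1 and 2 at profile (s, t).
    Returns the surviving strategy sets.
    """
    S1, S2 = list(S1), list(S2)
    changed = True
    while changed:
        changed = False
        for s in S1[:]:          # delete s if some s2 strictly dominates it
            if any(all(g1[(s2, t)] > g1[(s, t)] for t in S2)
                   for s2 in S1 if s2 != s):
                S1.remove(s); changed = True
        for t in S2[:]:          # same deletion for player 2
            if any(all(g2[(s, t2)] > g2[(s, t)] for s in S1)
                   for t2 in S2 if t2 != t):
                S2.remove(t); changed = True
    return S1, S2

# Prisoner's dilemma: accusing survives for both players
g1 = {("A", "A"): 1, ("A", "S"): 4, ("S", "A"): 0, ("S", "S"): 3}
g2 = {(s, t): g1[(t, s)] for s in "AS" for t in "AS"}
surviving = idsds(["A", "S"], ["A", "S"], g1, g2)   # (["A"], ["A"])
```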

2 Non-cooperative games: Zero-sum games
As a first example of non-cooperative games, we will study zero-sum games. In this setting, the
players have opposite evaluations of outcomes. Formally, we work with only two players (n = 2),
each of them endowed with a set of actions S1 , S2 and a payoff function g1 , g2 : S1 × S2 → R,
respectively. In addition, the payoff functions satisfy,

g1(s1, s2) + g2(s1, s2) = 0, ∀(s1, s2) ∈ S1 × S2

In words, the payoff of a player always corresponds to minus the payoff of the other player. For
simplicity, we change notation: we write I = S1, J = S2 and we use only one
payoff function g = g1 = −g2 : I × J → R. We say that player 1 wants to maximize the payoff
of the game, while player 2 wants to minimize it.

Example 5. Rock, paper, scissors is a zero-sum game and its payoff matrix can be rewritten as,

R P S R P S
R 0,0 -1,1 1,-1 R 0 -1 1
P 1,-1 0,0 -1,1 P 1 0 -1
S -1,1 1,-1 0,0 S -1 1 0

Table 3: Rock, paper, scissors game. Table 4: Same game, reduced matrix.

As players in a zero-sum game are antagonists, we study the payoffs that each of
them can guarantee when playing, i.e. payoffs that each player can obtain independently of the
strategy of the opponent.

Definition 5. Player 1 guarantees w1 ∈ R if there exists i ∈ I such that for any j ∈ J,


g(i, j) ≥ w1 . Similarly, player 2 guarantees w2 ∈ R if there exists j ∈ J such that for any i ∈ I,
g(i, j) ≤ w2 .

An important case holds when both players can guarantee the same real value w.

Definition 6. Given a zero-sum game G = (I, J, g), we consider two important quantities:

• the maxmin of G: v̲(g) := sup_{i∈I} inf_{j∈J} g(i, j),
• the minmax of G: v̄(g) := inf_{j∈J} sup_{i∈I} g(i, j).

Remark that it always holds that v̲(g) ≤ v̄(g). The maxmin is the highest value that player 1 can
guarantee, while the minmax is the lowest value that player 2 can guarantee.

Example 6 (Matching Pennies). Consider the following game: two friends, each of them with
a penny, simultaneously show one of its faces. If the faces match, the second friend pays 1 euro
to the first friend; otherwise, the first friend pays 1 euro to the second. Taking the first friend as
the row player (the maximizer), the payoff matrix of this game can be written as,

H T
H 1 -1
T -1 1

Table 5: Matching Pennies

Compute the maxmin and minmax of this game.

Definition 7. The game G has a value if v̲(g) = v̄(g); in that case the common quantity is denoted v(g).

In the Matching Pennies example, as v̲ = −1 < 1 = v̄, the game does not have a value (however,
we will see later that the value exists when players play mixed strategies).

Remark 2. If both players can guarantee the same value w, then w is the value of the game.
In addition, the value of a game is unique, so if both players can guarantee two values w1, w2,
it must hold that w1 = w2.
Definition 8. Let G = (I, J, g) be a zero-sum game with a value v(g). We say that:

• player 1 has an optimal strategy i* ∈ I if i* guarantees the value v(g),
• player 2 has an optimal strategy j* ∈ J if j* guarantees the value v(g).

Example 7. Consider the zero-sum game G = (ℕ, ℕ, g), where g(i, j) = 1/(1 + i + j). The
value of the game is v = 0; any strategy of player 1 is an optimal strategy, while player 2 does
not have any optimal strategy.
Remark 3. The existence of a value without optimal strategies is due to the presence of the
supremum and infimum in the definitions of v̲ and v̄.

2.1 Mixed extension of zero-sum games


To continue the study of the value of a zero-sum game, we consider the case of finite zero-sum
games. A finite game is any game with a finite number of players, each of them with a finite
set of strategies. A finite zero-sum game satisfies, in addition, g1 = −g2. An important
property of finite games is that we can represent the payoffs of the players by a payoff
matrix.
Formally, suppose we have two players with strategy sets I := {1, 2, ..., I} and J := {1, 2, ..., J}
respectively, and g1 = −g2 = g. We define the payoff matrix A ∈ R^{|I|×|J|} by

Ai,j = g(i, j), ∀(i, j) ∈ I × J

Therefore, if player 1 chooses i ∈ I and player 2 chooses j ∈ J, the outcome of the game is Ai,j.
In a game, it is natural to allow the players to randomize, that is, to choose
their next strategy from their set of strategies in a random way. For example, in Matching
Pennies we may want to show each face of the coin with probability 1/2, hiding the possible
result from the opponent. Randomizing over the set of strategies is known as playing
mixed strategies.
Definition 9. Let I, J be the strategy sets of the players. We define the sets of
mixed strategies over I and J as the sets of all probability distributions over I and
J respectively. Formally,

∆(I) := { x ∈ R^I : xi ≥ 0 ∀i ∈ I, ∑_{i∈I} xi = 1 },

∆(J) := { y ∈ R^J : yj ≥ 0 ∀j ∈ J, ∑_{j∈J} yj = 1 }

The sets ∆(I), ∆(J) are known as the simplices of I and J, respectively. The mixed extension
of a finite game G = (I, J, g) is thus the game Γ = (∆(I), ∆(J), g), where the payoff function
corresponds to the expected payoff of the players,

g(x, y) := ∑_{(i,j)∈I×J} xi Ai,j yj, ∀(x, y) ∈ ∆(I) × ∆(J)

For simplicity, we will write the mixed extension of the payoff function g simply as g.

8
Example 8. Let us come back to the Matching Pennies game. Assuming that player 1 picks
heads with probability x1 and tails with probability x2 , such that x1 +x2 = 1, and similarly player
2 picks heads and tails with probabilities y1 and y2 respectively, the payoff function becomes,
g(x, y) = 1 · x1 y1 + (−1) · x1 y2 + (−1) · x2 y1 + 1 · x2 y2
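This expected payoff is straightforward to evaluate; a small pure-Python sketch for Matching Pennies:

```python
def expected_payoff(A, x, y):
    """Mixed extension: g(x, y) = sum over i, j of x_i * A[i][j] * y_j."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

A = [[1, -1],          # Matching Pennies, payoffs of player 1
     [-1, 1]]
g_mixed = expected_payoff(A, [0.5, 0.5], [0.5, 0.5])   # uniform mixing gives 0
g_pure = expected_payoff(A, [1, 0], [0, 1])            # (heads, tails) gives -1
```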
There are two important remarks to make about the mixed extension of a finite game.

1. The expected payoff can be written in matrix form: for (x, y) ∈ ∆(I) × ∆(J),

g(x, y) = ∑_{i∈I, j∈J} xi Ai,j yj = [x1, ..., xI] · A · (y1, ..., yJ)^T = x^T A y

2. Given a pure strategy i ∈ I, we can identify i ≈ δi ≈ e⃗i, the i-th canonical vector. Thus, if
x ∈ ∆(I), then x = ∑_{i∈I} xi e⃗i with ∑_{i∈I} xi = 1. This tells us that the simplex ∆(I) is the
convex hull of the canonical basis of R^I.

Figure 2: The simplex ∆(I) for |I| = 3: the triangle in R³ with vertices (1, 0, 0), (0, 1, 0)
and (0, 0, 1).

2.2 Minmax Theorem


We are ready to state Von Neumann's theorem.

Theorem 2 (Von Neumann minmax). Let G = (∆(I), ∆(J), g) be a finite zero-sum game in mixed
strategies. Then, there exists (x*, y*, v) ∈ ∆(I) × ∆(J) × R such that

x*^T A y ≥ v, ∀y ∈ ∆(J)   and   x^T A y* ≤ v, ∀x ∈ ∆(I)

In words, any finite zero-sum game has a value in mixed strategies and both players have optimal
strategies. The value v is uniquely determined and corresponds to the value of the game,

v = max_{x∈∆(I)} min_{y∈∆(J)} x^T A y = min_{y∈∆(J)} max_{x∈∆(I)} x^T A y

In addition to the minmax theorem, the following result holds.

Proposition 1 (Indifference principle). Let (x̄, ȳ) be optimal strategies and v be the value of the
game. Then,

∀i ∈ I, x̄i > 0 ⟹ e⃗i^T A ȳ = v,
∀j ∈ J, ȳj > 0 ⟹ x̄^T A e⃗j = v

In words, any pure strategy played with positive probability in equilibrium also achieves the value
of the game when the opponent keeps playing in equilibrium. Conversely, if a pure
strategy does not achieve the value of the game when the opponent plays in equilibrium, then
this pure strategy is played with probability zero.
For proving these two results we will use duality in linear programming.

2.2.1 Linear programming
A linear program is

(P)  min ∑_{i∈I} ci xi
     s.t. ∑_{i∈I} Ai,j xi ≥ bj, ∀j ∈ J
          xi ≥ 0, ∀i ∈ I

where A ∈ R^{|I|×|J|} is a matrix, and b ∈ R^{|J|}, c ∈ R^{|I|} are vectors, the three of them known and
fixed. The dual problem of (P) corresponds to another linear program (D) given by

(D)  max ∑_{j∈J} bj yj
     s.t. ∑_{j∈J} Ai,j yj ≤ ci, ∀i ∈ I
          yj ≥ 0, ∀j ∈ J
There are two important results to keep in mind about linear programming.

Proposition 2. Given a linear program (P), finding an optimal solution or declaring that the
problem is infeasible can be done in polynomial time.

Theorem 3 (Strong duality). If (P) and (D) are both feasible, then both have optimal solutions
and they share the same optimal value. In addition, if (x*, y*) is a pair of solutions of the
primal-dual problems, it holds that

∀i ∈ I : x*_i > 0 ⟹ ∑_{j∈J} Ai,j y*_j = ci,

∀j ∈ J : y*_j > 0 ⟹ ∑_{i∈I} Ai,j x*_i = bj

2.2.2 Proving Von Neumann's theorem

Let G = (∆(I), ∆(J), A) be a zero-sum game in mixed strategies with payoff matrix A. We will
prove that G has a value and optimal strategies by defining a suitable linear program and using
the strong duality theorem.
Before proving the theorem, we make a useful remark. Notice that

max_{x∈∆(I)} x^T A y = max_{i∈I} e⃗i^T A y    (1)

Indeed, as the function f(x) := x^T A y is linear and continuous in x, and the set ∆(I) is compact,
there always exists an extreme point of the feasible region that maximizes f. In addition,
as the extreme points of a simplex are the pure strategies, represented by the canonical vectors
e⃗i, the result holds. Analogously, we obtain

min_{y∈∆(J)} x^T A y = min_{j∈J} x^T A e⃗j    (2)

With this in mind, the minmax theorem can be rewritten as

max_{x∈∆(I)} min_{j∈J} x^T A e⃗j = min_{y∈∆(J)} max_{i∈I} e⃗i^T A y    (3)

Therefore, we focus on proving (3). First of all, we assume, without loss of generality, that A has
only positive entries. Indeed, we can add λ > 0 to all the entries of A so that they become
positive; this does not affect equality (3), as both sides are simply shifted by λ. Consider the
following linear program,

(P)  min ∑_{i∈I} xi
     s.t. ∑_{i∈I} Ai,j xi ≥ 1, ∀j ∈ J
          xi ≥ 0, ∀i ∈ I

Notice that we have taken b = (1)_{j∈J} and c = (1)_{i∈I}, that is, both vectors have a one in
every coordinate. The dual of (P) is given by

(D)  max ∑_{j∈J} yj
     s.t. ∑_{j∈J} Ai,j yj ≤ 1, ∀i ∈ I
          yj ≥ 0, ∀j ∈ J

Notice that (D) is feasible, as y ≡ 0 is a feasible solution. Similarly, as A has only positive
entries, (P) is feasible: any x ≫ 0 (a vector with every coordinate large enough) is a feasible
solution. Therefore, by the strong duality theorem, there exist optimal solutions (x*, y*) of (P)
and (D) respectively, and w ∈ R, such that

x* ≥ 0, y* ≥ 0,

∑_{i∈I} Ai,j x*_i ≥ 1, ∀j ∈ J,

∑_{j∈J} Ai,j y*_j ≤ 1, ∀i ∈ I,

∑_{i∈I} x*_i = ∑_{j∈J} y*_j = w

We claim that w > 0. First of all, w ≥ 0, as w equals the sum of the coordinates of x*, all of
them non-negative. Suppose w = 0; then ∑_{i∈I} x*_i = 0, and necessarily x* ≡ 0, as x* ≥ 0.
However, ∑_{i∈I} Ai,j x*_i ≥ 1 for all j ∈ J, and this cannot hold if x* ≡ 0. We obtain, then,
that w > 0. Consider next (x̄, ȳ) := (x*/w, y*/w), which satisfy

x̄ ≥ 0, ȳ ≥ 0, ∑_{i∈I} x̄i = ∑_{j∈J} ȳj = 1

In other words, x̄ and ȳ are probability distributions. We claim finally that these are optimal
strategies of the players and thus they achieve the value of the game. Indeed, notice that

x̄^T A e⃗j = ∑_{i∈I} x̄i Ai,j ≥ 1/w, ∀j ∈ J

therefore,

max_{x∈∆(I)} min_{j∈J} x^T A e⃗j ≥ min_{j∈J} x̄^T A e⃗j ≥ 1/w

Analogously, min_{y∈∆(J)} max_{i∈I} e⃗i^T A y ≤ max_{i∈I} e⃗i^T A ȳ ≤ 1/w. Thus, recalling (1) and (2),

max_{x∈∆(I)} min_{y∈∆(J)} x^T A y ≥ min_{y∈∆(J)} max_{x∈∆(I)} x^T A y

As the maxmin of G is always less than or equal to the minmax of G, we conclude the equality,
and the proof of the theorem is complete.
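The feasibility and normalization steps of the proof can be verified on a concrete instance. A sketch on Matching Pennies shifted by λ = 2, so that A has positive entries as the proof assumes; the LP solutions x*, y* below are computed by hand, not by a solver:

```python
from fractions import Fraction as F

# Matching Pennies shifted by λ = 2: all entries positive, the value 0 becomes 2
A = [[F(3), F(1)],
     [F(1), F(3)]]

# Hand-computed optimal solutions of (P) and (D) with b = c = (1, 1)
x_star = [F(1, 4), F(1, 4)]
y_star = [F(1, 4), F(1, 4)]
w = sum(x_star)                                   # common optimal value, 1/2

# Feasibility: sum_i A[i][j] x*_i >= 1 and sum_j A[i][j] y*_j <= 1
assert all(sum(A[i][j] * x_star[i] for i in range(2)) >= 1 for j in range(2))
assert all(sum(A[i][j] * y_star[j] for j in range(2)) <= 1 for i in range(2))

# Normalization: x*/w and y*/w are mixed strategies and the value is 1/w
x_bar = [xi / w for xi in x_star]                 # (1/2, 1/2)
y_bar = [yj / w for yj in y_star]
value = 1 / w                                     # 2, i.e. 0 after undoing the shift
assert sum(x_bar) == sum(y_bar) == 1
```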

Remark 4. Recalling that solving a linear program is polynomial, we conclude that solving a
finite zero-sum game, that is, computing its value and optimal strategies, can be done in
polynomial time.
Notice that for proving Von Neumann's theorem we have only used the first part of the strong
duality theorem. The proof of the indifference principle comes from the second part.

Proof (Indifference principle). The proof is a direct application of the second part of the strong
duality theorem. Let (x̄, ȳ) be optimal strategies, so they achieve the value v of the game. As we
did for the minmax theorem, we can assume without loss of generality that A has positive entries,
hence v > 0. We know that (x̄/v, ȳ/v) is a pair of solutions of the problems (P) and (D). Using
the second part of the strong duality theorem, it holds that

∀i ∈ I, (1/v)x̄i > 0 ⟹ ∑_{j∈J} Ai,j (1/v)ȳj = 1,

∀j ∈ J, (1/v)ȳj > 0 ⟹ ∑_{i∈I} Ai,j (1/v)x̄i = 1

As v > 0, we obtain

∀i ∈ I, x̄i > 0 ⟹ ∑_{j∈J} Ai,j ȳj = v,

∀j ∈ J, ȳj > 0 ⟹ ∑_{i∈I} Ai,j x̄i = v

which is exactly what we were looking for.

2.2.3 Applying what we have learned


Example 9 (Rock, paper, scissors). Let us recall the rock, paper, scissors game and
compute its value and optimal strategies.

R P S
R 0 -1 1
P 1 0 -1
S -1 1 0

Solving a zero-sum game with our current tools can be done in two ways: either we guess
the value of the game and use Proposition 1 to compute the optimal strategies, or
we guess the optimal strategies, compute the value of the game, and check that our guess was
correct. For a game like rock, paper, scissors, in which everything is symmetric, a good guess
is to propose the average of all payoffs as the value. In this case, we propose v = 0 as the value
of the game. Let us compute the optimal strategies. Let x = (x1, x2, x3) be an optimal strategy
for player 1 and y = (y1, y2, y3) be an optimal strategy for player 2. In particular, x and y are
probability distributions, so each coordinate is non-negative and they sum to one. Suppose
x1, x2, x3 > 0, so player 1 plays the three strategies with positive probability. From
Proposition 1 it follows that

x1 > 0 ⟹ 0 · y1 + (−1) · y2 + 1 · y3 = 0
x2 > 0 ⟹ 1 · y1 + 0 · y2 + (−1) · y3 = 0     ⟹ y1 = y2 = y3
x3 > 0 ⟹ (−1) · y1 + 1 · y2 + 0 · y3 = 0

Since y1 + y2 + y3 = 1, we conclude that y = (1/3, 1/3, 1/3). In particular, player 2 plays all her
strategies with positive probability, so we can now repeat the computations for player 1.
Analogously, we obtain x = (1/3, 1/3, 1/3). In words, both players play the three
strategies with the same probability. It makes sense to obtain a symmetric equilibrium in which
both agents play the same mixed strategy, as both of them face the same symmetric payoff matrix.

The pure strategies that are played with positive probability have a name.

Definition 10. Let x ∈ R^n_+ be a vector. We define the support of x as the set

supp(x) := {i ∈ {1, ..., n} : xi > 0}

that is, the set of all strictly positive coordinates of x.

As a final remark on zero-sum games, let us come back to the equilibrium of the rock, paper,
scissors game. Notice that, if (x̄, ȳ) is the equilibrium we computed,

g(R, ȳ) = g(P, ȳ) = g(S, ȳ)

g(x̄, R) = g(x̄, P) = g(x̄, S)

That is, if one of the players plays the equilibrium strategy, the opponent is indifferent among all
options in his support, as he always gets the same payoff. This is not a coincidence, as we will
see later in the course. However, if my opponent plays her optimal strategy, although I am
indifferent, the best thing to do is to also play my optimal strategy: otherwise, the opponent
could change her strategy and decrease my payoff. For example, if my opponent plays rock,
paper and scissors with probability 1/3 each, my payoff is the same whatever pure strategy I
play; but if I decide to always play rock, my opponent could notice it and switch to always
playing paper, decreasing my payoff. Therefore, the best response to an optimal strategy is to
play an optimal strategy.
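The indifference above is easy to check numerically; a small sketch computing e⃗i^T A ȳ for each pure strategy of player 1:

```python
# Rock, paper, scissors: payoff matrix of player 1 (rows and columns R, P, S)
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
y_bar = [1 / 3, 1 / 3, 1 / 3]          # player 2's optimal strategy

# Payoff of each pure strategy of player 1 against y_bar
payoffs = [sum(A[i][j] * y_bar[j] for j in range(3)) for i in range(3)]
# Every pure strategy in the support earns the value 0 against y_bar
```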

3 Non zero-sum games


Let G = (N, (Si)_{i∈N}, (gi)_{i∈N}) be a game with N = {1, ..., n} the set of players, each
of them having a set of strategies Si and a payoff function gi : ∏_{i∈N} Si → R. We introduce
some useful notation: we denote by S := ∏_{i∈N} Si the space of strategy profiles of all players,
by S−i := ∏_{j≠i} Sj the space of strategy profiles of all players but player i, and by
s−i := (sj)_{j≠i} ∈ S−i a strategy profile containing the action played by every player except
player i. Notice that any s ∈ S can be written as s = (si, s−i) for any player i ∈ N.

Example 10 (Coordination game). Consider the following payoff matrix,

F T
F 2,1 0,0
T 0,0 1,2

Table 6: Coordination game

Two friends try to decide between going to a football match or to the theater. The first friend
prefers the football match to the theater, while the second one prefers the theater. However,
independently of the chosen event, both prefer going together to the same place rather than
going to different ones.

To study this kind of games, we draw arrows indicating the improvement of the players’ payoffs.
In this case we obtain,

Figure 3: Coordination game, with arrows indicating profitable unilateral deviations between
cells of the payoff matrix.

An equilibrium will be any strategy profile in which no agent can improve his payoff by unilaterally
deviating. Seen in the matrix, an equilibrium is any cell that only has incoming
arrows. For our example, we find two equilibria: (F, F) and (T, T).

Definition 11. A strategy profile s* ∈ S = ∏_{i∈N} Si is a Nash equilibrium if for any i ∈ N,

gi(s*_i, s*_−i) = max_{si∈Si} gi(si, s*_−i)

Another way of expressing this condition is that for any i ∈ N,

gi(s*_i, s*_−i) ≥ gi(si, s*_−i), ∀si ∈ Si

In any case, no player has a profitable unilateral deviation, i.e.

∄ i ∈ N, ∄ si ∈ Si : gi(si, s*_−i) > gi(s*)

Remark 5. Some remarks about Nash equilibria:

1. In a two-player zero-sum game, a strategy profile (s*, t*) is a Nash equilibrium if and only if
it achieves the value of the game and s*, t* are optimal strategies for the players.
2. Eliminating strictly dominated strategies does not change the equilibria of the game.
3. Therefore, IDSDS does not change the equilibria of the game.

When the players of a game play a Nash equilibrium, each of them is maximizing his payoff given
the strategies played by the other players. We say that players are best replying to
the strategies of the other players.

Definition 12. Let i ∈ N be a player. We define his best reply function as

BRi : S−i = ∏_{i′≠i} Si′ → Si

such that for any s−i ∈ S−i it outputs

BRi(s−i) := argmax_{si∈Si} gi(si, s−i)

That is, given a strategy profile of all the players but i, i's best reply function outputs literally
the best replies that i can make to maximize his payoff.

Notice that BRi is a multifunction (also called a correspondence, or a point-to-set mapping), as
it associates a subset of strategies of Si to each s−i. Using the best reply functions we can give
another characterization of Nash equilibria.

Definition 13. A strategy profile s∗ ∈ S is a Nash equilibrium if and only if,

∀i ∈ N, s∗i ∈ BRi (s∗−i )

14
Finding a Nash equilibrium in a payoff matrix amounts to checking the best reply of each agent
while fixing the strategies played by the other players. As we did for the coordination game, we
can draw lines expressing the best replies of each player. Let us see another example.

Example 11. Consider the following payoff matrix,

L M R
T 1,8 4,-1 -1,2
M 2,7 6,0 2,1
B 3,3 6,2 0,1

Suppose that player 2 plays L. The best reply of player 1 is to play B. Suppose that player 1
plays M. Then, the best reply of player 2 is to play L. Doing the same for all possible cases,
we can highlight each best reply of both players. If we find a cell of the matrix totally
highlighted, the corresponding strategy profile is a Nash equilibrium. In the example, we find
that (B, L) is an equilibrium.
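The highlighting procedure can be automated: a profile is a pure Nash equilibrium exactly when each player's strategy attains her best reply payoff. A sketch on the 3x3 game above:

```python
def pure_nash(S1, S2, g1, g2):
    """All pure Nash equilibria of a two-player game in strategic form."""
    equilibria = []
    for s in S1:
        for t in S2:
            best1 = max(g1[(s2, t)] for s2 in S1)   # player 1's best payoff vs t
            best2 = max(g2[(s, t2)] for t2 in S2)   # player 2's best payoff vs s
            if g1[(s, t)] == best1 and g2[(s, t)] == best2:
                equilibria.append((s, t))
    return equilibria

# The 3x3 game of Example 11
g1 = {("T", "L"): 1, ("T", "M"): 4, ("T", "R"): -1,
      ("M", "L"): 2, ("M", "M"): 6, ("M", "R"): 2,
      ("B", "L"): 3, ("B", "M"): 6, ("B", "R"): 0}
g2 = {("T", "L"): 8, ("T", "M"): -1, ("T", "R"): 2,
      ("M", "L"): 7, ("M", "M"): 0, ("M", "R"): 1,
      ("B", "L"): 3, ("B", "M"): 2, ("B", "R"): 1}
equilibria = pure_nash("TMB", "LMR", g1, g2)   # [("B", "L")]
```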

Example 12 (Cournot's game). In the 19th century, Cournot defined his duopoly competition
model. Consider two firms i ∈ {1, 2}, each choosing a quantity qi of a certain good to produce
(the same good for both). We assume that firms can produce up to a certain level a > 0 of the
good, so both firms have the same strategy set S1 = S2 = [0, a]. The cost of production is linear
in the produced quantity,

Ci(qi) = c qi,

where c > 0 (the same production cost for both firms). The market price p is also assumed to be
linear in the total production level,

p = max{a − (q1 + q2), 0}

Finally, once each firm i chooses its strategy (its level of production), its payoff is

gi(q1, q2) := p qi − Ci(qi) = max{a − (q1 + q2), 0} qi − c qi

We assume a > c (why?). Let us compute the Nash equilibrium of this game using the best
reply functions. Suppose firm 2 produces s2. Firm 1's best reply function corresponds to
(ignoring the maximum in the market price for now),

BR1(s2) = argmax_{s1∈[0,a]} {a s1 − (s1 + s2)s1 − c s1}

We differentiate and set the derivative equal to 0 (first order condition), obtaining

∂g1/∂s1 = 0 ⟺ a − 2s1 − s2 − c = 0 ⟺ s*_1 = (a − c − s2)/2   (best reply function of firm 1)

Since the firms have symmetric payoff functions, given s1 the best reply of firm 2 has to be

s*_2 = (a − c − s1)/2   (best reply function of firm 2)

We obtain, therefore, a system of two equations in two variables. Solving it, we find that the
Nash equilibrium of the game is

s*_1 = s*_2 = (a − c)/3

How can we see whether this game has another Nash equilibrium? Let us plot the best reply functions.

[Figure: the best reply lines BR1 and BR2 in the (s1 , s2 ) plane, crossing the axes at (a − c)/2
and a − c, with a unique intersection point at ((a − c)/3, (a − c)/3).]

Since the best reply functions have only one intersection point, we can conclude there is only
one Nash equilibrium in the Cournot competition.
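As a numerical illustration (with hypothetical values a = 10 and c = 1), iterating the two best reply functions converges quickly to the unique equilibrium (a − c)/3:

```python
# Best-reply iteration for the Cournot duopoly: repeatedly apply
# s1 <- (a - c - s2)/2 and s2 <- (a - c - s1)/2.
a, c = 10.0, 1.0
s1 = s2 = 0.0
for _ in range(100):
    s1 = (a - c - s2) / 2   # best reply of firm 1
    s2 = (a - c - s1) / 2   # best reply of firm 2

print(s1, s2)  # both ≈ (a - c)/3 = 3.0
```

Each update halves the distance to the fixed point, so the iteration converges geometrically to ((a − c)/3, (a − c)/3).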

3.1 Finite games in mixed strategies


As we did for zero-sum games, non-zero-sum games can also be extended to mixed strategies.
Consider a finite game G = (N, (Si )i∈N , (gi )i∈N ), in which each player has a finite number of
pure strategies. For every i ∈ N consider

Σi = ∆(Si ) = {σi ∈ R+^{|Si |} : Σ_{si ∈Si } σi (si ) = 1}

We define the extended game G = (N, (Σi )i∈N , (g i )i∈N ) where

g i (σ) = g i (σ1 , ..., σN ) = Eσ (gi ) := Σ_{s1 ∈S1 } · · · Σ_{sN ∈SN } σ1 (s1 ) · ... · σN (sN ) · gi (s1 , ..., sN )

The game G is called a finite game in mixed strategies. For simplicity, we keep writing gi , rather
than ḡi , for the mixed version of the payoff function of player i.

Example 13. Consider the following game in pure strategies,

L M R
T 1,4 0,4 2,6
B 3,2 6,1 5,2

The mixed strategy sets of each player are,

Σ1 = {σ = (σT , σB ) ∈ R2+ : σT + σB = 1},


Σ2 = {σ = (σL , σM , σR ) ∈ R3+ : σL + σM + σR = 1}

Given a mixed strategy profile (σ 1 , σ 2 ) ∈ Σ1 × Σ2 , the expected payoffs are given by

g1 (σ 1 , σ 2 ) = 1 · σT1 · σL2 + 0 · σT1 · σM2 + 2 · σT1 · σR2 + 3 · σB1 · σL2 + 6 · σB1 · σM2 + 5 · σB1 · σR2
g2 (σ 1 , σ 2 ) = 4 · σT1 · σL2 + 4 · σT1 · σM2 + 6 · σT1 · σR2 + 2 · σB1 · σL2 + 1 · σB1 · σM2 + 2 · σB1 · σR2
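The two expressions above are bilinear forms, so they can be evaluated directly once the payoff matrices are fixed; the mixed strategies below are arbitrary illustrative choices, not part of the example.

```python
# Payoff matrices of Example 13 (rows T, B; columns L, M, R).
A1 = [[1, 0, 2], [3, 6, 5]]   # player 1's payoffs
A2 = [[4, 4, 6], [2, 1, 2]]   # player 2's payoffs

def expected_payoff(M, s1, s2):
    """E[g] when player 1 mixes with s1 and player 2 with s2."""
    return sum(s1[i] * s2[j] * M[i][j]
               for i in range(len(s1)) for j in range(len(s2)))

s1 = [0.5, 0.5]            # (σT, σB), an arbitrary mixed strategy
s2 = [1/3, 1/3, 1/3]       # (σL, σM, σR), an arbitrary mixed strategy
print(expected_payoff(A1, s1, s2))  # ≈ 17/6
print(expected_payoff(A2, s1, s2))  # ≈ 19/6
```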

Definition 14. A strategy profile σ ∗ ∈ ∏_{i∈N } Σi is a Nash equilibrium in mixed strategies of the
game G if it is a Nash equilibrium of the extended game G. That is, if it satisfies

∀i ∈ N, gi (σi∗ , σ−i∗ ) ≥ gi (τi , σ−i∗ ), ∀τi ∈ Σi

Theorem 4 (Nash). A finite game always has a Nash equilibrium in mixed strategies.

The proof of Theorem 4 uses the following result, known as Brouwer’s fixed point theorem.

Theorem 5 (Brouwer). Let C be a non-empty, convex, and compact subset of a finite-dimensional
Euclidean space. Then, any continuous function f : C → C has a fixed point, that is, ∃x ∈ C :
f (x) = x.

We will not prove Brouwer’s theorem but only use it for proving Theorem 4.

Proof of Theorem 4. Let G = (N, ∆(Si ), gi ) be a finite game in mixed strategies. Consider ∆ =
∏_{i∈N } ∆(Si ) and define the gain function of player i ∈ N by Gaini (si , σ) = max{0, gi (si , σ−i ) −
gi (σ)}, that is, the payoff that player i gains when she deviates from σi to the pure strategy si ∈ Si .
Then, consider the following function f : ∆ → ∆,

f (σ)i (si ) = (σi (si ) + Gaini (si , σ)) / (1 + Σ_{ti ∈Si } Gaini (ti , σ)), ∀i ∈ N (4)

Let us explain the function f . It takes as argument a profile of probability distributions
σ = (σi )i∈N ∈ ∆. Since f maps into ∆, f (σ) is also a profile of probability distributions,
i.e. f (σ) = (f (σ)i )i∈N ∈ ∆. Equation (4) describes the probability that the distribution f (σ)i
assigns to the pure strategy si of player i.
Notice that f is well defined, as f (σ)i is truly a probability distribution for every i ∈ N . Indeed,
f (σ)i (si ) ≥ 0 for all si ∈ Si , and

Σ_{si ∈Si } f (σ)i (si ) = Σ_{si ∈Si } (σi (si ) + Gaini (si , σ)) / (1 + Σ_{ti ∈Si } Gaini (ti , σ))
= (Σ_{si ∈Si } σi (si ) + Σ_{si ∈Si } Gaini (si , σ)) / (1 + Σ_{ti ∈Si } Gaini (ti , σ))
= (1 + Σ_{si ∈Si } Gaini (si , σ)) / (1 + Σ_{ti ∈Si } Gaini (ti , σ)) = 1

Since ∆ is convex and compact and f is a continuous function, by Brouwer’s theorem there exists
σ ∗ ∈ ∆ such that f (σ ∗ ) = σ ∗ . We claim that σ ∗ is a Nash equilibrium. Fix a player i ∈ N .
We want to prove that for any si ∈ Si , Gaini (si , σ ∗ ) = 0, meaning that i is best replying
to σ−i∗ . Assume the gains are not all zero. Then, there exist i ∈ N and si ∈ Si
such that Gaini (si , σ ∗ ) > 0. Notice that

σ ∗ = f (σ ∗ ) ⇒ σi∗ (si ) = (σi∗ (si ) + Gaini (si , σ ∗ )) / (1 + Σ_{ti ∈Si } Gaini (ti , σ ∗ ))

⇒ σi∗ (si ) = Gaini (si , σ ∗ ) / Σ_{ti ∈Si } Gaini (ti , σ ∗ ) (5)

Finally, we prove the following equality, as it will be useful at the end:

σi∗ (si ) · (gi (si , σ−i∗ ) − gi (σ ∗ )) = σi∗ (si ) · Gaini (si , σ ∗ ), ∀si ∈ Si (6)

When Gaini (si , σ ∗ ) > 0, the equality holds by the definition of the function Gain.
Similarly, when Gaini (si , σ ∗ ) = 0, Equation (5) gives σi∗ (si ) = 0, and then both
sides of Equation (6) are equal to 0. Finally, we observe the following:

0 = gi (σ ∗ ) − gi (σ ∗ )
= Σ_{si ∈Si } σi∗ (si ) gi (si , σ−i∗ ) − gi (σ ∗ )
= Σ_{si ∈Si } σi∗ (si ) gi (si , σ−i∗ ) − gi (σ ∗ ) Σ_{si ∈Si } σi∗ (si )
= Σ_{si ∈Si } σi∗ (si ) [gi (si , σ−i∗ ) − gi (σ ∗ )]
= Σ_{si ∈Si } σi∗ (si ) Gaini (si , σ ∗ ) (by Equation (6))
= Σ_{si ∈Si } (σi∗ (si ))2 Σ_{ti ∈Si } Gaini (ti , σ ∗ ) > 0 (by Equation (5))

where the last inequality holds as both sums are strictly positive. This contradicts our
assumption that the gains are not all zero. We conclude that σ ∗ is a Nash
equilibrium.
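As a quick sanity check of the Gain function used in the proof, the following sketch (on hypothetical matching-pennies payoffs, not a game from the text) verifies that every pure-strategy gain vanishes at the mixed equilibrium ((1/2, 1/2), (1/2, 1/2)):

```python
# Matching-pennies payoffs, used here only as a test case.
A1 = [[1, -1], [-1, 1]]    # player 1's payoffs
A2 = [[-1, 1], [1, -1]]    # player 2's payoffs

def g(M, s1, s2):
    """Expected payoff under mixed strategies s1, s2."""
    return sum(s1[i] * s2[j] * M[i][j] for i in range(2) for j in range(2))

def gain1(a, s1, s2):
    """Gain_1(a, σ) = max{0, g1(a, σ2) - g1(σ)}: profit of deviating to pure action a."""
    e = [0.0, 0.0]
    e[a] = 1.0
    return max(0.0, g(A1, e, s2) - g(A1, s1, s2))

sigma1, sigma2 = [0.5, 0.5], [0.5, 0.5]   # the mixed equilibrium of matching pennies
print([gain1(a, sigma1, sigma2) for a in (0, 1)])  # [0.0, 0.0]
```

All gains being zero is exactly the fixed-point condition: at σ∗ the map f of Equation (4) leaves σ∗ unchanged.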

Example 14. Consider the following coordination game,

L R
T 2,1 0,0
B 0,0 1,3

We saw last time that (T, L) and (B, R) were both Nash equilibria. However, can we
find another one when considering the mixed extension of this game? Suppose that player 1
plays Top with probability x and Bottom with probability 1 − x, and that player 2 plays Left
with probability y and Right with probability 1 − y. Let us look at the best response functions
of the players,
players,

BR1 (y) = argmax_{x∈[0,1]} g1 (x, y) = argmax_{x∈[0,1]} {2xy + 1 · (1 − x)(1 − y)},
BR2 (x) = argmax_{y∈[0,1]} g2 (x, y) = argmax_{y∈[0,1]} {xy + 3(1 − x)(1 − y)}

The solution to the previous optimization problems depends on the strategy of the other player.
For BR1 , for example, notice that

BR1 (y) = argmax_{x∈[0,1]} {x[3y − 1] + (1 − y)}

and therefore the optimal x∗ depends on the sign of [3y − 1]. We obtain the following solution:

BR1 (y):
x∗ = 1 if 3y − 1 > 0 ⇔ y > 1/3
x∗ = 0 if y < 1/3
x∗ ∈ [0, 1] if y = 1/3

Analogously, for player 2 we can find that

BR2 (x):
y ∗ = 1 if x > 3/4
y ∗ = 0 if x < 3/4
y ∗ ∈ [0, 1] if x = 3/4

We can plot the best reply functions to find the Nash equilibria graphically.

[Figure: the best reply correspondences BR1 and BR2 in the unit square, intersecting at
(x, y) = (0, 0), (3/4, 1/3), and (1, 1).]

The two best reply functions have three intersection points, each of them being a Nash equilib-
rium. The points (x, y) = (0, 0) and (x, y) = (1, 1) are the two already known Nash equilibria in
pure strategies. However, we obtain a third one, (x, y) = (3/4, 1/3), which is a Nash equilibrium
in mixed strategies.
Finally, notice the following:
g2 (x∗ , L) = 1 · x∗ + 0 · (1 − x∗ ) = x∗
g2 (x∗ , R) = 0 · x∗ + 3 · (1 − x∗ ) = 3 · (1 − x∗ )
and g2 (x∗ , L) = g2 (x∗ , R) if and only if x∗ = 3/4. Therefore, if player 1 plays his optimal mixed
strategy x∗ , player 2 is indifferent about what to play, as any strategy (pure or mixed)
gives her the same utility. This remark always holds. Even more, we can use it for computing
Nash equilibria, as it imposes constraints on the strategies (the indifference principle!).
Let us apply the previous remark to the first coordination game we saw,

F T
F 2,1 0,0
T 0,0 1,2

Let x and y be the probability that player 1 and player 2 choose to go to the football, respectively.
We impose indifference on the payoffs,
g1 (F, y) = g1 (T, y) ⇔ 2y + 0 · (1 − y) = 0 · y + 1 · (1 − y) ⇔ y = 1/3
Analogously for player 2,
g2 (x, F ) = g2 (x, T ) ⇔ 1x + 0 · (1 − x) = 0 · x + 2 · (1 − x) ⇔ x = 2/3
We obtain that (x∗ , y ∗ ) = (2/3, 1/3) is also a Nash equilibrium, in this case in mixed strategies.
Intuitively, each player chooses his preferred option with higher probability, although
with some positive probability he also chooses the second option, as both players are better off
when choosing the same activity.
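For 2×2 games with a fully mixed equilibrium, the two indifference equations can be solved in closed form; the sketch below does so for the coordination game above (the closed-form expressions are a direct rearrangement of the indifference equations, not a library routine):

```python
# Payoffs of the coordination game (rows F, T; columns F, T).
A1 = [[2, 0], [0, 1]]   # player 1's payoffs
A2 = [[1, 0], [0, 2]]   # player 2's payoffs

# Indifference of player 1: g1(F, y) = g1(T, y), i.e.
#   A1[0][0]*y + A1[0][1]*(1-y) = A1[1][0]*y + A1[1][1]*(1-y); solve for y.
y = (A1[1][1] - A1[0][1]) / (A1[0][0] - A1[0][1] - A1[1][0] + A1[1][1])
# Indifference of player 2: g2(x, F) = g2(x, T); solve for x.
x = (A2[1][1] - A2[1][0]) / (A2[0][0] - A2[1][0] - A2[0][1] + A2[1][1])

print(x, y)   # ≈ 2/3 and 1/3, the mixed Nash equilibrium
```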
Definition 15. Let x ∈ Rn+ be a vector. We define the support of x as the set,
supp(x) := {i ∈ {1, ..., n} : xi > 0}
That is, the set of all the strictly positive coordinates of the vector x.
Let us summarize in the following scheme how to compute the Nash equilibria of a finite game
with two players.
Nash equilibria computation

1. Find the pure Nash equilibria of the game by underlining in each column and row the best
reply of each player and look for strategy profiles totally underlined.
2. Assume that agents play mixed strategies (x, y) ∈ ∆(S1 ) × ∆(S2 ) and impose the indifference
principle (being careful with the pure strategies that may not belong to the supports):
g1 (s, y) = g1 (t, y), ∀s, t ∈ S1 : xs , xt > 0
g2 (x, s) = g2 (x, t), ∀s, t ∈ S2 : ys , yt > 0
Solve the system of equations, taking care of computing coherent values for x and y. The
solutions correspond to the mixed Nash equilibria of the game.
3. An alternative method to steps 1 and 2 is to compute the best reply functions
BR1 (y) = argmax_{x∈∆(S1 )} g1 (x, y), BR2 (x) = argmax_{y∈∆(S2 )} g2 (x, y)
and plot them for obtaining all their intersection points. This also works for counting
the number of Nash equilibria.

To end the study of finite games, we state a last theorem, which will not be proved.
Theorem 6. Let G be a finite game and G be its mixed extension. The number of Nash
equilibria of G is “always” (that is, for generic payoffs) finite and odd.

3.2 Computational aspects of Nash equilibria


Let us spend some time on the computational aspects of Nash equilibria. We start by
briefly discussing the classes of polynomial and non-deterministic polynomial problems.

3.2.1 Classes P and NP


The Class P contains all the decision problems (problems asking to find a solution or to return
“no” if the instance is infeasible) that can be solved by a computer using a polynomial
amount of computation time, or polynomial time. Some notable problems in P are:
1. Linear programming,
2. Computing the greatest common divisor of two numbers,
3. Determining whether a number x is prime,
4. st-connectivity in a graph.
The Class NP, or non-deterministic polynomial time, contains all the decision problems for
which the problem’s instances, where the answer is “yes”, have proofs verifiable in polynomial
time by a computer. In particular, any problem in the class P belongs to the class N P .
Definition 16. A problem is said to be NP-hard if everything in N P can be transformed, in
polynomial time, into it. A problem is NP-complete if it is both in N P and N P -hard.
N P -complete problems represent the hardest problems in N P . If any of them has a polynomial
time algorithm, then all problems in N P do. The following list gives some well-known
N P -complete problems, stated as decision problems.
1. Boolean satisfiability problem
2. Knapsack problem
3. Hamiltonian path problem
4. Traveling salesman problem
5. Subgraph isomorphism problem
6. Subset sum problem
7. Clique problem
8. Vertex cover problem
9. Independent set problem
10. Dominating set problem
11. Graph coloring problem

There is often only a small difference between a problem in P and an N P -complete problem:
determining whether a graph can be colored with 2 colors is in P , but with 3 colors it is
N P -complete.
Concerning the computation of Nash equilibria, this problem is combinatorial by nature.
Indeed, the indifference principle tells us that a mixed strategy profile is a Nash equilibrium
if and only if all pure strategies in its support are best responses. This result reveals the subtle
nature of a mixed Nash equilibrium: players combine pure best-response strategies when play-
ing in equilibrium. Finding a Nash equilibrium means, therefore, finding the right supports:
once this is done, the precise mixed strategies can be computed by solving a system of algebraic
equations. However, given a set of pure strategies for each player, the number of possible
supports of a mixed strategy profile is exponential.
Most of the algorithms proposed over the past half century for finding Nash equilibria are combi-
natorial in nature and work by seeking supports. Unfortunately, none of them is known to be
efficient, that is, to always succeed after only a polynomial number of steps.
Is the Nash equilibrium problem NP-complete? The difficulty of finding a Nash equilibrium suggests
that the problem may be N P -complete. However, this is not the appropriate
class for it, as the existence of a solution is always guaranteed thanks to Nash’s theorem. For this
reason, in 1994, Christos Papadimitriou defined a new class of computational problems: the PPAD class.

We can easily obtain N P -complete problems by slightly changing the problem: given a two-
player game in strategic form, does it have:

1. at least two Nash equilibria?


2. a Nash equilibrium in which player 1 has utility at least α?
3. a Nash equilibrium in which the two players have total utility at least α?
4. a Nash equilibrium with support of size greater than k?
5. a Nash equilibrium whose support contains the strategy s?
6. a Nash equilibrium whose support does not contain the strategy s?
7. etc...

Let us prove (3) by reducing from the clique problem: given an undirected graph Gr = (V, E) and
an integer k, does there exist a clique of size k in Gr? That is, does there exist V ′ ⊆ V with |V ′ | = k
such that (i, j) ∈ E for all i, j ∈ V ′ , i ̸= j?
Let Gr = (V, E) be a graph with V = {1, ..., n} and let k be a natural number. Let ε = 1/nk,
M = nk 2 , r = 1 + ε/k, and consider the following two-player game G,

g1 ((1, i), (1, j)) = g2 ((1, i), (1, j)) = 1 + ε if i = j; 1 if i ̸= j and {i, j} ∈ E; 0 otherwise
g1 ((2, i), (1, j)) = g2 ((1, i), (2, j)) = k if i = j; 0 if i ̸= j
g1 ((1, i), (2, j)) = g2 ((2, i), (1, j)) = −M if i = j; 0 if i ̸= j
g1 ((2, i), (2, j)) = g2 ((2, i), (2, j)) = 0

Considering ei,j = 1{(i,j)∈E} , the following matrix expresses the payoffs of this game.

P1 \ P2 | (1, j) | (2, j)
(1, i) | (1 + ε, 1 + ε) if i = j, (eij , eij ) if i ̸= j | (−M, k) if i = j, (0, 0) if i ̸= j
(2, i) | (k, −M ) if i = j, (0, 0) if i ̸= j | (0, 0)

Theorem 7. G has a Nash equilibrium with expected payoff of at least r for both players if and
only if Gr has a clique of size k.

4 Potential games
We have seen that any finite game has a Nash equilibrium in mixed strategies. The question of
which games always have a Nash equilibrium in pure strategies remains open. We will study
the case of potential games.
Potential games are a special class of N -player games in which players’ unilateral deviations can
be tracked by a single function.
Definition 17. A game G = (N, (Si )i∈N , (gi )i∈N ) is a potential game if there exists a function
ϕ : S → R such that,

gi (si , s−i ) − gi (ti , s−i ) = ϕ(si , s−i ) − ϕ(ti , s−i ), ∀i ∈ N, ∀s ∈ S, ∀ti ∈ Si

The function ϕ is called a potential function. In the particular case that Si = R and gi is
a differentiable function over si , for every player, we can express the property of a potential
function by,
∂gi ∂ϕ
= , ∀i ∈ N
∂si ∂si
Example 15. Consider the following Cournot game,

gi (s1 , ..., sn ) = (A − Σ_{j=1}^{n} sj ) si − Csi = Asi − s2i − si Σ_{j̸=i} sj − Csi

where si is the production level of firm i and A, C ∈ R, with A > C. Take the function

ϕ(s1 , ..., sn ) = A Σ_{j=1}^{n} sj − Σ_{j=1}^{n} s2j − Σ_{i<j} si sj − C Σ_{j=1}^{n} sj

where each cross term si sj is counted once. Fixing s−i ∈ S−i , notice that

∂gi /∂si (si , s−i ) = A − 2si − Σ_{j̸=i} sj − C
∂ϕ/∂si (si , s−i ) = A − 2si − Σ_{j̸=i} sj − C

We conclude that the Cournot game is a potential game.
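A quick numerical check of the potential property (with hypothetical values A = 10, C = 1, and n = 3 firms, and with each cross term si sj counted once):

```python
import random

# For a random profile s and a unilateral deviation by one firm, the change
# in that firm's payoff must equal the change in the potential φ.
A, C, n = 10.0, 1.0, 3

def g(i, s):
    """Cournot payoff of firm i at profile s."""
    return (A - sum(s)) * s[i] - C * s[i]

def phi(s):
    """Candidate potential: each cross term s_i s_j appears once (i < j)."""
    cross = sum(s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    return sum((A - C) * sj - sj**2 for sj in s) - cross

random.seed(0)
s = [random.uniform(0, 3) for _ in range(n)]
t = s[:]
t[1] = random.uniform(0, 3)          # firm 1 deviates unilaterally
print(abs((g(1, t) - g(1, s)) - (phi(t) - phi(s))) < 1e-9)  # True
```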


The interest in potential games comes from the following result.
Theorem 8. Let G = (N, (Si )i∈N , (gi )i∈N ) be a potential game and ϕ its potential function.
Let s∗ ∈ argmaxs∈S ϕ(s). Then, s∗ is a Nash equilibrium of G.
Proof. Let s∗ be an element of argmaxs∈S ϕ(s). Suppose, for contradiction, that some player
i ∈ N has a profitable deviation ti ∈ Si , so gi (ti , s∗−i ) > gi (s∗ ). By definition of the potential function,

ϕ(ti , s∗−i ) − ϕ(s∗ ) = gi (ti , s∗−i ) − gi (s∗ ) > 0

However, this contradicts the fact that s∗ ∈ argmaxs∈S ϕ(s), as (ti , s∗−i ) ∈ S gets
a higher value of ϕ. Therefore, players cannot have profitable deviations and we conclude
that s∗ is a Nash equilibrium.

We conclude that potential games are a class of games in which a Nash equilibrium in pure
strategies always exists (assuming that ϕ attains a maximum).

4.1 Best-reply dynamic
The existence of Nash equilibria is not the only nice result obtained from having a potential
function. Indeed, it also gives a nice way of computing a Nash equilibrium. More precisely,
the presence of a potential function guarantees the convergence of the best-reply dynamic
to a Nash equilibrium.
Algorithm 1: Best-reply dynamic
1 Input: G = (N, (Si )i∈N , (gi )i∈N ) a N -player game, s ∈ S a strategy profile
2 repeat
3 for i ∈ N do
4 Fixing s−i compute ti ∈ argmaxsi ∈Si gi (si , s−i ) and replace si ← ti
5 until Convergence;
Intuitively, the best-reply dynamic consists in replacing the strategy of each player by his best
reply to the strategies played by the other players. In general, this dynamic is not guaranteed
to converge, since agents may cycle endlessly between strategy profiles. However,
this is not the case for potential games, as the value of the potential function increases with each
iteration. Let us prove this result.
Theorem 9. Let G = (N, (Si )i∈N , (gi )i∈N ) be a potential game with potential function ϕ.
Suppose that ϕ is bounded on ∏_{i∈N } Si and takes finitely many values (as in a finite game).
Then, the best-reply dynamic converges to a Nash equilibrium after finitely many iterations.
Proof. Let i ∈ N be a player and fix s−i . Let ti ∈ argmax_{si ∈Si } gi (si , s−i ) be his best reply. If
ti is different from si , the strategy currently played by i, then gi (ti , s−i ) > gi (s), implying that
ϕ(ti , s−i ) > ϕ(s), by the definition of the function ϕ. Therefore, at each iteration in which a player
changes his strategy while computing his best reply, the value of the potential function strictly
increases. Since ϕ takes finitely many values, after finitely many iterations the dynamic must stop. Let s∗
be the output of the dynamic. Since every player is best replying to the opponents’ strategies,
s∗ is a Nash equilibrium of G.
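A minimal sketch of the best-reply dynamic (Algorithm 1) on the 2×2 game of Example 14; here it converges in a couple of iterations (Theorem 9 guarantees convergence whenever the game admits a potential):

```python
# Payoffs of the coordination game of Example 14 (rows T, B; columns L, R).
A1 = [[2, 0], [0, 1]]   # player 1's payoffs
A2 = [[1, 0], [0, 3]]   # player 2's payoffs

def best_reply_dynamic(s1, s2, max_iter=100):
    """Players alternately switch to a pure best reply until nobody moves."""
    for _ in range(max_iter):
        t1 = max(range(2), key=lambda i: A1[i][s2])   # player 1 best-replies to s2
        t2 = max(range(2), key=lambda j: A2[t1][j])   # then player 2 best-replies to t1
        if (t1, t2) == (s1, s2):
            return (s1, s2)         # no profitable deviation: a Nash equilibrium
        s1, s2 = t1, t2
    return None                     # cycled without converging

print(best_reply_dynamic(0, 1))    # (1, 1), i.e. the equilibrium (B, R)
```

Starting from (T, R), a profile with payoff (0, 0), the dynamic reaches the pure equilibrium (B, R) and stops.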

4.2 Congestion games


Congestion games, sometimes called routing games, are a particular case of potential games
in which players choose routes for going from an origin point O to a destination point D. The set of
possible routes is called a routing network and is represented by a directed graph G = (V, E).
We identify two nodes: O, the origin node, which has only outgoing edges, and D, the
destination node, which has only incoming edges.


Figure 4: Routing network

For each arc e ∈ E we have an increasing cost function ce : R+ → R, i.e. the more players
use the edge e, the higher its cost. We can define the following N -player game:
ˆ N = {1, 2, ..., n} is the set of players,
ˆ For each i ∈ N , Si = {Paths from O to D} = {OD-Routes},

24
ˆ A strategy profile is s = (R1 , R2 , ..., Rn ), containing the routes chosen by each player. s can
be seen as the flow of players on the network,
ˆ Given a strategy profile s, we can compute the number of players that choose the route R as

fR (s) = |{i ∈ N : si = R}|

fR corresponds to the flow of players passing through the route R.

ˆ Given a strategy profile s and the flows per route that s induces, we can compute the flow
of players on each arc as

fe (s) = Σ_{R:e∈R} fR (s)

ˆ The total cost of player i is defined by

ci (s) = Σ_{e∈Ri } ce (fe (s))

ˆ Finally, the payoff function of player i is given by gi (s) = −ci (s). In words, players seek to
minimize their travel costs.
Example 16 (Pigou). Consider the following routing game with n players: two parallel arcs
connect O to D, the top one with constant cost cT (x) = 1 and the bottom one with cost
cB (x) = x/n.

At each arc the cost depends on the number x of players that choose it. Going through the top arc
has a fixed cost of 1 unit, while the cost of the bottom arc equals the proportion of
agents that choose it among the total number of agents n. We can find two equilibria:
1. All n players choose the bottom path, paying a cost of 1 unit. Indeed, nobody has an incentive
to change paths.
2. One player goes through the top path, while the n − 1 others go through the bottom path. The players
on the bottom do not have an incentive to change, as they would pass from paying (n − 1)/n to paying
1. Similarly, the player on the top path cannot change to the bottom and decrease his cost,
as the cost of the bottom path becomes 1 when he changes.
Let us focus on the total cost for a moment, that is, the sum of the costs of all players, C(s) =
Σ_{i∈N } ci (s). The first equilibrium s∗ has a total cost of C(s∗ ) = n. The second equilibrium t∗
has a total cost of C(t∗ ) = 1 + Σ_{i=1}^{n−1} (n − 1)/n = 1 + (n − 1)2 /n. In particular, we find
that C(t∗ ) < C(s∗ ).
Can we find another distribution of the players with an even lower total cost? Let k be the
number of players that take the bottom path and n − k the number that take the top path. The
total cost of this strategy profile s(k) is given by

C(s(k)) = (n − k) · 1 + k · (k/n) = n [1 − k/n + (k/n)2 ]

Minimizing C(s(k)) over k/n, we find that its minimum is achieved at k/n = 1/2, that is, half
of the players taking each path. Under this strategy profile the total cost obtained is equal to
C(s(n/2)) = 3n/4. Remark that both equilibria achieve strictly higher total costs.
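The computation above can be checked by brute force over all splits k (with a hypothetical n = 100):

```python
# Total cost C(s(k)) in the Pigou network when k of the n players take the
# bottom arc: (n - k) players pay 1 on top, k players pay k/n on the bottom.
n = 100

def total_cost(k):
    return (n - k) * 1 + k * (k / n)

costs = {k: total_cost(k) for k in range(n + 1)}
k_opt = min(costs, key=costs.get)
print(k_opt, costs[k_opt])               # 50 75.0, i.e. k = n/2 and C = 3n/4
print(total_cost(n), total_cost(n - 1))  # the two equilibria: 100.0 and 99.01
```

Both equilibrium costs (n and 1 + (n − 1)²/n) indeed exceed the social optimum 3n/4.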

The previous example reminds us of the discussions about the prisoner’s dilemma. Should we seek
the selfish optimum or what is best for the entire society?
Example 17. The following routing game is called Braess’s paradox. There are two parallel
OD-paths: the top path has a first arc with cost x/n followed by an arc with constant cost 1,
the bottom path has a first arc with constant cost 1 followed by an arc with cost x/n, and a
zero-cost arc connects the intermediate node of the top path to the intermediate node of the
bottom path.

[Figure: the Braess network, with the zero-cost arc in the middle.]

The equilibrium of this game is for every player to take the zig-zag path through the zero-cost
arc, which produces a total cost of 2n, as each player has a cost of 2. Imagine next that we
remove the zero-cost arc in the middle. The equilibrium changes to the situation in which half
of the players take the top path and the other half the bottom path. Consequently, each player
obtains a cost of 3/2 and the total cost becomes 3n/2. Contrary to intuition, removing the free
arc decreases the total cost when players move in equilibrium.
We have said that congestion games are potential games. Let us prove it formally.
Theorem 10. Let G = (N, V, E, (ce )e∈E ) be a congestion game. Then, G is a potential game.
Proof. It is enough to find a potential function for this game. Let ϕ : RE → R be given by

ϕ(f ) = Σ_{e∈E} Σ_{k=1}^{fe } ce (k)

where f ∈ RE is the flow on each arc e ∈ E, given by fe (s) = Σ_{R:e∈R} fR (s) with fR (s) = |{i ∈
N : si = R}|, and s is the strategy profile played by the N players. Fix the chosen routes
of all players and suppose that player i ∈ N changes from route R to route R′ . Let us compute
ci (f ′ ) − ci (f ) and ϕ(f ′ ) − ϕ(f ), where f is the flow before the change and f ′ the flow after the change.

ci (f ′ ) − ci (f ) = Σ_{e∈R∩R′ } 0 + Σ_{e∈R′ \R} ce (fe + 1) − Σ_{e∈R\R′ } ce (fe )

Regarding ϕ, notice that

ϕ(f ′ ) = Σ_{e∈R∩R′ } Σ_{k=1}^{fe } ce (k) + Σ_{e∈R′ \R} Σ_{k=1}^{fe +1} ce (k) + Σ_{e∈R\R′ } Σ_{k=1}^{fe −1} ce (k)
ϕ(f ) = Σ_{e∈R∩R′ } Σ_{k=1}^{fe } ce (k) + Σ_{e∈R′ \R} Σ_{k=1}^{fe } ce (k) + Σ_{e∈R\R′ } Σ_{k=1}^{fe } ce (k)

=⇒ ϕ(f ′ ) − ϕ(f ) = Σ_{e∈R∩R′ } 0 + Σ_{e∈R′ \R} ce (fe + 1) − Σ_{e∈R\R′ } ce (fe )

and we obtain that ci (f ′ ) − ci (f ) = ϕ(f ′ ) − ϕ(f ). (Arcs outside R ∪ R′ keep the same flow and
cancel in the difference.)

Corollary 1. Let G = (N, V, E, (ce )e∈E ) be a congestion game. Then, G always has a Nash
equilibrium in pure strategies.
Proof. Since G is a potential game for the costs (equivalently, −ϕ is a potential for the payoffs
gi = −ci ), any minimizer of ϕ over the set of strategy profiles is a Nash equilibrium. Since the
set of feasible flows (strategy profiles) is finite (there are finitely many ways of splitting the
players in the graph), ϕ always attains its minimum over S. We conclude that G always has a
Nash equilibrium in pure strategies.

4.3 Price of anarchy and price of stability in congestion games
Let us continue with the study of the total cost of a given flow in a routing game. Let G =
(N, V, E, (ce )e∈E ) be a congestion game and let s ∈ S be a strategy profile, which induces a
flow f . Previously we defined the total cost of f as

C(f ) := Σ_{i∈N } ci (s)

However, this expression can be rewritten as follows:

C(f ) = Σ_{i∈N } ci (s) = Σ_{i∈N } Σ_{e∈Ri } ce (fe (s))
= Σ_{e∈E} ce (fe (s)) · |{i ∈ N : e ∈ Ri }| (changing the order of the sums)
= Σ_{e∈E} ce (fe (s)) Σ_{R:e∈R} |{i ∈ N : si = R}|
= Σ_{e∈E} fe (s) ce (fe (s))

We define two particular costs:

1. C+ (EQ) := max{C(f ∗ ) : f ∗ is an equilibrium flow}, the worst total cost achieved in equilibrium.
2. C(OP T ) := min{C(f ) : f is any flow}, the minimum total cost we can ever achieve.

Then, we define the Price of anarchy of the congestion game G as

PoA(G) := C+ (EQ) / C(OP T )

Notice that PoA ≥ 1 always. Recall Pigou’s routing game example, with top-arc cost cT (x) = 1
and bottom-arc cost cB (x) = x/n. The maximum total cost under equilibrium is achieved when
all players take the bottom path, with a total cost of n, while the minimum possible total cost
is 3n/4, achieved when half of the players take each path. Therefore, the PoA of this game is
equal to 4/3.
Theorem 11 (Roughgarden). For any network (single O-D pair), if the costs are affine, i.e. ∀e ∈
E, ce (x) = ae x + be with ae > 0, then 1 ≤ PoA ≤ 4/3.
Imagine the situation in which we can influence the players a bit and maybe help them
converge to a good Nash equilibrium, so that even if they keep playing in equilibrium, they do not
necessarily reach the PoA. The ratio between the best Nash equilibrium and the social optimum
is known as the Price of stability (PoS). Let C− (EQ) := min{C(f ∗ ) : f ∗ is an equilibrium flow}.
Then, the PoS of a game is defined as

PoS(G) := C− (EQ) / C(OP T )

Notice that PoA ≥ PoS ≥ 1 always.
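For the Pigou example, both ratios can be computed directly (with a hypothetical n = 1000); the PoA equals 4/3, matching the bound of Theorem 11:

```python
# Price of anarchy and price of stability in the Pigou network: the worst
# equilibrium puts everyone on the bottom arc (cost n), the best equilibrium
# puts one player on top (cost 1 + (n-1)^2/n), and the optimum costs 3n/4.
n = 1000

def total_cost(k):              # k players on the bottom arc
    return (n - k) + k * (k / n)

C_opt = min(total_cost(k) for k in range(n + 1))
C_worst_eq = total_cost(n)      # everyone on the bottom
C_best_eq = total_cost(n - 1)   # one player on top

print(C_worst_eq / C_opt)       # PoA ≈ 1.333 (= 4/3)
print(C_best_eq / C_opt)        # PoS, also close to 4/3 for large n
```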

5 Repeated games
Repeated games represent dynamic interactions in discrete time. These games are played in
stages in which players simultaneously choose an action in their own action sets. The selected
actions determine the players’ payoffs at that stage. Then, the players’ payoffs in the repeated
game are obtained as a combination of the players’ stage payoffs.
Repetition opens the door to new phenomena, as players may play dominated strategies (not an
equilibrium) in the stage games in order to obtain a higher payoff in the repeated game. In
one-shot games, that is, games played only once, agents have incentives to play Nash equilibria
or they may end up obtaining low payoffs, e.g. in a prisoner’s dilemma with one of the players
deciding to cooperate while his partner takes the rational decision of confessing. The new facet
obtained from the repetition of the game may induce cooperation between agents at each stage,
as players can punish their opponents if the latter deviate from the cooperation path. In
the prisoner’s dilemma example, the prisoner who cooperated and got a low payoff due to the
partner’s confession knows he cannot trust him again, and starts confessing at every posterior stage.
The possibility of following cooperation paths increases the set of equilibrium payoffs, a result
known for a long time but whose original author remains unclear (hence its name, the Folk theorem).

5.1 T-stage Game


A T -stage game is the dynamic game obtained by repeating T times the same game G, which
is known as the stage game, where T ∈ N is fixed. The payoff of the T -stage game is obtained
as the arithmetic average of the players’ stage payoffs. We denote the repeated game by
GT and the set of mixed Nash equilibrium payoffs by ET . Can you think of a Nash
equilibrium of GT built from the Nash equilibria of G?
Example 18. Consider the stage game G,

L R
T 1,0 0,0
B 0,0 0,1

G has two Nash equilibrium payoffs, (1, 0) and (0, 1). Notice that any sequence of strategy profiles
in GT in which the players play a Nash equilibrium of G at each stage is a Nash equilibrium
of GT . In the 2-stage game, for example, playing (T, L) in the first stage and (B, R) in the second
stage is a Nash equilibrium of G2 . Even more, this Nash equilibrium achieves the average payoff
(1/2, 1/2); therefore (1/2, 1/2) ∈ E2 .
Remark 6. Repetition allows the convexification of the equilibrium payoffs.
Example 19. Consider the following stage game G,

C2 D2 E2
C1 3,3 0,4 -10,-10
D1 4,0 1,1 -10,-10
E1 -10,-10 -10,-10 -10,-10

Game G corresponds to a prisoner’s dilemma with an extra row and column in which each player
can force the outcome of the game, independently of the strategy chosen by the partner. The set of
Nash equilibrium payoffs of the stage game is E1 = {(1, 1), (−10, −10)}. Let us construct a Nash
equilibrium of the 2-stage game with payoff (2, 2):
1. In the first stage, player 1 plays C1 and player 2 plays C2 .

2. In the second stage, player 1 plays D1 if player 2 played C2 at stage 1, and he plays E1
(punishment) otherwise. Similarly, player 2 plays D2 if player 1 played C1 at stage 1,
and he plays E2 (punishment) otherwise.

None of the players has an incentive to deviate from this strategy, as they would achieve a lower
payoff due to the punishment. We obtain, then, that (2, 2) ∈ E2 . In the general T -stage game,
for any T ≥ 1, we can construct a Nash equilibrium that achieves ((T − 1)/T )(3, 3) + (1/T )(1, 1).
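The average payoff of this cooperation-then-punishment equilibrium can be tabulated for a few horizons T:

```python
# Average payoff per player: cooperate on (3,3) for T-1 stages, then (1,1).
def avg_payoff(T):
    return ((T - 1) * 3 + 1) / T

for T in (1, 2, 10, 1000):
    print(T, avg_payoff(T))   # the average tends to 3 as T grows
```

For T = 2 this recovers the payoff (2, 2) constructed above, and as T → ∞ the average approaches the cooperative payoff 3.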

Remark 7. Using punishments, repetition may allow cooperation.

Cooperation is not the only remark we can draw from this second example. Unlike the first
example, in which agents pick a strategy per stage independently of the opponent’s past choices,
players may determine their next action from the previous strategy profiles, as in the second
example. The sequence of strategy profiles played from the beginning until the current stage t
is called a history of the game of length t. With this in mind, we give the formal model of a
T -stage game.

5.2 Model of repeated games


We fix a finite strategic game G = (N, (Si )i∈N , (gi )i∈N ) called the stage game. As usual we use
S = ∏_{i∈N } Si , S−i = ∏_{j̸=i} Sj , and s = (si , s−i ) ∈ S.

Definition 18. A history of length t is defined as a vector (s(1), ..., s(t)) ∈ S t , with s(k) being
the strategy profile played by the players at stage k. The set of histories of length t is
Ht := S t = S × ... × S (t times). The set of all histories is denoted H := ∪_{t=0}^{T } Ht , where by
convention we set H0 = ∅.

Previously we remarked that agents may determine their strategies from the past chosen
actions. In other words, players pick a strategy from the observed history of the game.

Definition 19. A strategy of player i is a mapping σi : H → ∆(Si ), called a behavior
strategy. We denote by Σi the set of all behavior strategies of player i and Σ = ∏_{i∈N } Σi .

The intuition behind a behavior strategy is the following: given a history ht of any
length, player i observes ht and plays σi (ht ), a mixed strategy in ∆(Si ), at
his next stage.

Definition 20. Given T ∈ N, we define the T -stage game GT = (N, (Σi )i∈N , (γiT )i∈N ), with

γiT (σ) = Eσ ( (1/T ) Σ_{t=1}^{T } gi (s(t)) )

where σ is a behavior strategy profile that gives, for each stage t ∈ {1, ..., T } and each player i ∈ N ,
the mixed strategy σi (h(t)) ∈ ∆(Si ) played.

A Nash equilibrium of GT is any behavior strategy profile in which no agent has an
incentive to deviate, that is, no agent has a better mixed strategy to play at some stage such
that his T -average payoff increases. Since, once T is fixed, GT is the mixed extension of a finite
game, by Nash’s theorem GT always has a Nash equilibrium.

Example 20. Consider the following prisoner’s dilemma G as stage game,

C2 D2
C1 3,3 0,4
D1 4,0 1,1

Let us show by induction that, without the presence of punishments, the only equilibrium payoff of
the T -stage game is the Nash equilibrium payoff of G. For T = 1, this is clear. Assume that for
a fixed T ≥ 1, ET = {(1, 1)}, and consider a Nash equilibrium σ = (σ1 (t), σ2 (t))_{t=1}^{T +1} of the (T + 1)-
stage repeated game. Notice that if we consider the truncated strategy σ ′ = (σ1 (t), σ2 (t))_{t=2}^{T +1} ,
that is, the strategy starting from t = 2, we obtain a Nash equilibrium of the T -stage game.
Assuming that player 1 played (x, 1 − x) and player 2 played (y, 1 − y) in the first stage, the
equilibrium payoff of both players after the T + 1 stages is

g1T +1 (σ) = (1/(T + 1)) (3xy + 4(1 − x)y + 1(1 − x)(1 − y)) + T /(T + 1)
g2T +1 (σ) = (1/(T + 1)) (3xy + 4x(1 − y) + 1(1 − x)(1 − y)) + T /(T + 1)

Since σ is a Nash equilibrium, g1^{T+1}(σ) must be greater than or equal to the payoff that
player 1 would get if, for example, he played D1 at the first stage, that is,

g1^{T+1}(σ) ≥ (1/(T+1)) g1(D1, y) + T/(T+1)
⇐⇒ 3xy + 4(1−x)y + (1−x)(1−y) ≥ 4y + (1−y)
⇐⇒ 0 ≥ x =⇒ x = 0

Analogously, we can find that y = 0. This implies that both players played (D1 , D2 ) at stage 1,
and therefore the Nash equilibrium σ of the (T+1)-stage game achieves the average payoff (1, 1).

Remark 8. Repeating the prisoner's dilemma a finite number of times is not enough to obtain
cooperation between the players. Maybe if we repeat it infinitely many times?

5.3 Uniform game


We define the uniform game G∞ as the game obtained by considering the T-stage games GT
and letting T → ∞. Notice that Nash's theorem no longer applies to this game. We
extend the set of Nash equilibrium payoffs of the T-stage game to uniform games as follows.

Definition 21. A strategy profile σ is a uniform equilibrium of G∞ if,

1. ∀ε > 0, σ is an ε-Nash equilibrium of any long enough finitely repeated game, i.e. ∃T0 , ∀T ≥
T0 , ∀i ∈ N ,∀τi ∈ Σi , γiT (τi , σ−i ) ≤ γiT (σ) + ε, and,
2. ((γiT (σ))i∈N )T has a limit γ(σ) ∈ Rn when T goes to infinity.

γ(σ) ∈ Rn is called a uniform equilibrium payoff of the uniform game. The set of uniform
equilibrium payoffs is denoted by E∞ .

Repeating the same game infinitely many times allows cooperation between the players in the
prisoner's dilemma. The issue when we consider only finitely many stages is that the agents, by
backward induction, realize that the most rational move is to play the Nash equilibrium
at each step. Indeed, suppose that both agents cooperate at each stage. Then, a rational
player should deviate at the last stage so he can obtain a higher payoff than by cooperating. Since
both agents reason in the same way, the two of them deviate and play the Nash equilibrium at the last
stage. Knowing this, we can forget the last stage and consider a repeated game with one
stage less. By the same argument, the players confess at the penultimate stage. By induction,
the prisoners end up confessing at every stage. This argument is not valid with an infinite number
of stages, as there is no “last” stage.
For expressing the formal result concerning the possible equilibrium payoffs of an infinitely
repeated game, we need some definitions.

Definition 22. We define the set of feasible payoffs as

conv(g(S)) = g(∆(S)) = {g(σ) : σ ∈ ∆(S)}


Notice that ∏_{i∈N} ∆(Si) and ∆(∏_{i∈N} Si) are different objects; for defining the set of
feasible payoffs we use the second one. In particular, the set of feasible payoffs is a bounded
polytope and represents the set of payoffs that can be obtained in any (finitely or infinitely)
repeated game. Moreover, it always contains ET and E∞.

Definition 23. For each player i ∈ N, the punishment level of i or threat point is,

vi = min_{x−i ∈ ∏_{j≠i} ∆(Sj)} max_{xi ∈ ∆(Si)} gi(xi, x−i)

vi is also called the independent minmax of player i. It represents the lowest payoff that the rest
of the players can force player i down to. In particular, no rational player should obtain less than his
punishment level. We define the set of individually rational payoffs by

IR := {u = (ui )i∈N ∈ Rn : ui ≥ vi , ∀i ∈ N }

Finally, we define the set of feasible and individually rational payoffs as E = conv(g(S)) ∩ IR.

Given the strategies of all players except i at stage t, we can always construct a strategy for
i such that he obtains at least his punishment level at that stage. Therefore, players cannot
obtain less than their punishment levels at any equilibrium. In consequence, ET and E∞ are
both included in E.
Let us illustrate all the definitions on the previous prisoner's dilemma: The punishment levels
are v1 = v2 = 1. Then, the feasible and individually rational payoffs form the region
conv{(3, 3), (0, 4), (4, 0), (1, 1)} ∩ {u ∈ R2 : u1 ≥ 1, u2 ≥ 1}.

(Figure: the set E shaded in the (g1, g2) plane; both coordinates range between 1 and 4.)
We are ready to state the Folk theorem.

Theorem 12 (Folk). The following results hold:

1. The set of uniform equilibrium payoffs is the set of feasible and IR payoffs: E = E∞ .
2. If there exists u ∈ E1 such that ui > vi for each player i ∈ N, then ET → E as T → ∞.

TD Nº1 - Dominant strategies
Felipe Garrido-Lucero

Université Paris Dauphine-PSL

Let G = (N, (Si )i∈N , (gi )i∈N ) be a game and si , ti ∈ Si be two strategies for player i.
ˆ si strictly dominates ti if, ∀s−i ∈ S−i : gi (si , s−i ) > gi (ti , s−i ).
ˆ si weakly dominates ti if, ∀s−i ∈ S−i : gi (si , s−i ) ≥ gi (ti , s−i ), with strict inequality for at least one s−i ∈ S−i .
ˆ si is equivalent to ti if, ∀s−i ∈ S−i : gi (si , s−i ) = gi (ti , s−i ).
- G is solvable if the iterated deletion of strictly dominated strategies outputs a trivial game.
- A strategy profile s ∈ S is Pareto optimal if there is not s′ ∈ S such that gi (s′ ) ≥ gi (s), ∀i ∈ N
and ∃i ∈ N, gi (s′ ) > gi (s).
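The IDSDS procedure used below can be automated for two-player bimatrix games. A minimal sketch (pure-strategy domination only; the function name `idsds` is ours), illustrated on a prisoner's dilemma rather than on the exercises themselves:

```python
from itertools import product

def idsds(G1, G2):
    """Iterated deletion of strictly dominated pure strategies in a 2-player
    bimatrix game; returns the surviving row and column indices.
    A sketch: domination by mixed strategies is not checked."""
    rows = list(range(len(G1)))
    cols = list(range(len(G1[0])))
    changed = True
    while changed:
        changed = False
        # delete a row strictly dominated by another row (player 1)
        for r, s in product(rows, rows):
            if r != s and all(G1[s][c] > G1[r][c] for c in cols):
                rows.remove(r); changed = True; break
        # delete a column strictly dominated by another column (player 2)
        for c, d in product(cols, cols):
            if c != d and all(G2[rr][d] > G2[rr][c] for rr in rows):
                cols.remove(c); changed = True; break
    return rows, cols

# Prisoner's dilemma: Confess strictly dominates Not Confess for both players.
G1 = [[-5, 0], [-15, -1]]   # player 1's payoffs (rows: Confess, Not Confess)
G2 = [[-5, -15], [0, -1]]   # player 2's payoffs
print(idsds(G1, G2))  # ([0], [0]): only (Confess, Confess) survives
```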

Q1. Consider the following game,

e f g h
A 6,3 4,4 4,1 3,0
B 5,4 6,5 0,2 5,1
C 5,0 3,2 6,1 4,0
D 2,0 2,3 3,3 6,1

Solve this game using the IDSDS procedure.


Q2. Consider the following game,

L R
T 1,1 0,0
M 1,1 2,1
B 0,0 2,1

Show that we can obtain two different solutions using the iterated deletion of weakly
dominated strategies.
Q3. There are two players. Each player is given an unmarked envelope and asked to put in it
either nothing, or 300 euros, or 600 euros of his own money. A referee collects the envelopes,
opens them, gathers all the money, then adds 50% of that amount (using his own money)
and divides the total into two equal parts which he then distributes to the players.
1. Represent this game frame with two alternative tables: the first table showing in each
cell the amount of money distributed to each player, the second table showing the change
in wealth of each player (money received minus contribution).
2. Suppose that player 1 has some resentment towards the referee and ranks the outcomes
in terms of how much money the referee loses (the more, the better). Meanwhile, player
2 is selfish and greedy and ranks the outcomes in terms of her own net gain. Represent
the corresponding game using a table.
3. Is there a strict dominant strategy for both players?
Q4. Let G = (N, (Si )i∈N , (gi )i∈N ) be a game. Suppose that for any player i ∈ N there exists a
strictly dominant strategy s∗i ∈ Si . Prove, by giving a counterexample, that s∗ := (s∗i )i∈N
is not always Pareto optimal.
TD Nº2 - Zero-sum games

- Given a zero-sum game G = (I, J, g), we consider,

ˆ the maxmin of G: v(g) := sup_{i∈I} inf_{j∈J} g(i, j) −→ max(min(row))
ˆ the minmax of G: v̄(g) := inf_{j∈J} sup_{i∈I} g(i, j) −→ min(max(column))

- G has a value if v(g) = v̄(g).

- Indifference principle. Let (x̄, ȳ) be optimal strategies and v be the value of the game. Then,

∀i ∈ I, x̄i > 0 =⇒ ei Aȳ = v,
∀j ∈ J, ȳj > 0 =⇒ x̄Aej = v
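The recipes max(min(row)) and min(max(column)) translate directly into code. A short sketch (the function names are ours):

```python
def maxmin(M):
    """Guarantee of the row player (maximizer): max over rows of the row minimum."""
    return max(min(row) for row in M)

def minmax(M):
    """Guarantee of the column player (minimizer): min over columns of the column maximum."""
    return min(max(row[j] for row in M) for j in range(len(M[0])))

# Matching pennies: maxmin = -1 < 1 = minmax, so no value in pure strategies.
print(maxmin([[1, -1], [-1, 1]]), minmax([[1, -1], [-1, 1]]))  # -1 1
# A game with a saddle point: maxmin = minmax = 1, which is the value.
print(maxmin([[2, 1], [0, 0]]), minmax([[2, 1], [0, 0]]))  # 1 1
```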

Q1. Let G = (I, J, g) be a zero-sum game. Let w1 , w2 be both values of G. Prove that w1 = w2 .

Q2. Consider the following zero-sum game in strategic form,

L M R
T 2 1 5
M -1 -1 -1
B 0 0 0

Compute the value of the game and the optimal strategies by,

1. Computing the maxmin and minmax of the game


2. Using the IDSDS method

Q3. Consider the following zero-sum game,

L M R
T 3 6 5
M 5 2 6
B 1 0 3

1. Prove that this game does not have a value in pure strategies.
2. Compute the value of the game and the optimal mixed strategies. For this, it may help
you to check first the presence of dominated strategies.

Q4. Consider the following zero-sum game G = ([0, 1], [3, 4], g) where g(x, y) = |x − y|. Can you
find the value and optimal strategies for G ?
TD Nº3 - Zero-sum games

- Von Neumann minmax Theorem. Let G = (∆(I), ∆(J), g) be a zero-sum game in mixed
strategies. It always holds,

min_{y∈∆(J)} max_{x∈∆(I)} g(x, y) = max_{x∈∆(I)} min_{y∈∆(J)} g(x, y)

- Indifference principle. Let (x̄, ȳ) be optimal strategies and v be the value of the game. Then,

∀i ∈ I, x̄i > 0 =⇒ g(i, ȳ) = v,
∀j ∈ J, ȳj > 0 =⇒ g(x̄, j) = v
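For 2×2 games, the saddle-point check plus the indifference principle give a complete closed-form solver. A sketch under the assumption that the game is nondegenerate (a + d − b − c ≠ 0 whenever no saddle point exists); `solve_2x2` is our name:

```python
def solve_2x2(a, b, c, d):
    """Value and optimal strategies of the zero-sum game [[a, b], [c, d]]
    (row player maximizes). If maxmin = minmax there is a pure saddle point;
    otherwise the indifference principle yields the fully mixed solution."""
    lo = max(min(a, b), min(c, d))      # maxmin
    hi = min(max(a, c), max(b, d))      # minmax
    if lo == hi:
        return lo, None, None           # value in pure strategies
    den = a + d - b - c
    x = (d - c) / den                   # P(Top): makes the column player indifferent
    y = (d - b) / den                   # P(Left): makes the row player indifferent
    return (a * d - b * c) / den, x, y

print(solve_2x2(1, -1, -1, 1))  # matching pennies: (0.0, 0.5, 0.5)
```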

Q1. Consider the following zero-sum game,

L R
T a b
B c d

1. Suppose it holds c < a ≤ b < d. Prove that the value of this game is always a.
2. Consider that a > c, a > b, d > b and d > c. Show the value of the game is always,

v = (ad − bc) / (a + d − (b + c))

Q2. Let G = (I, J, g) be a zero-sum game. A saddle point is a strategy profile (s̄, t̄) ∈ I × J
such that for any (s, t) ∈ I × J,

g(s, t̄) ≤ g(s̄, t̄) ≤ g(s̄, t)

Show that if G has a saddle point (s̄, t̄), then G has a value and the saddle point is a profile
of optimal strategies.

Q3. Consider the rock-paper-scissors game,

R P S
R 0 -1 1
P 1 0 -1
S -1 1 0

1. Prove that this game does not have a value in pure strategies.
2. Argue why this game has a value in mixed strategies. Compute it as well as a pair
of optimal strategies.
TD Nº4 - Nash equilibrium

Let G = (N, (Si)i∈N, (gi)i∈N) be a game with N = {1, ..., n} the set of players, Si the strategy
set of player i and gi : ∏_{i∈N} Si → R the payoff function of player i.
A strategy profile s∗ ∈ S is a Nash equilibrium if for any i ∈ N,

gi(s∗i, s∗−i) = max_{si∈Si} gi(si, s∗−i)

or, equivalently, considering the best reply correspondences of the players BRi : S−i ⇒ Si with
BRi(s−i) = argmax_{si∈Si} gi(si, s−i), it holds that,

∀i ∈ N, s∗i ∈ BRi(s∗−i)
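The best-reply characterization suggests a direct way to enumerate pure Nash equilibria of a bimatrix game. A minimal sketch (our function name, illustrative coordination payoffs):

```python
from itertools import product

def pure_nash(G1, G2):
    """All pure Nash equilibria of a 2-player bimatrix game: (r, c) is an
    equilibrium iff r is a best reply to c and c is a best reply to r."""
    R, C = range(len(G1)), range(len(G1[0]))
    eq = []
    for r, c in product(R, C):
        if G1[r][c] == max(G1[i][c] for i in R) and \
           G2[r][c] == max(G2[r][j] for j in C):
            eq.append((r, c))
    return eq

# A battle-of-the-sexes style coordination game (illustrative payoffs):
G1 = [[2, 0], [0, 1]]
G2 = [[1, 0], [0, 2]]
print(pure_nash(G1, G2))  # [(0, 0), (1, 1)]
```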

Q1. Let G = (N, (Si )i∈N , (gi )i∈N ) be a game and let s∗ = (s∗i )i∈N be a Nash equilibrium of G.
Show that none of the strategies in s∗ can be eliminated by the IDSDS.
Q2. Consider the following Prisoner’s dilemma:

Confess Not Confess


Confess -5,-5 0,-15
Not Confess -15,0 -1,-1

Find the Nash equilibrium of this game.


Q3. Find all the Nash equilibria of the following three-player game, in which player 1 selects
rows (a,b, or c), player 2 chooses columns (x,y, or z) and player 3 selects a matrix (either
A or B).

x y z x y z
a 2,0,4 1,1,1 1,2,3 a 2,0,3 4,1,2 1,1,2
b 3,2,3 0,1,0 2,1,0 b 1,3,2 2,2,2 0,4,3
c 1,0,2 0,0,3 3,1,1 c 0,0,0 3,0,3 2,1,0

Table 1: Matrix A Table 2: Matrix B

Q4. Two neighboring countries i = 1, 2, simultaneously choose how many resources (in hours)
to spend in recycling activities ri . The average benefit πi for every dollar spent on recycling
is:
πi(ri, rj) = 10 − ri + rj/2,
and the cost per hour for each country is 4. Country i’s average benefit is increasing
in the resources that the neighboring country j spends on his recycling because a clean
environment produces positive external effects on other countries.

1. Find each country’s best-response function, and compute the Nash equilibrium (r1∗ , r2∗ ).
2. Graph the best-response functions and indicate the pure strategy Nash equilibrium on
the graph.
3. On your previous figure, show how the equilibrium would change if the intercept of
one of the countries’ average benefit functions fell from 10 to some smaller number.
TD Nº5 - Nash equilibrium in mixed strategies

Consider G = (N, (Si)i∈N, (gi)i∈N) a finite game, i.e. each player has a finite number of
pure strategies. For every i ∈ N consider,

Σi = ∆(Si) = { σi : σi(si) ≥ 0, ∀si ∈ Si, and Σ_{si∈Si} σi(si) = 1 }

We define the extended game Ḡ = (N, (Σi)i∈N, (ḡi)i∈N) where,

ḡi(σ) = ḡi(σ1, ..., σN) = Eσ(gi) := Σ_{s1∈S1} ··· Σ_{sN∈SN} σ1(s1) · ... · σN(sN) · gi(s1, ..., sN)

A mixed strategy profile σ ∈ Σ := ∏_{i∈N} Σi is a mixed Nash equilibrium of G if it is a Nash
equilibrium of the extended game Ḡ.
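The expectation Eσ(gi) is a sum over all pure strategy profiles, weighted by the product of the individual probabilities. A sketch for an arbitrary number of players (function name ours, payoff matrix illustrative):

```python
from itertools import product

def expected_payoff(sigma, g):
    """E_sigma(g_i) for a profile of mixed strategies sigma = [sigma_1, ..., sigma_n]
    and a payoff function g on pure profiles, summing over all pure profiles."""
    supports = [range(len(s)) for s in sigma]
    total = 0.0
    for profile in product(*supports):
        prob = 1.0
        for s_i, si in zip(sigma, profile):
            prob *= s_i[si]           # product of individual probabilities
        total += prob * g(profile)
    return total

# Two players mixing uniformly over two actions, with an illustrative payoff matrix:
g1 = lambda s: [[-5, 15], [0, 10]][s[0]][s[1]]
print(expected_payoff([[0.5, 0.5], [0.5, 0.5]], g1))  # 5.0
```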

Q1. Consider the game in which two firms simultaneously and independently decide whether to
lobby the Congress in favor of a particular bill. When both firms (or neither of them) lobby,
the Congress' decision is unaffected, so both firms get the same payoff. If, instead, only one
of them lobbies, it benefits from the entire policy.

Lobby Not Lobby


Lobby -5,-5 15,0
Not Lobby 0,15 10,10

1. Find the pure Nash equilibria of this game


2. Find the mixed Nash equilibria of this game
3. Graphically represent each player’s best reply function

Q2. Consider the following game with two players and three strategies per player.

L M R
T 3,2 4,3 1,4
M 1,3 7,0 2,1
B 2,2 8,-5 2,0

Find all Nash equilibria of this game.

Q3. Consider the following game,

L R
T 6,0 0,6
B 3,2 6,0

1. Draw every player’s expected utility for a given strategy of his opponent.
2. What is every player’s expected payoff from playing her maxmin strategy?
3. Find every player’s Nash equilibrium strategy (pure and mixed) and their payoffs.
TD Nº6 - Nash equilibrium in mixed strategies

Nash equilibria computation


1. Find the pure Nash equilibria of the game by underlining in each column and row the best
reply of each player and look for strategy profiles totally underlined.
2. Assume that agents play mixed strategies (x, y) ∈ ∆(S1) × ∆(S2) and impose the indifference
principle (be careful with the pure strategies that may not belong to the support):

g1(s, y) = g1(t, y), ∀s, t ∈ S1 : xs, xt > 0
g2(x, s) = g2(x, t), ∀s, t ∈ S2 : ys, yt > 0

Solve the system of equations, taking care to compute coherent values for x and y. The
solutions correspond to the mixed Nash equilibria of the game.
3. An alternative method to steps 1 and 2 is to compute the best reply correspondences

BR1(y) = argmax_{x∈∆(S1)} g1(x, y), BR2(x) = argmax_{y∈∆(S2)} g2(x, y)

and plot them to obtain all their intersection points. This also works for counting
the number of Nash equilibria.
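Step 2 can be carried out mechanically for 2×2 games: each player mixes so as to make the opponent indifferent. A sketch assuming a fully mixed (interior) equilibrium exists, on an illustrative 2×2 coordination game (the function name is ours):

```python
def mixed_nash_2x2(G1, G2):
    """Fully mixed Nash equilibrium of a 2x2 bimatrix game by the indifference
    principle: each player mixes so that the OPPONENT is indifferent.
    Assumes an interior equilibrium (nonzero denominators, probabilities in (0,1))."""
    (a, b), (c, d) = G1          # player 1's payoffs
    (e, f), (g, h) = G2          # player 2's payoffs
    # player 2 mixes (y, 1-y) so that player 1 is indifferent between his rows:
    y = (d - b) / (a - b - c + d)
    # player 1 mixes (x, 1-x) so that player 2 is indifferent between his columns:
    x = (h - g) / (e - g - f + h)
    return x, y

# Coordination game: player 1 plays Top with prob. 2/3, player 2 Left with prob. 1/3.
print(mixed_nash_2x2([[2, 0], [0, 1]], [[1, 0], [0, 2]]))  # x = 2/3, y = 1/3
```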

Q1. Consider the following game,

L R
T 6,0 0,6
B 3,2 6,0

1. Draw every player’s expected utility for a given strategy of his opponent.
2. What is every player’s expected payoff from playing her maxmin strategy?
3. Find every player’s Nash equilibrium strategy (pure and mixed) and their payoffs.

Q2. Consider two candidates competing for office: Democrat (D) and Republican (R). For sim-
plicity, we assume that voters compare the two candidates according to only one dimension
(e.g. the budget share that each candidate promises to spend on education). Voters’ ideal
policies are uniformly distributed along the interval [0, 1], and each votes for the candidate
with a policy promise closest to the voter’s ideal. Candidates simultaneously and inde-
pendently announce their policy positions. A candidate’s payoff from winning is 1, and
from losing is -1. If both candidates receive the same number of votes, then a coin toss
determines the winner of the election.
1. Show there exists a unique pure Nash equilibrium.
2. Show that with three candidates (democrat, republican, and independent), no pure
strategy Nash equilibrium exists.
Q3. Consider two firms that compete for developing a new product. The benefit of being the
first company to produce the item is 36 million euros. Given x1, x2 the efforts made by
each firm, the probability of firm i being the first developer is xi/(x1 + x2). Assume
that both firms have a total production cost equal to their level of effort xi.
1. Compute each firm’s best-reply function.
2. Find a symmetric Nash equilibrium, i.e. x∗1 = x∗2 = x∗ .
TD Nº7 - Potential games

A game G = (N, (Si )i∈N , (gi )i∈N ) is said to be a potential game if there exists a function
ϕ : S → R such that, ∀i ∈ N, ∀s ∈ S, ∀ti ∈ Si ,

ϕ(s) − ϕ(ti , s−i ) = gi (s) − gi (ti , s−i )

Algorithm 2: Best-reply dynamic


1 Input: G = (N, (Si )i∈N , (gi )i∈N ) an N -player game, s ∈ S a strategy profile
2 repeat
3 for i ∈ N do
4 Fixing s−i compute ti ∈ argmaxsi ∈Si gi (si , s−i ) and replace si ← ti
5 until Convergence;
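Algorithm 2 specializes easily to two-player bimatrix games. A sketch (our function name; ties broken by the first maximizer), in which each round player 1 best-replies first and then player 2:

```python
def best_reply_dynamic(G1, G2, r, c, max_rounds=100):
    """Best-reply dynamic on a 2-player bimatrix game from the profile (r, c).
    In a potential game this converges to a pure Nash equilibrium; in general
    (e.g. matching pennies) it may cycle, in which case we return None."""
    R, C = range(len(G1)), range(len(G1[0]))
    for _ in range(max_rounds):
        br_r = max(R, key=lambda i: G1[i][c])     # player 1 best-replies to c
        br_c = max(C, key=lambda j: G2[br_r][j])  # then player 2 to the new row
        if (br_r, br_c) == (r, c):
            return r, c                           # nobody moved: equilibrium
        r, c = br_r, br_c
    return None

# A coordination game converges; matching pennies cycles forever.
print(best_reply_dynamic([[2, 0], [0, 1]], [[1, 0], [0, 2]], 0, 1))    # (1, 1)
print(best_reply_dynamic([[1, -1], [-1, 1]], [[-1, 1], [1, -1]], 0, 0))  # None
```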

Q1. Consider the matching pennies game and apply the best-reply dynamic.

H T
H 1,-1 -1,1
T -1,1 1,-1

Does the dynamic converge ?

Q2. Prove that the following prisoner’s dilemma is a potential game,

C B
C 1,1 4,0
B 0,4 3,3

Q3. Repeat the previous question for the following game of coordination,

T F
T 2,1 0,0
F 0,0 1,2

Q4. Suppose that ϕ1 and ϕ2 are two potential functions of the same game G = (N, (Si )i∈N , (gi )i∈N ).
Prove there exists a constant c ∈ R such that ϕ1 (s) − ϕ2 (s) = c, ∀s ∈ S.
TD Nº8 - Congestion games

Q1. Consider the following routing game with two players. The two costs on each edge correspond
to one and two players using that edge, respectively. Suppose that player 1 wants to
go from A to C and player 2 from B to D.

(Figure: a network on nodes A, B, C, D; each edge is labeled with its costs for one and two users: 2,5; 2,3; 4,10; and 1,3.)

1. Find a pure Nash equilibrium of this game.


2. Propose a potential function for this game.

Q2. There are three machines 1, 2, 3 used by firms 1 and 2. Firm 1 can produce using machines 1
and 2 or 1 and 3. Firm 2 can produce using machines 1 and 2, 1 and 3, or 2 and 3. Costs for
using machine 1 are 5 and 6, corresponding to one and two users respectively. For machine 2
the costs are 3 and 4, and for machine 3 they are 2 and 5.

1. Model this example as a graph with costs in its edges.


2. Find a pure Nash equilibrium of this game.
3. Propose a potential function for this game.

Q3. Consider the following routing game with two players 1 and 2, where 1 goes from A to C,
and 2 goes from C to A.

(Figure: a network between A and C; each edge is labeled with its costs for one and two users: 1,2; 2,3; and 2,8.)

1. Find the pure Nash equilibria of this game.


2. Propose a potential function for this game.
TD Nº9 - Price of anarchy and Price of stability

Given a routing game G = (V, E, N, (ce )e∈E ) we consider,

C+ (EQ) = max{C(f ∗ ) : f ∗ is a Nash equilibrium flow}


C− (EQ) = min{C(f ∗ ) : f ∗ is a Nash equilibrium flow}
C(OP T ) = min{C(f ) : f is any flow}

Then, we define the price of anarchy PoA and the price of stability PoS respectively as,

PoA(G) = C+(EQ) / C(OPT),   PoS(G) = C−(EQ) / C(OPT)

Q1. Consider the following network in which four players travel: player 1 goes from u to v,
player 2 goes from u to w, player 3 goes from v to w, and player 4 goes from w to v.
(Figure: a network on nodes u, v, w; each edge is labeled with its per-user cost function, either x (the number of users on the edge) or 0.)

Compute the price of anarchy and price of stability of this game.

Q2. Consider the non-linear version of Pigou’s example, where p ≥ 1 is a fixed natural number,
and n players travel from O to D.

(Figure: two parallel edges from O to D; the top edge has constant cost cT(x) = 1 and the bottom edge has cost cB(x) = (x/n)^p.)

Compute the PoA of this game and study its value as p → ∞.

Q3. The price of anarchy can be defined for more games than just congestion games. Consider a
network formation game defined by a set of agents N and some value α ∈ R+. Suppose
that agents can create directed arcs between them. Given E a set of arcs, the cost of each
player is ci(E) = α · deg(i) + Σ_{j∈N} d(i, j), where deg(i) represents the number of outgoing
arcs from i and d(i, j) is the shortest distance between i and j in the undirected graph.
Consider a network formation game with five players.

1. Suppose α < 2. Compute the social optimum for this game.


2. Suppose α ≥ 2. Compute the social optimum for this game.
TD Nº10 - Repeated games

Definition 24. For each player i ∈ N his punishment level (or threat point) is,

vi = min_{x−i ∈ ∏_{j≠i} ∆(Sj)} max_{xi ∈ ∆(Si)} gi(xi, x−i)

Theorem 13 (Folk). The following results hold:

1. The set of uniform equilibrium payoffs is the set of feasible and IR payoffs: E = E∞ .
2. If there exists u ∈ E1 such that ui > vi for each player i ∈ N, then ET → E as T → ∞.

Q1. Consider the following prisoner’s dilemma G as stage game,

C2 D2
C1 3,3 0,4
D1 4,0 1,1

1. Show that ET = {(1, 1)} for any T ∈ N.


2. Compute the set of feasible and individually rational payoffs.

Q2. Compute the set of feasible and individually rational payoffs in the following games:

L R L R L R
T 1,1 3,0 T 1,0 0,1 T 1,-1 -1,1
B 0,3 0,0 B 0,0 1,1 B -1,1 1,-1

Q3. Let G be a finite 2-player zero-sum game. What are the equilibrium payoffs of the finitely
repeated game GT and the uniform game G∞ ?
