Saint Petersburg State University
Department of Mathematical Game Theory and Statistical Decisions
Jiang Xin
Master’s thesis
Dynamic Relational Contracts
Specialization 01.04.02
Applied Mathematics and Informatics
Master’s Program Game Theory and Operations Research
Research advisor
Dr. Sc.(PhD), Professor
Petrosjan L. A.
Saint Petersburg
2016
Contents

1 Abstract
2 Introduction
3 About game theory
  3.1 Two-person zero-sum game
  3.2 Maximin and minimax strategies
  3.3 Saddle point
  3.4 N-person non-zero-sum game
  3.5 Nash equilibrium
  3.6 Multistage game
4 Stationary contracts in the two-person case
  4.1 The game model
  4.2 Stationary contracts
5 N-person game model
  5.1 Description of the game
  5.2 Tree and history
  5.3 Cooperative trajectory
  5.4 Core
  5.5 Strong Nash Equilibrium
  5.6 Proof
6 Conclusion
7 References
1 Abstract
In this dissertation two different mathematical models of relational contracts are considered. One model, for the two-person case, is taken from the existing literature. The most interesting part of our research is the n-person case: we concentrate on the problem of finding a strong Nash equilibrium in relational contracts, which is done here for the first time in game theory.
We prove that transfers in the form of side payments in each stage game can lead to the construction of a special type of Nash equilibrium. Moreover, this Nash equilibrium is a strong one, which means that it is stable against deviations of coalitions of players. In organizing the stage transfers, we use imputations from the core.
For a better understanding of the results, basic definitions from game theory are introduced in section 3 of the dissertation.
Keywords: relational contracts, n-person game, strong Nash equilibrium, cooperative trajectory, core, transfer payment.
2 Introduction
Contracts are a core issue of new institutional economics. Standard contract theory, or principal-agent theory, assumes that the content of a contract is completely clear, can be verified in any state of the world, and can be enforced by law; this is an idealized type of contract. Reality, however, is different. Because of bounded rationality and an imperfect legal system, a contract cannot anticipate every contingency, which makes its enforcement difficult. Many transactions therefore rely not only on legal enforcement but also on security arrangements between the parties, such as mortgages, hostages, trigger strategies, and reputation. Because of this conflict between standard contract theory and reality, scholars have put forward relational contract theory in place of the old concepts.
Relational contracts do not specify all terms of a transaction in detail; they only fix the objectives and basic principles, so the past, present, and expected future personal relationship between the parties plays a key role in the long-term functioning of the arrangement. Relational contracts thus become self-enforcing informal agreements in many long-term relationships in which formal contracts are too expensive to write. Agreements with the character of relational contracts can even be reached between countries, since no institution is able or willing to supervise the performance of both sides.
In such cases, money transfers play an essential role in the relationship. A transfer that sustains the relational contract can take the form of a tax, a fine, or some other reimbursement. Complete performance of the relational contract, including the money transfers, leads to expected payoffs that satisfy all participants.
Of course, we also have to consider possible bad outcomes. Performance of a relational contract is closely linked to personal interests, so when participants find that their interests are not being met, or that they can seemingly gain more elsewhere, they may choose to betray the original terms of the contract. In that case the other participants will take appropriate countermeasures, usually called punishment, to safeguard their own interests and to suppress the gains from betrayal. Because such a punishment mechanism exists, under normal circumstances players will not choose to violate the contract.
The complex process described above can be modeled well by an infinitely repeated game, which makes it convenient to study further properties of relational contracts. In this dissertation I specialize to the case in which a relational contract prescribes required money transfers and is supported by punishment strategies. Driven by rationality, the cooperation of all participants brings the whole game process to a dynamically stable state.
The existence and form of stationary contracts in infinitely repeated two-player games have been discussed earlier in [Goldlücke and Kranz, 2013] [13]. In this thesis we also explore contracts in an n-person infinite-stage game, which is more general.
This dissertation is organized as follows. In section 2 we describe the background of relational contracts. In section 3 we introduce basic definitions and theorems from game theory. In section 4 we present stationary contracts in infinitely repeated two-player games. Section 5 is the main part, in which we describe our own game model and define the strong Nash equilibrium in an n-person infinite-stage game. At the end of section 5 we prove the existence of a strong Nash equilibrium and give the form of the strong Nash equilibrium strategies. In section 6 we analyze the advantages and disadvantages of our approach compared with the one mentioned above.
3 About game theory

3.1 Two-person zero-sum game
Definition 3.1 We call the system

Γ = (X, Y, K)     (3.1)

a standard two-player zero-sum game, where X and Y are non-empty sets and K : X × Y → R^1 is a real-valued function.
In the game Γ, elements x ∈ X and y ∈ Y are called strategies of players 1 and 2 respectively. Elements of the Cartesian product X × Y (namely the strategy pairs (x, y), where x ∈ X and y ∈ Y) are called situations, and the function K is the payoff function of player 1. In the situation (x, y) the payoff of player 2 is (−K(x, y)), so K is called the payoff function of the game Γ and Γ is called a zero-sum game. To define a game Γ we need to specify the strategy sets X and Y of players 1 and 2 and the payoff function K on the set of situations X × Y.
The game Γ can be interpreted as follows: the players simultaneously and independently choose their strategies x ∈ X and y ∈ Y; then player 1 receives the payment K(x, y) and player 2 receives the payment (−K(x, y)).
Definition 3.2 A game Γ′ = (X′, Y′, K′) is called a subgame of the game Γ = (X, Y, K) if X′ ⊂ X, Y′ ⊂ Y, and the function K′ : X′ × Y′ → R^1 is the restriction of K to X′ × Y′.
Suppose that in game (3.1) player 1 has m strategies. Order the strategy set X of the first player, i.e. construct a one-to-one mapping between the set M = {1, 2, ..., m} and X. Similarly, if player 2 has n strategies, we can construct a one-to-one mapping between the set N = {1, 2, ..., n} and Y. Then the game Γ is completely determined by the matrix A = {a_{ij}}, where a_{ij} = K(x_i, y_j), (x_i, y_j) ∈ X × Y, i ∈ M, j ∈ N. The game is played in the following manner: player 1 chooses a row i ∈ M and player 2 chooses a column j ∈ N, both making their choices simultaneously and independently. Then player 1 gets the payoff a_{ij} and player 2 gets the payoff (−a_{ij}). If the payment is negative, we regard it as an actual loss of the player.
Example 2.1 Players 1 and 2 choose integers i and j from the set {1, 2, ..., n}. Player 1 wins |i − j|, and the game is zero-sum. The payoff matrix is an n × n matrix with a_{ij} = |i − j|. When n = 5, the payoff matrix is

A =
0 1 2 3 4
1 0 1 2 3
2 1 0 1 2
3 2 1 0 1
4 3 2 1 0

3.2 Maximin and minimax strategies
Consider a two-player zero-sum game Γ = (X, Y, K), in which each player maximizes his own payoff through the choice of a strategy. The payoff of player 1 is determined by the function K(x, y), and the payoff of player 2 is (−K(x, y)); that is, the goals of the players are exactly opposite. Note that the payoff of player 1 (respectively 2) is determined by the situation (x, y) ∈ X × Y formed in the game. In every situation, the payment of a player is determined not only by his own choice but also by the strategy chosen by his opponent. The opponent's goal is directly opposite to his own, so when trying to obtain the maximum payment each player must take the other's behavior into account.
In game theory we suppose that all players are rational. Player 1 therefore tries to obtain the maximum payoff under the assumption that player 2 chooses the strategy most favorable to himself. What can player 1 guarantee for himself? If player 1 chooses strategy x, then in the worst case he gets min_{y} K(x, y). Hence player 1 can always guarantee himself the payoff max_{x} min_{y} K(x, y). If we drop the assumption that the extrema are attained, the player can always bring his payoff arbitrarily close to the value

g = sup_{x∈X} inf_{y∈Y} K(x, y)     (3.2)

which we call the lower value of the game. If the outer extremum is attained, then g is also called the maximin value. The principle of constructing a strategy x by maximizing the minimum payoff is called the maximin principle, and a strategy chosen according to this principle is called a maximin strategy.
We have a similar discussion for player 2. If player 2 chooses strategy y, then in the worst case he loses max_{x} K(x, y). So player 2 can always guarantee that he loses no more than min_{y} max_{x} K(x, y). We call

ḡ = inf_{y∈Y} sup_{x∈X} K(x, y)     (3.3)

the upper value of the game. If the outer extremum is attained, then ḡ is also called the minimax value. The principle of constructing a strategy y by minimizing the maximum loss is called the minimax principle, and a strategy chosen according to this principle is called a minimax strategy. It should be emphasized that the existence of a maximin (minimax) strategy is determined by the attainability of the outer extremum in formula (3.2) (respectively (3.3)).
3.3 Saddle point
Consider the problem of optimal behavior in a two-player zero-sum game. In the game Γ = (X, Y, K) we call a situation (x*, y*) ∈ X × Y optimal if no player can benefit by deviating from it unilaterally. The optimality principle of constructing equilibrium situations is called the equilibrium principle. We will see that for a two-player zero-sum game this equilibrium principle is equivalent to the minimax and maximin principles. Of course we need an equilibrium to exist, i.e. the equilibrium principle must be realizable.
Definition 3.3 In a two-player zero-sum game Γ = (X, Y, K), a situation (x*, y*) is called a saddle point if for all x ∈ X and y ∈ Y

K(x, y*) ≤ K(x*, y*),
K(x*, y) ≥ K(x*, y*)
Definition 3.4 If (x*, y*) is a saddle point of the game Γ, then we call

g = K(x*, y*)     (3.4)

the value of the game.
Theorem 3.1 In the game Γ = (X, Y, K), a necessary and sufficient condition for the existence of a saddle point is that the minimax and maximin values

min_{y} sup_{x} K(x, y),   max_{x} inf_{y} K(x, y)

exist and satisfy the equality

g = max_{x} inf_{y} K(x, y) = min_{y} sup_{x} K(x, y) = ḡ
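To illustrate these notions, the following minimal Python sketch (an illustration added here, using only the example matrix a_{ij} = |i − j| from Example 2.1; it is not part of the original argument) computes the lower value, the upper value, and checks whether they coincide, i.e. whether a saddle point in pure strategies exists.

# Lower value, upper value, and saddle-point check for a matrix game.
# Player 1 (rows) maximizes a[i][j]; player 2 (columns) minimizes it.

def lower_value(a):
    # max over rows of the row minimum: what player 1 can guarantee
    return max(min(row) for row in a)

def upper_value(a):
    # min over columns of the column maximum: what player 2 can concede at most
    n_cols = len(a[0])
    return min(max(row[j] for row in a) for j in range(n_cols))

n = 5
A = [[abs(i - j) for j in range(n)] for i in range(n)]  # a_ij = |i - j|

g_low, g_up = lower_value(A), upper_value(A)
print("lower value:", g_low)   # 0: every row contains a zero
print("upper value:", g_up)    # 2: the best column for player 2 is the middle one
print("saddle point in pure strategies:", g_low == g_up)  # False for this matrix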
3.4 N-person non-zero-sum game
Definition 3.5 We call

Γ = (N, {X_i}_{i∈N}, {H_i}_{i∈N})

a non-cooperative game, where N = {1, 2, ..., n} is the set of players, X_i is the strategy set of player i, and H_i is the payoff function of player i, defined on the Cartesian product X = ∏_{i=1}^{n} X_i of all players' strategy sets.
The n-person non-zero-sum game is played in the following manner. The players simultaneously and independently choose their strategies from their strategy sets X_i, i = 1, 2, ..., n. Then the situation x = (x_1, x_2, ..., x_n), x_i ∈ X_i, is formed; each player gets the payoff H_i(x) and the game ends.

If the pure strategy set X_i of each player i is finite, we call the game a finite n-person non-cooperative game.
3.5 Nash equilibrium
Suppose that in the game Γ each player tries to realize the situation that maximizes his payment. But the payoff function H_i(x) depends not only on the behavior of player i but also on the behavior of the other players, so the situation that brings the maximum payoff to player i may not be the best situation for the others. As in the zero-sum case, where each player tries to obtain his maximum payment under conflicting interests, describing the best behavior is a problem. There are many possibilities for formulating the meaning of optimality, one of which is the Nash equilibrium (and its refinements). In a zero-sum game, the Nash equilibrium is consistent with the optimality principle of the zero-sum game.
Suppose x = (x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) is an arbitrary situation in the game Γ and x_i is a strategy of player i. If the strategy x_i of player i is changed to x_i′, the situation x changes to (x_1, ..., x_{i−1}, x_i′, x_{i+1}, ..., x_n), which we write as (x‖x_i′). Obviously, if x_i equals x_i′, then (x‖x_i′) = x.
Definition 3.6 We call a situation x* = (x_1*, ..., x_i*, ..., x_n*) a Nash equilibrium if for all x_i ∈ X_i and i = 1, 2, ..., n the following inequality holds:

H_i(x*) ≥ H_i(x*‖x_i)     (3.5)
By the definition of Nash equilibrium, no player i alone is interested in deviating from the strategy x_i* constituting the Nash equilibrium x*: as long as the remaining players keep the strategies constituting the Nash equilibrium x*, replacing the behavior x_i* by x_i can only reduce his payment. Thus, if all players agree in advance to choose the strategies constituting the equilibrium situation, an individual deviation from this agreement is unprofitable for the deviator.

However, an important feature of the Nash equilibrium is that a deviation of two or more players may increase the payment of one of the deviators. Suppose S ⊂ N is a subset (coalition) of the set of all players and x = (x_1, x_2, ..., x_n) is a situation in the game Γ. If coalition S changes its strategies x_i, i ∈ S, to x_i′ ∈ X_i, i ∈ S, the situation x turns into (x‖x_S′); here (x‖x_S′) means that coalition S switches to the strategies x_S′ while the other players still follow x. If x* is a Nash equilibrium, then (3.5) does not in general imply

H_i(x*) ≥ H_i(x*‖x_S′)     (3.6)

By requiring inequalities of type (3.6) to hold we can strengthen the concept of Nash equilibrium.
Definition 3.7 We call a situation x* a strong Nash equilibrium if for an arbitrary coalition S ⊂ N and x_S ∈ ∏_{i∈S} X_i the following inequality is satisfied:

∑_{i∈S} H_i(x*) ≥ ∑_{i∈S} H_i(x*‖x_S)     (3.7)
Of course, an arbitrary strong Nash equilibrium is a Nash equilibrium.
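Definitions 3.6 and 3.7 can be checked by brute force in small finite games. The sketch below is a minimal Python illustration (the two-player coordination game used here is an invented example): it enumerates all pure-strategy situations and tests conditions (3.5) and (3.7) directly.

from itertools import product, combinations

# A finite 2-player game: situations are tuples (x_1, x_2) of pure strategies.
strategies = [[0, 1], [0, 1]]                      # strategy sets X_1, X_2
def H(i, x):                                       # payoffs of a simple coordination game
    table = {(0, 0): (2, 2), (0, 1): (0, 0),
             (1, 0): (0, 0), (1, 1): (1, 1)}
    return table[x][i]

players = range(len(strategies))

def is_nash(x):
    # condition (3.5): no single player gains by changing his own strategy
    return all(H(i, x) >= H(i, x[:i] + (xi,) + x[i+1:])
               for i in players for xi in strategies[i])

def is_strong_nash(x):
    # condition (3.7): no coalition S can raise its total payoff by a joint deviation
    for r in range(1, len(strategies) + 1):
        for S in combinations(players, r):
            for dev in product(*(strategies[i] for i in S)):
                y = list(x)
                for i, d in zip(S, dev):
                    y[i] = d
                if sum(H(i, tuple(y)) for i in S) > sum(H(i, x) for i in S):
                    return False
    return True

for x in product(*strategies):
    print(x, "Nash:", is_nash(x), "strong Nash:", is_strong_nash(x))
# (0,0) is both; (1,1) is a Nash equilibrium but not strong, since coalition {1,2} prefers (0,0).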
3.6 Multistage game
In order to define the finite multistage n-person game with complete information, we need some basic knowledge of graph theory.

Suppose X is a finite set and f is a rule assigning to each element x ∈ X an element f(x) ∈ X. Then f is called a single-valued mapping from X to X, or a function defined on X with values in X. A set-valued mapping F from X to X is a rule assigning to each element x ∈ X some subset F_x ⊂ X (including the case F_x = ∅). For brevity, in what follows set-valued mappings are also simply called mappings.
Suppose F is a mapping from X to X and A ⊂ X. The image of the set A is the set

F_A = ∪_{x∈A} F_x
By the definition above we set F(∅) = ∅. One can prove that if A_i ⊂ X, i = 1, 2, ..., n, then

F(∪_{i=1}^{n} A_i) = ∪_{i=1}^{n} F_{A_i},    F(∩_{i=1}^{n} A_i) ⊂ ∩_{i=1}^{n} F_{A_i}
Define F^2, F^3, ..., F^k, ... in the following way:

F_x^2 = F(F_x), F_x^3 = F(F_x^2), ..., F_x^k = F(F_x^{k−1}), ...

The mapping F̂ from X to X is called the transitive closure of the mapping F if

F̂_x = {x} ∪ F_x ∪ F_x^2 ∪ ··· ∪ F_x^k ∪ ···
The mapping F^{−1} is the inverse mapping of F, defined by

F_y^{−1} = {x | y ∈ F_x}

Similarly to F_x^k, we can define the mappings (F^{−1})_y^k:

(F^{−1})_y^2 = F^{−1}((F^{−1})_y), (F^{−1})_y^3 = F^{−1}((F^{−1})_y^2), ..., (F^{−1})_y^k = F^{−1}((F^{−1})_y^{k−1})

If B ⊂ X, then we set

F^{−1}(B) = {x | F_x ∩ B ≠ ∅}
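The set-valued mapping F and its transitive closure F̂ are easy to compute directly. The following minimal Python sketch (the example mapping is an arbitrary illustration added here) stores F as a dictionary of sets and computes the image F_A and the closure F̂_x by iterating F until no new elements appear.

# A set-valued mapping F from X to X, stored as a dictionary node -> set of nodes.
F = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}   # example mapping on X = {0, 1, 2, 3}

def image(F, A):
    """F_A = union of F_x over x in A."""
    out = set()
    for x in A:
        out |= F[x]
    return out

def transitive_closure(F, x):
    """F-hat_x = {x} union F_x union F_x^2 union ... (iterate until stable)."""
    closure = {x}
    frontier = {x}
    while frontier:
        frontier = image(F, frontier) - closure
        closure |= frontier
    return closure

print(image(F, {0, 1}))            # {1, 2, 3}
print(transitive_closure(F, 0))    # {0, 1, 2, 3}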
Definition 3.8 We call a pair (X, F) a graph if X is a finite set and F is a mapping from the set X to the set of subsets of X.
Denote the graph (X, F) by G. The elements of the set X are represented by points on the plane. A pair of points x and y (y ∈ F_x) is connected by a directed continuous curve (from x to y). Each element of the set X is called a node of the graph, and a pair (x, y) with y ∈ F_x is called an arc of the graph. For an arc p = (x, y) we call the nodes x and y the boundary nodes of the arc; x is its initial node and y its end node.

Denote the set of arcs of the graph by P. The set of arcs of the graph G = (X, F) determines the mapping F and, conversely, the mapping F determines the set P. So the graph G can be described either as G = (X, F) or as G = (X, P).
We call a sequence of arcs p = (p_1, p_2, ..., p_k, ...) a path in the graph G = (X, F) if the end node of each arc coincides with the initial node of the subsequent arc. The length of a path p = (p_1, p_2, ..., p_k, ...) is the number l(p) of arcs in it, so the number of nodes contained in the path is l(p) + 1; for an infinite path l(p) = ∞.

A set consisting of two elements x, y ∈ X is called an edge of G = (X, F) if (x, y) ∈ P or (y, x) ∈ P. Unlike arcs, edges have no direction. We denote edges by p, q and, similarly, the set of all edges by P. A sequence of edges p = (p_1, p_2, ..., p_k, ...) is called a chain if one boundary node of each edge p_k is a boundary node of p_{k−1} and the other is a boundary node of p_{k+1}.

A cycle is a finite chain that starts and ends at the same node. A graph is called connected if any two nodes of the graph can be linked by a chain.
By definition, a tree (or tree graph) is a finite connected graph that has at least two nodes and no cycles. In an arbitrary tree there exists exactly one node x_0 with F̂_{x_0} = X. We call the node x_0 the root of the graph G.

Figure 3.1 shows a tree with root x_0; the nodes x ∈ X are drawn as points and the arcs as arrows.
Figure 3.1: Tree
Suppose z ∈ X. Denote by G_z = (X_z, F_z) the subgraph of the tree graph G = (X, F), where X_z = F̂_z and F_z x = F_x ∩ X_z. In figure 3.2 the dotted line encircles the subgraph starting from node z. In a tree graph, for all x ∈ X_z the set F_x coincides with F_z x, i.e. the mapping F_z is the restriction of F to X_z. Hence we also denote the subgraph of the tree graph by G_z = (X_z, F).
Figure 3.2: Subgraph of tree
Now we define a multistage game with complete information on a tree graph. Suppose G = (X, F) is a tree graph. Divide the node set X into n + 1 sets X_1, ..., X_n, X_{n+1} with ∪_{i=1}^{n+1} X_i = X and X_k ∩ X_l = ∅ for k ≠ l; here for x ∈ X_{n+1} we have F_x = ∅. We call the set X_i, i = 1, 2, ..., n, the set of personal positions of player i, and X_{n+1} the set of terminal states. On the terminal state set X_{n+1} define n functions H_1(x), H_2(x), ..., H_n(x), x ∈ X_{n+1}; the function H_i(x), i = 1, 2, ..., n, is called the payoff of player i.
The game progresses in the following manner. Players are indexed by the set N = {1, 2, ..., n}. Suppose x_0 ∈ X_{i_1}; then player i_1 moves at node x_0 and chooses x_1 ∈ F_{x_0}. If x_1 ∈ X_{i_2}, then player i_2 moves at node x_1 and chooses x_2 ∈ F_{x_1}. In this way, if at stage k the node x_{k−1} ∈ X_{i_k} occurs, then player i_k moves and chooses the next node from the set F_{x_{k−1}}. Once a terminal node x_l ∈ X_{n+1} is reached, for which F_{x_l} = ∅, the game ends. The sequential choices realize a unique node sequence x_0, ..., x_k, ..., x_l, which determines a path from the root x_0 to a terminal node of the graph G. Because of the tree structure of G, every path uniquely determines the terminal state x_l it reaches and, conversely, the terminal state x_l uniquely determines the path. At the node x_l each player i gets the payoff H_i(x_l).

Suppose that player i knows the node x when making his choice at x ∈ X_i. Because of the tree structure of G, he can then reconstruct all the preceding states, and we say that the players have perfect information.
Definition 3.9 We call a single-valued mapping that assigns to each node x ∈ X_i a node y ∈ F_x a strategy of player i.

Denote by U_i the set of all possible strategies of player i. At an arbitrary node x of player i's personal positions X_i, the strategy of player i uniquely determines the choice of the next state.
A vector u = (u_1, ..., u_i, ..., u_n), where u_i ∈ U_i, is called a situation of the game, and U = ∏_{i=1}^{n} U_i is called the set of situations. Each situation u = (u_1, ..., u_i, ..., u_n) uniquely determines a path and therefore the payoffs of the players. Indeed, if x_0 ∈ X_{i_1}, then under the situation u = (u_1, ..., u_i, ..., u_n) the next state is determined uniquely by the rule u_{i_1}(x_0) = x_1. If x_1 ∈ X_{i_2}, the next state x_2 is determined uniquely by u_{i_2}(x_1) = x_2. If x_{k−1} ∈ X_{i_k}, then x_k is determined uniquely by x_k = u_{i_k}(x_{k−1}), and so on.

If the situation u = (u_1, ..., u_i, ..., u_n) corresponds in this sense to a path x_0, x_1, ..., x_l, then we can introduce the payoff function K_i of player i. The value of the payoff function in the situation u equals the value of H_i at the terminal state of the path x_0, x_1, ..., x_l corresponding to u = (u_1, ..., u_i, ..., u_n), namely

K_i(u_1, ..., u_i, ..., u_n) = H_i(x_l), i = 1, 2, ..., n
The function K_i, i = 1, 2, ..., n, is defined on the set of situations U = ∏_{i=1}^{n} U_i. After constructing the strategy sets U_i and the payoff functions K_i, i = 1, 2, ..., n, we obtain a game in normal form

Γ = {N, {U_i}_{i∈N}, {K_i}_{i∈N}}

Here N = {1, ..., i, ..., n} is the set of players, U_i is the strategy set of player i, and K_i is the payoff function of player i, i = 1, 2, ..., n.
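The relation K_i(u) = H_i(x_l) can be made concrete with a small Python sketch (the tree, the assignment of nodes to players, and the terminal payoffs below are an invented example added here): given a strategy profile, it follows the unique path from the root to a terminal node and returns the resulting payoffs.

# A finite multistage game on a tree: children F_x, owner of each non-terminal node,
# payoffs H_i at terminal nodes, and a strategy u_i = mapping from own nodes to a child.

children = {"x0": ["a", "b"], "a": ["t1", "t2"], "b": ["t3"]}
owner    = {"x0": 1, "a": 2, "b": 2}                      # player moving at each node
payoff   = {"t1": (3, 1), "t2": (0, 2), "t3": (1, 1)}     # (H_1, H_2) at terminal nodes

def play(strategies, root="x0"):
    """Follow the situation u = (u_1, ..., u_n) from the root to a terminal node."""
    node, path = root, [root]
    while node in children and children[node]:            # F_x nonempty -> not terminal
        node = strategies[owner[node]][node]               # next node chosen by the mover
        path.append(node)
    return path, payoff[node]                              # K_i(u) = H_i(x_l)

u = {1: {"x0": "a"}, 2: {"a": "t2", "b": "t3"}}            # one strategy per player
print(play(u))                                             # (['x0', 'a', 't2'], (0, 2))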
To study strategic behavior in Γ further, we need the concept of a subgame, i.e. a game on a subgraph of the game graph G.

Suppose z ∈ X and consider the subgraph G_z = (X_z, F), which corresponds to a subgame Γ_z in the following way. In the subgame Γ_z the set of personal positions of a player is determined by the rule Y_i^z = X_i ∩ X_z, i = 1, 2, ..., n, and the terminal state set is Y_{n+1}^z = X_{n+1} ∩ X_z. The payoff H_i^z(x) of player i in the subgame is

H_i^z(x) = H_i(x), x ∈ Y_{n+1}^z, i = 1, 2, ..., n

In the subgame Γ_z the strategy u_i^z of player i is defined as the restriction of player i's strategy in the game Γ to the set Y_i^z, namely

u_i^z(x) = u_i(x), x ∈ Y_i^z = X_i ∩ X_z, i = 1, 2, ..., n

The strategy set of player i in the subgame is denoted by U_i^z. Then each subgraph corresponds to a subgame in normal form

Γ_z = (N, {U_i^z}, {K_i^z})

where the payoff function K_i^z, i = 1, 2, ..., n, is defined on the Cartesian product U^z = ∏_{i=1}^{n} U_i^z.
4 Stationary contracts in the two-person case

4.1 The game model
In the game introduced above, players know all information about previous stages, and within the current stage they move one after another. Now we introduce a slightly different game, in which players only know what happened before the current stage: they have no idea what their opponents are doing in the current stage, since they move simultaneously.

Consider an infinitely repeated two-player game Γ with perfect information. Players are indexed by i, j ∈ {1, 2}. Each stage game is played in a period t, and we denote it by Γ_t. Every period t consists of two substages: a payment stage and a play stage. In the payment stage the players make money transfers to each other, and in the play stage they choose actions simultaneously.
Denote the continuous payoff function of the stage game played in the play stage by

g(a) = (g_1(a_1, a_2), g_2(a_1, a_2)) : A_1 × A_2 → R × R

where A_i is the compact action space of player i, a = (a_1, a_2) is the action profile of the stage game, and g_i(a_1, a_2) is the payoff of player i. Denote by A = A_1 × A_2 the set of all action profiles. The sum of the payoffs of the two players from an action profile a = (a_1, a_2) is denoted by

G(a) = g_1(a) + g_2(a)

The maximal (cheating) payoffs of players 1 and 2, i.e. the best payoff a player can obtain by a unilateral deviation from a, are given by

c_1(a) = max_{a_1'} g_1(a_1', a_2),    c_2(a) = max_{a_2'} g_2(a_1, a_2')
Each period starts with a payment stage, in which each player makes a money transfer to the other player. We assume that each player has enough money, so the situation in which money runs out is not considered. Denote by p_{ij} the money transfer from player i to player j. The players' net payments are then

p_1 = p_{12} − p_{21},    p_2 = p_{21} − p_{12}

Obviously p_1 = −p_2, and we describe the net payments by the vector p = (p_1, p_2). Thus player i's payoff in a period with net payments p = (p_1, p_2) and action profile a equals g_i(a) − p_i.
Given a path of the game starting from time τ,

(p^τ, a^τ, p^{τ+1}, a^{τ+1}, ...)

the average discounted continuation payoff of player i is

(1 − δ) ∑_{t=τ}^{∞} δ^{t−τ} (g_i(a^t) − p_i^t)

Given a path (a^τ, p^{τ+1}, a^{τ+1}, ...) that starts from a play stage, the corresponding payoff is

(1 − δ) ∑_{t=τ}^{∞} δ^{t−τ} (g_i(a^t) − δ p_i^{t+1})
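As a numerical illustration (added here; the path values and δ are assumed, and the infinite sum is truncated to the length of the given path), the average discounted continuation payoff can be evaluated as follows.

# Truncated average discounted continuation payoff of one player:
# (1 - delta) * sum_{t=tau}^{T} delta^(t - tau) * (g_i(a^t) - p_i^t).

def avg_discounted_payoff(stage_payoffs, net_payments, delta):
    return (1 - delta) * sum(
        delta ** t * (g - p)
        for t, (g, p) in enumerate(zip(stage_payoffs, net_payments))
    )

g_path = [2.0, 2.0, 2.0, 2.0, 2.0]       # g_i(a^t) along the path, starting at t = tau
p_path = [0.5, 0.0, 0.0, 0.0, 0.0]       # net payments p_i^t
print(avg_discounted_payoff(g_path, p_path, delta=0.9))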
Denote by u_i the strategy of player i. As before, u_i tells player i what to do at each stage. To choose decisions correctly, a strategy u_i must depend on the history of the game before the given stage.

A history ending before stage k ∈ {pay, play} of period t is the list of all payments and actions made by the players before stage k. Denote by h^k a history that ends before stage k.
Each decision at each position should be made by the player taking the current history into account. Accordingly, we can describe u_i, the strategy of player i, as follows:

Definition 4.1 The strategy of player i is a function of the history before any stage of the game which tells player i to choose an appropriate behavior at the coming stage:

u_i(h^k) = p_i,  k = pay
u_i(h^k) = a_i,  a_i ∈ A_i,  k = play
We write u_i|h for the strategy of player i following the history h. Hence g_i(u_i|h) denotes player i's average discounted payoff from following the strategy u_i|h, and g(u|h) = (g_1(u_1|h), g_2(u_2|h)) is the vector of continuation payoffs, where u = (u_1, u_2). Similarly, G(u|h) = g_1(u_1|h) + g_2(u_2|h) denotes the joint continuation payoff.

By SPE we mean the set of subgame perfect (continuation) equilibria that start at stage k. If u is a subgame perfect equilibrium, we call g(u) a subgame perfect payoff. The set of subgame perfect payoffs at stage k is denoted by

G^k_{SPE} = {g(u) : u ∈ SPE}

Note that we restrict the analysis to pure strategies and assume that the stage game has a Nash equilibrium in A, which is a strong requirement.
4.2 Stationary contracts
We define a stationary strategy profile by a triple of action profiles (a^e, a^1, a^2), also called an action plan, and a payment plan (p^0, p^e, F^1, F^2, f^1, f^2). The play proceeds as follows.

At the beginning of the game (period 0), players receive up-front payments p^0 = (p^0_1, p^0_2).

If both players make the expected payment in the payment stage, the equilibrium action profile a^e = (a^e_1, a^e_2) is played in the next play stage.

Whenever player i unilaterally deviates from a prescribed action, he must pay a fine F^i_i > 0 to player j in the subsequent payment stage.

If player i deviates from a prescribed payment, the punishment action profile a^i = (a^i_1, a^i_2) is played in the next play stage and the payments f^i = (f^i_1, f^i_2) are made in the subsequent payment stage.

The progress of a stationary strategy profile is shown in figure 4.1.
Figure 4.1: Path
If player i unilaterally deviates from the prescribed path, player i's punishment path follows. Thus a stationary strategy profile consists of the equilibrium path (p^0, a^e, p^e, a^e, p^e, ...) and two types of punishment paths for each player i, depending on whether the deviation happened in a play stage or in a payment stage: (F^i, a^e, p^e, a^e, p^e, ...) or (a^i, f^i, a^e, p^e, a^e, p^e, ...).

Any unilateral deviation from the equilibrium path is punished by the same continuation equilibrium, which is the worst possible subgame perfect equilibrium for the deviating player. The punishment paths have the following structure: the action profile a^i must have a cheating payoff c_i(a^i) low enough to deter a deviation by player i. The payment f^i_i is used as a lower fine that adjusts the punishment so that neither the punished player i nor the punishing player j has an incentive to deviate from the punishment profile a^i.

There are two different punishment paths because a punishment can begin either in the play stage or in the payment stage.
As mentioned in the earlier paper [Abreu, 1988] [1], the punished player can receive the same payoff on both kinds of punishment paths if we define the lower fine f^i_i as

f^i_i = F^i_i − (u^i_i − g_i(a^i)) / δ     (4.1)

Fixing the lower fine in this way does not restrict the ability of stationary strategy profiles to characterize optimal subgame perfect equilibria. Similarly, we assume that an action plan always satisfies the following two conditions:

1. G(a^e) ≥ G(a^i), i = 1, 2
2. c_i(a^i) < c_i(a^e), i = 1, 2
Definition 4.2 A stationary strategy profile that forms a subgame perfect equilibrium is called a stationary contract.

Next we introduce conditions under which a stationary strategy profile is subgame perfect. The continuation payoffs of the two players on the equilibrium path are denoted by

u^e_1 = g_1(a^e) − δ p^e_1,    u^e_2 = g_2(a^e) − δ p^e_2

The continuation payoffs of the two players when player i is punished are given by

u^i_1 = −(1 − δ) F^i_1 + u^e_1,    u^i_2 = −(1 − δ) F^i_2 + u^e_2
To prove that a stationary contract is a subgame perfect equilibrium, we need to check that neither player has a profitable deviation. If player i complies with his punishment, his payoff is u^i_i regardless of the stage in which the punishment starts. If he deviates once and complies afterwards, his payoff is u^i_i or (1 − δ) c_i(a^i) + δ u^i_i, depending on the stage in which the punishment starts. So player i does not deviate from his punishment if

u^i_i ≥ c_i(a^i)
Now consider player j during the punishment of player i. First, player j must not deviate from the punishment profile a^i; second, he must pay the reward when f^i_j > 0. It is easy to check that both conditions are fulfilled if and only if

(1 − δ) G(a^i) + δ G(a^e) − u^i_i ≥ (1 − δ) c_j(a^i) + δ u^j_j

The left side is the payoff of player j when player i is punished in the play stage; the right side is his payoff if he cheats in the play stage.
If the game proceeds along the equilibrium path, the actions a^e and payments p^e are sustained if and only if for players 1 and 2

u^e_1 ≥ (1 − δ) c_1(a^e) + δ u^1_1,    u^e_2 ≥ (1 − δ) c_2(a^e) + δ u^2_2

An up-front payment p^0 is subgame perfect if for players 1 and 2 the following conditions hold:

−(1 − δ) p^0_1 + u^e_1 ≥ u^1_1,    −(1 − δ) p^0_2 + u^e_2 ≥ u^2_2
Theorem 4.1 There exists a stationary contract with action plan (a^e, a^1, a^2) if and only if

G(a^e) ≥ (1 − δ)(c_1(a^e) + c_2(a^e)) + δ(c_1(a^1) + c_2(a^2))     (4.2)

and for players 1 and 2

(1 − δ) G(a^1) + δ G(a^e) − c_1(a^1) ≥ (1 − δ) c_2(a^1) + δ c_2(a^2)
(1 − δ) G(a^2) + δ G(a^e) − c_2(a^2) ≥ (1 − δ) c_1(a^2) + δ c_1(a^1)     (4.3)

Condition (4.2) ensures that neither player 1 nor player 2 wants to deviate from the equilibrium action profile a^e: the joint payoff G(a^e) from following the equilibrium path must exceed the joint payoff from cheating now and being punished in the future. Condition (4.3) ensures that when player i is punished, no player wants to deviate from the punishment of player i.
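Conditions (4.2) and (4.3), as reconstructed above, are easy to verify numerically for a concrete stage game. The following Python sketch is a minimal illustration under assumptions: the prisoner's-dilemma-type payoffs, the action plan, and the values of δ are invented, and the cheating payoff c_i(a) is computed as the best unilateral deviation, as defined in section 4.1.

# Stage game payoffs g_i(a1, a2) of a prisoner's-dilemma-like game; C = 0, D = 1.
g = {(0, 0): (2, 2), (0, 1): (-1, 3), (1, 0): (3, -1), (1, 1): (0, 0)}
A1, A2 = [0, 1], [0, 1]

def G(a):                     # joint payoff
    return sum(g[a])

def c(i, a):                  # cheating payoff: best unilateral deviation of player i
    if i == 1:
        return max(g[(b, a[1])][0] for b in A1)
    return max(g[(a[0], b)][1] for b in A2)

def stationary_contract_exists(ae, a1, a2, delta):
    # condition (4.2)
    cond_42 = G(ae) >= (1 - delta) * (c(1, ae) + c(2, ae)) + delta * (c(1, a1) + c(2, a2))
    # condition (4.3), one inequality per punished player
    cond_43_1 = (1 - delta) * G(a1) + delta * G(ae) - c(1, a1) >= (1 - delta) * c(2, a1) + delta * c(2, a2)
    cond_43_2 = (1 - delta) * G(a2) + delta * G(ae) - c(2, a2) >= (1 - delta) * c(1, a2) + delta * c(1, a1)
    return cond_42 and cond_43_1 and cond_43_2

ae, a1, a2 = (0, 0), (1, 1), (1, 1)       # cooperate on path, mutual defection as punishment
for delta in (0.2, 0.5, 0.9):
    print(delta, stationary_contract_exists(ae, a1, a2, delta))
# In this example the contract exists only for delta large enough (here delta >= 1/3).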
5 N-person game model

5.1 Description of the game
Consider an n-person infinite-stage game Γ. Players are indexed by i ∈ N = {1, 2, ..., n}, and in each stage they choose their behaviors simultaneously. In the first stage of the game they select actions a^1 = (a^1_1, a^1_2, ..., a^1_n) simultaneously and independently of each other, and in the second stage each player makes transfer payments to all the other players, which we denote as a whole by p^2 = (p^2_1, p^2_2, ..., p^2_n). Similar behaviors are repeated in the later stages: in odd stages simultaneous actions a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_n), k = 1, 2, ..., are taken, and in even stages behaviors in the form of payments p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_n), k = 1, 2, ..., are realized. The sequence of behaviors in the game is

a^1, p^2, a^3, p^4, ..., a^{2k−1}, p^{2k}, ...  (k = 1, 2, ...)

Here a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_i, ..., a^{2k−1}_n), where a^{2k−1}_i is the behavior of player i at stage 2k − 1, a^{2k−1}_i ∈ A_i, and A_i is the (infinite) behavior space of player i. We denote the payoff of player i at stage 2k − 1 by g^{2k−1}_i(a^{2k−1}); then the joint payoff of all players at stage 2k − 1 is

G^{2k−1}(a^{2k−1}) = ∑_{i∈N} g^{2k−1}_i(a^{2k−1})
Denote by p^{2k}_i the net transfer of player i at stage 2k, and by p^{2k}_{ij} the transfer from player i to player j at stage 2k. Then

p^{2k}_i = ∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji}

and

p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_i, ..., p^{2k}_n)

It should be noted that there is a constraint on p^{2k}_{ij}:

0 ≤ ∑_{j=1}^{n} p^{2k}_{ij} ≤ g^{2k−1}_i(a^{2k−1})

which means that player i cannot give the others more than he received in the previous stage.
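A minimal Python sketch (the transfer matrix and the previous-stage payoffs are assumed example values added here) shows how the net transfers p^{2k}_i are obtained from a transfer matrix {p^{2k}_{ij}} and how the constraint above is checked.

# Transfer matrix P[i][j] = amount sent by player i to player j in an even stage;
# g_prev[i] = payoff of player i in the preceding odd stage.

P = [[0.0, 1.0, 0.5],
     [0.2, 0.0, 0.3],
     [0.0, 0.0, 0.0]]
g_prev = [2.0, 1.0, 0.5]

n = len(P)
net = [sum(P[i]) - sum(P[j][i] for j in range(n)) for i in range(n)]
print("net transfers p_i:", net)
print("sum of net transfers:", sum(net))          # always zero across players

# feasibility: a player cannot give away more than he earned in the previous stage
feasible = all(0 <= sum(P[i]) <= g_prev[i] for i in range(n))
print("constraint satisfied:", feasible)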
5.2 Tree and history
The n-person infinite-stage game Γ can be described more clearly by an infinite directed graph. We construct an infinite tree graph T. Denote the set of all nodes of this tree by X, divided into sets X^0, X^1, X^2, ..., X^{2k−1}, X^{2k}, ... (k = 1, 2, ...), where X^{2k−1} (X^{2k}) is the set of possible positions that can be realized at stage 2k − 1 (respectively 2k). The nodes x^0, x^1, x^2, ..., x^{2k−1}, x^{2k}, ... (k = 1, 2, ...) are the positions actually formed in the game at the respective stages, x^{2k−1} ∈ X^{2k−1}, x^{2k} ∈ X^{2k}.

The number of arcs that have a node x^i as their initial node is called the outdegree of x^i, and similarly the number of arcs that have x^i as their final node is called the indegree of x^i.

From figure 5.1 we can see that the indegree of each node is one, but a node can have a larger outdegree: each node connected with x^{2k−1} other than x^{2k−2} can possibly become the next position x^{2k}. The most important point is that every position is reached by the simultaneous behaviors of all players in that stage. That is, x^{2k−1} corresponds to a^{2k−1} and x^{2k} corresponds to p^{2k}. In particular, X^0 = {x^0} consists of the initial root.
Figure 5.1: Tree
Hence the game Γ progresses in the following manner. At x^0, the beginning of the game, all players take actions a^1 = (a^1_1, a^1_2, ..., a^1_i, ..., a^1_n) together and arrive at a node x^1 ∈ X^1. Then the payments

p^2 = (p^2_1, p^2_2, ..., p^2_i, ..., p^2_n)

leading to x^2 ∈ X^2 are made, where p^2_i = ∑_{j≠i} p^2_{ij} − ∑_{j≠i} p^2_{ji}.

In the same way, in odd stages players take actions

a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_i, ..., a^{2k−1}_n)

leading to x^{2k−1}, and in even stages they transfer money by payments

p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_i, ..., p^{2k}_n)

that lead to x^{2k}.
It is important to understand how players make decisions and thus choose the next position. First of all, let us see what a strategy is in this game. We have already given the definition of a strategy in a multistage game in definition 3.9; here it is very similar.

Player i has a strategy u_i, following which he knows what to do in every possible position of the game. So a strategy u_i not only contains the decisions actually made in the game but also tells player i how to choose the next behavior at every possible node of the tree. To make correct decisions, a strategy u_i must depend on the history of the game before the given stage.

A history before stage 2k − 1 (2k) is the list of all actions and payments that have occurred before this stage. Let h^{2k−1} (h^{2k}) be the history before stage 2k − 1 (respectively 2k). A history corresponds to a path: each node except x^0 comes with a unique path along which the position can be reached starting from the root x^0.

As in the example in figure 5.2, the history of x^5 is the bold path.
Figure 5.2: History
Each decision at each position should be made by the player taking the current history into account. Accordingly, we can describe u_i, the strategy of player i, as follows:

Definition 5.1 The strategy of player i is a function of the history before any stage of the game which tells player i to choose an appropriate behavior at the coming stage:

u_i(h^{2k−1}) = a^{2k−1}_i,    u_i(h^{2k}) = p^{2k}_i     (5.1)

Then (u_1(h^{2k−1}), u_2(h^{2k−1}), ..., u_n(h^{2k−1})) forms the behavior a^{2k−1} at stage 2k − 1 and leads to x^{2k−1}. Similarly, (u_1(h^{2k}), u_2(h^{2k}), ..., u_n(h^{2k})) forms the behavior p^{2k} at stage 2k and leads to x^{2k}.
5.3 Cooperative trajectory
If players follow their strategies, they can make wise next moves that bring them the most benefit.

Definition 5.2 We call ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) a cooperative trajectory if the following condition is satisfied:

∑_{k=1}^{∞} ∑_{i∈N} [δ^{2k−2} g^{2k−1}_i(ã^{2k−1}) + δ^{2k−1} p̃^{2k}_i] = max_{a^{2k−1}_i, p^{2k}_i} ∑_{k=1}^{∞} ∑_{i∈N} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) + δ^{2k−1} p^{2k}_i]     (5.2)

where

ã^{2k−1} = (ã^{2k−1}_1, ã^{2k−1}_2, ..., ã^{2k−1}_i, ..., ã^{2k−1}_n),    p̃^{2k} = (p̃^{2k}_1, p̃^{2k}_2, ..., p̃^{2k}_i, ..., p̃^{2k}_n)
Actually, the cooperative trajectory formed in the game is a path that maximizes the collective interest, which requires the conscious cooperation of all players in the game Γ. On the other hand, the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) is also generated by the strategies of all players.

If x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}, x̃^{2k}, ... (k = 1, 2, ...) are the nodes forming the cooperative path, then we can write the corresponding histories as

h̃^{2k−1} = (x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}),    h̃^{2k} = (x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}, x̃^{2k})  (k = 1, 2, ...)

or

h̃^{2k−1} = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}),    h̃^{2k} = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k})  (k = 1, 2, ...)

In figure 5.3 the bold line is the cooperative trajectory.
Figure 5.3: Cooperative trajectory
Since Γ is an n-person infinite-stage game, we denote its subgames by Γ^{2k−1} and Γ^{2k}. Subgames are functions of the histories before the respective stages: Γ^{2k−1} = Γ(h^{2k−1}) is the subgame starting from stage 2k − 1, while Γ^{2k} = Γ(h^{2k}) is the subgame starting from stage 2k. Obviously Γ^1 = Γ(h^1) = Γ. We also denote each stage game by γ^{2k−1} = γ(h^{2k−1}) and γ^{2k} = γ(h^{2k}).

Denoting the cumulative payoff of player i from stage l on by C^l_i, we can give the payoff of player i in the two types of stage games and his cumulative payoff in the two types of subgames.

In the stage game γ^{2k−1} the payoff of player i is g^{2k−1}_i(a^{2k−1}), and in the stage game γ^{2k} the payoff is

−p^{2k}_i = −(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji})
Hence the cumulative payoff in the subgame Γ^{2τ−1} (τ = 1, 2, ...) is

C^{2τ−1}_i = ∑_{k=τ}^{∞} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) − δ^{2k−1}(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji})]
          = ∑_{k=τ}^{∞} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) − δ^{2k−1} p^{2k}_i]     (5.3)

and the cumulative payoff in the subgame Γ^{2τ} (τ = 1, 2, ...) is

C^{2τ}_i = ∑_{k=τ}^{∞} [−δ^{2k−1}(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji}) + δ^{2k} g^{2k+1}_i(a^{2k+1})]
        = ∑_{k=τ}^{∞} [δ^{2k−1}(−p^{2k}_i) + δ^{2k} g^{2k+1}_i(a^{2k+1})]     (5.4)
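For illustration (an added sketch with assumed stage payoffs, net transfers, and δ; the infinite series is truncated), the cumulative payoff (5.3) starting from stage 1 can be evaluated as follows.

# Truncated cumulative payoff (5.3) of player i starting from stage 1 (tau = 1):
# C_i = sum_k [ delta^(2k-2) * g_i^(2k-1) - delta^(2k-1) * p_i^(2k) ].

def cumulative_payoff(stage_payoffs, net_transfers, delta):
    total = 0.0
    for k, (g, p) in enumerate(zip(stage_payoffs, net_transfers), start=1):
        total += delta ** (2 * k - 2) * g - delta ** (2 * k - 1) * p
    return total

g_i = [3.0] * 50          # g_i^(2k-1) along a trajectory, k = 1, ..., 50
p_i = [0.5] * 50          # net transfers p_i^(2k)
print(cumulative_payoff(g_i, p_i, delta=0.9))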
5.4 Core
So far we have discussed the normal form of our game and the cooperative trajectory, which can be called a "relational contract". As noted in the introduction, performance of the relational contract is closely tied to personal interests: when players find that their interests are not being met, or that they can seemingly gain more elsewhere, they may betray the original terms of the contract. In this case the other participants will take appropriate countermeasures, usually called punishment, to safeguard their own interests and to suppress the gains from betrayal.

In our case the contract is cooperation, and we now consider deviations from the cooperative trajectory in the game.
If a coalition S (S ⊂ N) deviates from the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...), the coalition N\S immediately plays against it starting from the next stage and keeps punishing until the end, which forms a zero-sum game Γ̄(S, N\S, K).

Suppose a saddle point in pure strategies exists in the stage game γ̄^{2k−1}(S, N\S):

K(a^{2k−1}_S, a^{*2k−1}_{N\S}) ≤ K(a^{*2k−1}_S, a^{*2k−1}_{N\S}),
K(a^{*2k−1}_S, a^{2k−1}_{N\S}) ≥ K(a^{*2k−1}_S, a^{*2k−1}_{N\S})

where γ̄^{2k−1}(S, N\S) is a stage game of the zero-sum game Γ̄(S, N\S, K).

Definition 5.3 We call K(a^{*2k−1}_S, a^{*2k−1}_{N\S}) the value of the characteristic function of the stage game γ̄^{2k−1}(S, N\S):

v^{2k−1}(S) = K(a^{*2k−1}_S, a^{*2k−1}_{N\S}),  k = 1, 2, ...
In particular, when S = N we have v^{2k−1}(S) = v^{2k−1}(N), where v^{2k−1}(N) is the maximal joint payoff of all players.

Although the players of coalition S gain more by deviating in one stage, due to the punishment by coalition N\S the coalition S cannot get more than v^{2k−1}(S), k = 1, 2, ..., in each later odd stage.

Denote by D(S, l) the joint payoff of coalition S if it deviates at stage l, and by G(S, l) the joint payoff of coalition S if it cooperates with the others at stage l.

Assumption 5.1 The players can only obtain a finite benefit from deviating at any stage l (l = 1, 2, ...), i.e.

sup_{l} |D(S, l) − G(S, l)| = M,  0 < M < +∞,  S ⊂ N     (5.5)
Suppose that at stage 2k − 1 player i receives the share α^{2k−1}_i of the maximal joint payoff v^{2k−1}(N).

Definition 5.4 We call a vector α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) an imputation of the stage game γ̄^{2k−1}(S, N\S) if the following conditions are satisfied:

α^{2k−1}_i ≥ v^{2k−1}({i}), i ∈ N     (5.6)

∑_{i=1}^{n} α^{2k−1}_i = v^{2k−1}(N)     (5.7)

Here v^{2k−1}({i}) is the value of the characteristic function for the single-element coalition S = {i} in the stage game γ̄^{2k−1}(S, N\S).

Condition (5.6) is called the individual rationality condition: each player gets at least as much as he can earn acting alone, regardless of cooperation with the other players. Condition (5.7) must also be satisfied: if ∑_{i=1}^{n} α^{2k−1}_i < v^{2k−1}(N), there exists an imputation α'^{2k−1} that gives every player i ∈ N more than the share α^{2k−1}_i; and if ∑_{i=1}^{n} α^{2k−1}_i > v^{2k−1}(N), the players in N would be sharing an unattainable payoff, which makes α^{2k−1} unrealizable. So α^{2k−1} is feasible only when condition (5.7) holds. Condition (5.7) is also called the collective rationality condition.
Definition 5.5 We say that the imputation α^{2k−1} dominates β^{2k−1} via coalition S (written α^{2k−1} ≻_S β^{2k−1}) if

α^{2k−1}_i > β^{2k−1}_i, i ∈ S     (5.8)

α^{2k−1}(S) ≤ v^{2k−1}(S)     (5.9)

Condition (5.8) states that for every member of coalition S the imputation α^{2k−1} is better than β^{2k−1}. Condition (5.9) ensures that the imputation α^{2k−1} is attainable for coalition S (namely, coalition S can actually provide the share α^{2k−1}_i for every player i ∈ S).

Definition 5.6 We say that the imputation α^{2k−1} dominates β^{2k−1} if there exists a coalition S such that α^{2k−1} ≻_S β^{2k−1}. We denote this by α^{2k−1} ≻ β^{2k−1}.

Definition 5.7 The set of imputations that are not dominated in a cooperative game is called the core of the game.
Suppose the core of the subgame Γ^{2k−1} is non-empty and denote it by C(Γ^{2k−1}). A suitable imputation should be chosen for every player according to the rules of the initial relational contract.
Theorem 5.1 A necessary and sufficient condition for an imputation α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) to belong to the core of the game is

v^{2k−1}(S) ≤ α^{2k−1}(S) = ∑_{i∈S} α^{2k−1}_i,  S ⊂ N, k = 1, 2, ...     (5.10)
Proof. First consider sufficiency. Suppose an imputation α^{2k−1} (k = 1, 2, ...) satisfies condition (5.10); we need to show that α^{2k−1} belongs to the core. If α^{2k−1} does not belong to the core, there exists an imputation β^{2k−1} with β^{2k−1} ≻_S α^{2k−1}, i.e. β^{2k−1}(S) > α^{2k−1}(S) and β^{2k−1}(S) ≤ v^{2k−1}(S), which contradicts condition (5.10).

Now we prove necessity. For any imputation α^{2k−1} that does not satisfy condition (5.10), there exists a coalition S with α^{2k−1}(S) < v^{2k−1}(S). Take

β^{2k−1}_i = α^{2k−1}_i + (v^{2k−1}(S) − α^{2k−1}(S)) / |S|,  i ∈ S

β^{2k−1}_i = (v^{2k−1}(N) − v^{2k−1}(S)) / (|N| − |S|),  i ∉ S

Here |S| denotes the number of players in coalition S. Obviously β^{2k−1}(N) = v^{2k−1}(N), β^{2k−1}_i ≥ 0, and β^{2k−1} ≻_S α^{2k−1}. So α^{2k−1} does not belong to the core. This proves the theorem.
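Condition (5.10) can be verified by enumerating all coalitions. The following Python sketch is a minimal illustration added here; the three-player characteristic function and the imputations are invented example data.

from itertools import combinations

# Characteristic function of a 3-player stage game, v(S) keyed by frozensets.
v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 2,
     frozenset({1, 2, 3}): 6}
players = [1, 2, 3]

def in_core(alpha):
    """Check condition (5.10): v(S) <= sum_{i in S} alpha_i for every coalition S."""
    if abs(sum(alpha.values()) - v[frozenset(players)]) > 1e-9:   # efficiency (5.7)
        return False
    for r in range(1, len(players) + 1):
        for S in combinations(players, r):
            if v[frozenset(S)] > sum(alpha[i] for i in S) + 1e-9:
                return False
    return True

print(in_core({1: 3, 2: 2, 3: 1}))   # True
print(in_core({1: 5, 2: 1, 3: 0}))   # False: coalition {2, 3} gets 1 < v({2,3}) = 2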
However, here we need a stricter condition: for S ⊂ N we require

∑_{i∈S} α^{2k−1}_i > v^{2k−1}(S),  S ⊂ N, k = 1, 2, ...     (5.11)

Combining this with the definition of the cooperative trajectory, we have

∑_{i∈N} α^{2k−1}_i = G^{2k−1}(ã^{2k−1}) = max_a ∑_{i∈N} g^{2k−1}_i(a) = ∑_{i∈N} g^{2k−1}_i(ã^{2k−1})     (5.12)
At stage 2k player i transfers p^{2k}_{ij} ≥ 0 to player j, so we obtain a transfer matrix P^{2k} = {p^{2k}_{ij}}. Here we have

δ p̃^{2k}_i = g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i     (5.13)

where g^{2k−1}_i(ã^{2k−1}) is the payoff of player i at stage 2k − 1 if he follows the cooperative trajectory, and α^{2k−1}_i is the part due to player i from the imputation α^{2k−1}, which can be understood simply as how much he keeps after transferring money to the other players. That is, the amount p̃^{2k}_i is fixed by g^{2k−1}_i(ã^{2k−1}) and α^{2k−1}. So the transfers p^{2k}_{ij} and p^{2k}_{ji} between every two players must produce the net transfer p̃^{2k}_i; namely, according to (5.13) the elements of the transfer matrix P^{2k} must be defined as any solution of

g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i = δ (∑_{j:j≠i} p^{2k}_{ij} − ∑_{j:j≠i} p^{2k}_{ji})
It is easy to see that such a matrix solution always exists; for example, one can take

P^{2k} =
0                 1                 ···   1                     1
1                 0                 ···   1                     1
⋮                 ⋮                       ⋮                     ⋮
1                 1                 ···   0                     1
1 − δp̃^{2k}_1     1 − δp̃^{2k}_2     ···   1 − δp̃^{2k}_{n−1}     0
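The existence of a transfer matrix solving these equations can also be illustrated constructively. The Python sketch below is an added example: it routes all transfers through the last player rather than using the particular matrix displayed above, and the stage payoffs, the imputation, and δ are assumed values.

# Build a transfer matrix P whose row sums minus column sums equal the net targets
# target_i = (g_i - alpha_i) / delta, by routing all transfers through the last player.

def transfer_matrix(g, alpha, delta):
    n = len(g)
    target = [(g[i] - alpha[i]) / delta for i in range(n)]   # required net transfer of i
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        if target[i] >= 0:
            P[i][n - 1] = target[i]          # player i sends his surplus to player n
        else:
            P[n - 1][i] = -target[i]         # player n sends the shortfall to player i
    return P

g = [5.0, 2.0, 1.0]            # g_i^(2k-1) on the cooperative trajectory (assumed)
alpha = [3.0, 3.0, 2.0]        # imputation alpha^(2k-1), same total as g (assumed)
P = transfer_matrix(g, alpha, delta=0.9)

n = len(g)
net = [sum(P[i]) - sum(P[j][i] for j in range(n)) for i in range(n)]
print(net)                      # equals [(g_i - alpha_i)/delta for each player i]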
5.5 Strong Nash Equilibrium
By definition 3.7 we already know the strong Nash equilibrium in the general case. Given an n-person non-zero-sum game Γ, a strong Nash equilibrium ũ = (ũ_1, ũ_2, ..., ũ_n) is an n-tuple of strategies for which the following condition holds:

∑_{i∈S} g_i(ũ) ≥ ∑_{i∈S} g_i(ũ_{N\S}, u_S)

Here g_i(ũ) is the payoff of player i if coalition S follows the strategies ũ = (ũ_1, ũ_2, ..., ũ_n) together with the other players, and g_i(ũ_{N\S}, u_S) is the payoff of player i if coalition S plays the strategies u_S instead while the other players keep following ũ. In our setting we still use ũ = (ũ_1, ũ_2, ..., ũ_i, ..., ũ_n) to describe the strong Nash equilibrium, where ũ_i is the strong Nash equilibrium strategy of player i. The cooperative trajectory can be derived from

ũ_i(h̃^{2k−1}) = ã^{2k−1}_i,    ũ_i(h̃^{2k}) = p̃^{2k}_i

where h̃^{2k−1} (h̃^{2k}) is the history before ã^{2k−1} (respectively p̃^{2k}) on the cooperative trajectory.

Player i takes the behavior ã^{2k−1}_i (the transfer p̃^{2k}_i) by following the strategy ũ_i. The decisions of all players in the stage then form ã^{2k−1} (p̃^{2k}), which builds up the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) step by step. Once the players find that some coalition S has deviated from the cooperative trajectory determined by the relational contract, they start the zero-sum game Γ̄(S, N\S, K), and the punishment strategies tell them how to respond so that the payoff of coalition S is kept at no more than v^{2k−1}(S) in each subsequent odd stage.
The process can be seen more clearly in the flow chart in figure 5.4.

Now we can derive the condition under which the strategy profile {ã, G} constitutes a strong Nash equilibrium. We have the following theorem.

Theorem 5.2 The cumulative payoff of player i of the form

C^1_i = ∑_{k=1}^{∞} [δ^{2k−2} g^{2k−1}_i(ã^{2k−1}) − δ^{2k−1} p̃^{2k}_i]

can be attained in a strong Nash equilibrium if

p̃^{2k}_i = δ^{−1} [g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i]

and

α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) ∈ C(Γ^{2k−1})

Figure 5.4: Flow chart of process
5.6 Proof
The game consists of two kinds of stages, action stages and payment stages, so the situation differs depending on the stage in which coalition S chooses to deviate. First consider a deviation in a payment stage.

Suppose coalition S deviates from the strategies {ã, G} at stage 2l; from stage 2l on, the transfers from S to the other players are zero (p^{2l}_{ij} = 0, i ∈ S, j ∈ N\S). Then from the next stage until the end of the game the coalition N\S plays against S in the zero-sum game Γ̄(S, N\S, K). The coalition N\S not only responds accordingly in the later odd stages but also stops transferring money to coalition S in the subsequent even stages.

What the players of S can guarantee in each later odd stage is only v^{2k+1}(S), k ≥ l, the value of the characteristic function of the stage game γ̄^{2k+1}_S, where γ̄^{2k+1}_S is the stage game of the zero-sum game Γ̄(S, N\S, K).
Deviating in the payment stage 2l means that coalition S receives

D(S, 2l) = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ji}

without paying the expected transfer p̄ = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij} ≥ 0 to the coalition N\S. The maximum total payoff of coalition S from stage 2l on, if it deviates in the payment stage 2l, is therefore

δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
On the other hand, the payoff of the players of S at stage 2l when moving along the cooperative trajectory equals

G(S, 2l) = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ji} − ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij}

According to (5.4) and (5.13), the payoff of coalition S from stage 2l on without deviating is

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i)

where α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) ∈ C(Γ^{2k+1}).
To prove the theorem we need to find δ ∈ (0, 1) satisfying

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

So we need

δ ∑_{k=l}^{∞} δ^{2k−2l} (∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)) ≥ D(S, 2l) − G(S, 2l)

that is,

δ ∑_{k=l}^{∞} δ^{2k−2l} (∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)) ≥ p̄

Here α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) belongs to the core C(Γ^{2k+1}), so ∑_{i∈S} α^{2k+1}_i > v^{2k+1}(S), S ⊂ N, k = 1, 2, ..., according to (5.11).

Denote L = inf_k [∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)] > 0. Then it is sufficient that

δ / (1 − δ^2) ≥ p̄ / L

Because p̄ ≥ 0, L > 0, and lim_{δ→1−} δ/(1 − δ^2) = +∞, we can always find a δ close to 1 (δ → 1 − 0) that satisfies

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
Now consider the case in which coalition S deviates in the action stage 2l − 1.

Instead of ã^{2l−1}_S ⊂ ã^{2l−1}, the coalition S takes the action a'^{2l−1}_S. The coalition S then gets

∑_{i∈S} [g^{2l−1}_i(ã^{2l−1}_{N\S}, a'^{2l−1}_S)]

Denote

D(S, 2l − 1) = max_{a'^{2l−1}_S} ∑_{i∈S} [g^{2l−1}_i(ã^{2l−1}_{N\S}, a'^{2l−1}_S)]

From stage 2l on, coalition S does not pay the expected transfer p̄ = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij} ≥ 0 to coalition N\S. Of course, no transfers come from coalition N\S either, and from stage 2l + 1 on the coalition N\S plays against the deviating coalition S until the end. Then the total payoff of coalition S after a deviation at stage 2l − 1 does not exceed

δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
According to (5.3) and (5.13), the payoff of the players of S from stage 2l − 1 on without deviating is

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i)

where α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) ∈ C(Γ^{2k+1}).
To finish the proof we need to find δ ∈ (0, 1) satisfying

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

Since α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) belongs to the core C(Γ^{2k+1}), we have ∑_{i∈S} α^{2k+1}_i > v^{2k+1}(S), S ⊂ N, k = 1, 2, ..., according to (5.11).

Denote L = inf_k [∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)] > 0. Hence it is sufficient that

δ^2 ∑_{k=l}^{∞} δ^{2k−2l} L ≥ D(S, 2l − 1) − G(S, 2l − 1)

Denote M = sup_l [D(S, 2l − 1) − G(S, 2l − 1)], 0 < M < +∞, according to (5.5). Then we need

δ^2 / (1 − δ^2) ≥ M / L

that is,

1 > δ ≥ √(M / (L + M))
So there always exists a δ ∈ (√(M/(L + M)), 1) satisfying

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
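Before summarizing, here is a small numerical illustration (added here; the values of p̄, L and M are assumed) of the two thresholds on δ obtained above.

import math

# Thresholds on the discount factor from the proof:
#  payment-stage deviation:  delta / (1 - delta^2) >= p_bar / L
#  action-stage deviation:   delta^2 / (1 - delta^2) >= M / L, i.e. delta >= sqrt(M / (L + M))

p_bar, L, M = 2.0, 0.5, 3.0

def payment_stage_ok(delta):
    return delta / (1 - delta ** 2) >= p_bar / L

action_threshold = math.sqrt(M / (L + M))
print("action-stage threshold on delta:", action_threshold)

for delta in (0.8, 0.95, 0.99):
    print(delta, payment_stage_ok(delta), delta >= action_threshold)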
In summary, we can always find a δ close to 1 (δ → 1 − 0) that satisfies the following two inequalities:

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

These two inequalities mean that, whether it deviates in an action stage or in a payment stage, the coalition S cannot gain more than by following the cooperative trajectory. The theorem is proved.
6 Conclusion
In section 4 we introduced the two-person game model and derived the conditions for a stationary contract. In section 5 we improved on that model and obtained stronger results for an n-person infinite-stage game. Compared with the old approach, our model has three advantages.

First, we give the form of the strong Nash equilibrium and prove its existence. A strong Nash equilibrium is stable against deviations of any coalition.

Second, the new model does not require the existence of a Nash equilibrium in each stage game, whereas in the discussion of section 4 the assumption that the stage game has a Nash equilibrium was a strong requirement.

Third, the old model of section 4 only treats stationary contracts in the two-person case, while in our new model an n-person game is constructed and analyzed, which is clearly more general.

However, there is also a disadvantage. In the proof we do not consider deviations from the punishment itself, which would make the problem more complicated. We may study this case in further research.
7 References
[1] Abreu D. On the Theory of Infinitely Repeated Games with Discounting. Econometrica, 1988: 383-396.
[2] Ayoshin D., Tanaka T. The Core and the Dominance Core in Multichoice Multistage Games with Coalitions in a Matrix Form. Proceedings of NACA98 (International Conference on Nonlinear Analysis and Convex Analysis), 1998.
[3] Baker G., Gibbons R., Murphy K. J. Relational Contracts and the Theory of the Firm. Quarterly Journal of Economics, 2002, 117: 39-84.
[4] Kaitala V., Pohjola M., Tahvonen O. Transboundary Air Pollution between Finland and the USSR: A Dynamic Acid Rain Game. Lecture Notes in Control and Information Sciences, Vol. 157 (Proceedings of the Fourth International Symposium on Differential Games and Applications). Springer, 1991: 183-192.
[5] Karlin S. Mathematical Methods and Theory in Games, Programming and Economics. London: Pergamon Press, 1959.
[6] Christofides N. Graph Theory. Academic Press, 1978.
[7] Petrosyan L. A., Gao Hongwei. Dynamic Cooperative Games. China: Scientific Press, 2009.
[8] Petrosyan L. A. Differential Games of Pursuit. Singapore: World Scientific, 1993.
[9] Petrosyan L. A., Ayoshin D. A. Values of Dynamic Games with Partial Cooperation. Proceedings of the Institute of Mathematics and Mechanics, Ekaterinburg, 2000, 6(1-2): 160-172.
[10] Petrosyan L. A., Ayoshin D. A., Tanaka T. Construction of a Time Consistent Core in Multichoice Multistage Games. Decision Theory and Its Related Fields, RIMS Kokyuroku, 1998, 1043: 198-206.
[11] Petrosyan L. A. The Regularization of NB-scheme in Differential Games. Dynamics and Control, 1995, 5: 31-35.
[12] Petrosyan L. A., Zaccour G. Time-consistent Shapley Value Allocation of Pollution Cost Reduction. Journal of Economic Dynamics & Control, 2003, 27: 381-398.
[13] Goldlücke S., Kranz S. Renegotiation-proof Relational Contracts. Games and Economic Behavior, 2013, 80: 157-178.
[14] Strotz R. H. Myopia and Inconsistency in Dynamic Utility Maximization. Review of Economic Studies, 1955, 23: 165-180.
[15] Yeung D. W. K., Petrosyan L. A. Cooperative Stochastic Differential Games. New York: Springer-Verlag, 2006.
[16] Yeung D. W. K., Petrosyan L. A., Lee M. Dynamic Cooperation: A Paradigm on the Cutting Edge of Game Theory. Beijing: China Market Press, 2007.