Saint Petersburg State University
Department of Mathematical Game Theory and Statistical Decisions
Jiang Xin
Master’s thesis
Dynamic Relational Contracts
Specialization 01.04.02
Applied Mathematics and Informatics
Master’s Program Game Theory and Operations Research
Research advisor
Dr. Sc.(PhD), Professor
Petrosjan L. A.
Saint Petersburg
2016
Contents

1 Abstract
2 Introduction
3 About game theory
  3.1 Two-person zero-sum game
  3.2 Maximin and minimax strategies
  3.3 Saddle point
  3.4 N-person non-zero-sum game
  3.5 Nash equilibrium
  3.6 Multistage game
4 Stationary contracts in the two-person case
  4.1 The game model
  4.2 Stationary contracts
5 N-person game model
  5.1 Description of the game
  5.2 Tree and history
  5.3 Cooperative trajectory
  5.4 Core
  5.5 Strong Nash Equilibrium
  5.6 Proof
6 Conclusion
7 References
1 Abstract
In this dissertation two different mathematical models of relational contracts are considered. One model, for the two-person case, is taken from the existing literature. The most interesting part of our research is the n-person case: we concentrate on the problem of finding a strong Nash equilibrium in relational contracts, which is done here for the first time in game theory.
We prove that transfers in the form of side payments in each stage game can lead to the construction of a special type of Nash equilibrium. Moreover, this Nash equilibrium is a strong one, which means that it is stable against deviations of coalitions of players. In organizing the stage transfers, we use imputations from the core.
For a better understanding of the results, basic definitions from game theory are introduced in section 3 of the dissertation.
Keywords: relational contracts, n-person game, strong Nash equilibrium, cooperative trajectory, core, transfer payment.
2 Introduction
Contracts are a core issue of new institutional economics. Standard contract theory, or principal-agent theory, assumes that the content of a contract is completely clear, can be verified in any state of the world, and can be enforced by law; this is an idealized type of contract. Reality, however, is different. Because of bounded rationality and an imperfect legal system, a contract cannot anticipate every contingency, which makes its enforcement difficult. Many transactions therefore rely not only on legal enforcement but also on security arrangements between the parties, such as mortgages, hostages, trigger strategies, and reputation. Because of this conflict between standard contract theory and reality, scholars have put forward relational contract theory in place of the old concepts.
Relational contracts do not specify all terms of a transaction in detail; they only fix the objectives and basic principles, so the past, present, and expected future personal relationship between the parties plays a key role in the long-term functioning of the arrangement. Relational contracts thus become self-enforcing informal agreements in many long-term relationships in which formal contracts are too expensive to write. Agreements with the character of relational contracts can even be reached between countries, since no institution is able or willing to supervise the performance of both sides.
In such cases, money transfers play an essential role in the relationship. A transfer that sustains the relational contract can take the form of a tax, a fine, or some other reimbursement. Complete performance of the relational contract, including the money transfers, leads to expected payoffs that satisfy all participants.
Of course, we also have to consider possible bad outcomes. Performance of a relational contract is closely linked to personal interests, so when participants find that their interests are not being met, or that they can seemingly gain more elsewhere, they may choose to betray the original terms of the contract. In that case the other participants will take appropriate countermeasures, usually called punishment, to safeguard their own interests and to suppress the gains from betrayal. Because such a punishment mechanism exists, under normal circumstances players will not choose to violate the contract.
The complex process described above can be modeled well by an infinitely repeated game, which makes it convenient to study further properties of relational contracts. In this dissertation I specialize to the case in which a relational contract prescribes required money transfers and is supported by punishment strategies. Driven by rationality, the cooperation of all participants brings the whole game process to a dynamically stable state.
The existence and form of stationary contracts in infinitely repeated two-player games have been discussed earlier in [Goldlücke and Kranz, 2013] [13]. In this thesis we also explore contracts in an n-person infinite-stage game, which is more general.
This dissertation is organized as follows. In section 2 we describe the background of relational contracts. In section 3 we introduce basic definitions and theorems from game theory. In section 4 we present stationary contracts in infinitely repeated two-player games. Section 5 is the main part, in which we describe our own game model and define the strong Nash equilibrium in an n-person infinite-stage game. At the end of section 5 we prove the existence of a strong Nash equilibrium and give the form of the strong Nash equilibrium strategies. In section 6 we analyze the advantages and disadvantages of our approach compared with the one mentioned above.
3 About game theory

3.1 Two-person zero-sum game
Definition 3.1 We call the system

Γ = (X, Y, K)     (3.1)

a standard two-player zero-sum game, where X and Y are non-empty sets and K : X × Y → R^1 is a real-valued function.
In the game Γ, elements x ∈ X and y ∈ Y are called strategies of players 1 and 2 respectively. Elements of the Cartesian product X × Y (namely the strategy pairs (x, y), where x ∈ X and y ∈ Y) are called situations, and the function K is the payoff function of player 1. In the situation (x, y) the payoff of player 2 is (−K(x, y)), so K is called the payoff function of the game Γ and Γ is called a zero-sum game. To define a game Γ we need to specify the strategy sets X and Y of players 1 and 2 and the payoff function K on the set of situations X × Y.
The game Γ can be interpreted as follows: the players simultaneously and independently choose their strategies x ∈ X and y ∈ Y; then player 1 receives the payment K(x, y) and player 2 receives the payment (−K(x, y)).
Definition 3.2 A game Γ′ = (X′, Y′, K′) is called a subgame of the game Γ = (X, Y, K) if X′ ⊂ X, Y′ ⊂ Y, and the function K′ : X′ × Y′ → R^1 is the restriction of K to X′ × Y′.
Suppose that in game (3.1) player 1 has m strategies. Order the strategy set X of the first player, i.e. construct a one-to-one mapping between the set M = {1, 2, ..., m} and X. Similarly, if player 2 has n strategies, we can construct a one-to-one mapping between the set N = {1, 2, ..., n} and Y. Then the game Γ is completely determined by the matrix A = {a_{ij}}, where a_{ij} = K(x_i, y_j), (x_i, y_j) ∈ X × Y, i ∈ M, j ∈ N. The game is played in the following manner: player 1 chooses a row i ∈ M and player 2 chooses a column j ∈ N, both making their choices simultaneously and independently. Then player 1 gets the payoff a_{ij} and player 2 gets the payoff (−a_{ij}). If the payment is negative, we regard it as an actual loss of the player.
Example 2.1 Players 1 and 2 choose integers i and j from the set {1, 2, ..., n}. Player 1 wins |i − j|, and the game is zero-sum. The payoff matrix is an n × n matrix with a_{ij} = |i − j|. When n = 5, the payoff matrix is

A =
0 1 2 3 4
1 0 1 2 3
2 1 0 1 2
3 2 1 0 1
4 3 2 1 0

3.2 Maximin and minimax strategies
Consider a two-player zero-sum game Γ = (X, Y, K), in which each player maximizes his own payoff through the choice of a strategy. The payoff of player 1 is determined by the function K(x, y), and the payoff of player 2 is (−K(x, y)); that is, the goals of the players are exactly opposite. Note that the payoff of player 1 (respectively 2) is determined by the situation (x, y) ∈ X × Y formed in the game. In every situation, the payment of a player is determined not only by his own choice but also by the strategy chosen by his opponent. The opponent's goal is directly opposite to his own, so when trying to obtain the maximum payment each player must take the other's behavior into account.
In game theory we suppose that all players are rational. Player 1 therefore tries to obtain the maximum payoff under the assumption that player 2 chooses the strategy most favorable to himself. What can player 1 guarantee for himself? If player 1 chooses strategy x, then in the worst case he gets min_{y} K(x, y). Hence player 1 can always guarantee himself the payoff max_{x} min_{y} K(x, y). If we drop the assumption that the extrema are attained, the player can always bring his payoff arbitrarily close to the value

g = sup_{x∈X} inf_{y∈Y} K(x, y)     (3.2)

which we call the lower value of the game. If the outer extremum is attained, then g is also called the maximin value. The principle of constructing a strategy x by maximizing the minimum payoff is called the maximin principle, and a strategy chosen according to this principle is called a maximin strategy.
We have a similar discussion for player 2. If player 2 chooses strategy y, then in the worst case he loses max_{x} K(x, y). So player 2 can always guarantee that he loses no more than min_{y} max_{x} K(x, y). We call

ḡ = inf_{y∈Y} sup_{x∈X} K(x, y)     (3.3)

the upper value of the game. If the outer extremum is attained, then ḡ is also called the minimax value. The principle of constructing a strategy y by minimizing the maximum loss is called the minimax principle, and a strategy chosen according to this principle is called a minimax strategy. It should be emphasized that the existence of a maximin (minimax) strategy is determined by the attainability of the outer extremum in formula (3.2) (respectively (3.3)).
3.3 Saddle point
Consider the problem of optimal behavior in a two-player zero-sum game. In the game Γ = (X, Y, K) we call a situation (x*, y*) ∈ X × Y optimal if no player can benefit by deviating from it unilaterally. The optimality principle of constructing equilibrium situations is called the equilibrium principle. We will see that for a two-player zero-sum game this equilibrium principle is equivalent to the minimax and maximin principles. Of course we need an equilibrium to exist, i.e. the equilibrium principle must be realizable.
Definition 3.3 In a two-player zero-sum game Γ = (X, Y, K), a situation (x*, y*) is called a saddle point if for all x ∈ X and y ∈ Y

K(x, y*) ≤ K(x*, y*),
K(x*, y) ≥ K(x*, y*)
Definition 3.4 If (x*, y*) is a saddle point of the game Γ, then we call

g = K(x*, y*)     (3.4)

the value of the game.
Theorem 3.1 In the game Γ = (X, Y, K), a necessary and sufficient condition for the existence of a saddle point is that the minimax and maximin values

min_{y} sup_{x} K(x, y),   max_{x} inf_{y} K(x, y)

exist and satisfy the equality

g = max_{x} inf_{y} K(x, y) = min_{y} sup_{x} K(x, y) = ḡ
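To illustrate these notions, the following minimal Python sketch (an illustration added here, using only the example matrix a_{ij} = |i − j| from Example 2.1; it is not part of the original argument) computes the lower value, the upper value, and checks whether they coincide, i.e. whether a saddle point in pure strategies exists.

# Lower value, upper value, and saddle-point check for a matrix game.
# Player 1 (rows) maximizes a[i][j]; player 2 (columns) minimizes it.

def lower_value(a):
    # max over rows of the row minimum: what player 1 can guarantee
    return max(min(row) for row in a)

def upper_value(a):
    # min over columns of the column maximum: what player 2 can concede at most
    n_cols = len(a[0])
    return min(max(row[j] for row in a) for j in range(n_cols))

n = 5
A = [[abs(i - j) for j in range(n)] for i in range(n)]  # a_ij = |i - j|

g_low, g_up = lower_value(A), upper_value(A)
print("lower value:", g_low)   # 0: every row contains a zero
print("upper value:", g_up)    # 2: the best column for player 2 is the middle one
print("saddle point in pure strategies:", g_low == g_up)  # False for this matrix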
3.4 N-person non-zero-sum game
Definition 3.5 We call

Γ = (N, {X_i}_{i∈N}, {H_i}_{i∈N})

a non-cooperative game, where N = {1, 2, ..., n} is the set of players, X_i is the strategy set of player i, and H_i is the payoff function of player i, defined on the Cartesian product X = ∏_{i=1}^{n} X_i of all players' strategy sets.
The n-person non-zero-sum game is played in the following manner. The players simultaneously and independently choose their strategies from their strategy sets X_i, i = 1, 2, ..., n. Then the situation x = (x_1, x_2, ..., x_n), x_i ∈ X_i, is formed; each player gets the payoff H_i(x) and the game ends.

If the pure strategy set X_i of each player i is finite, we call the game a finite n-person non-cooperative game.
3.5 Nash equilibrium
Suppose that in the game Γ each player tries to realize the situation that maximizes his payment. But the payoff function H_i(x) depends not only on the behavior of player i but also on the behavior of the other players, so the situation that brings the maximum payoff to player i may not be the best situation for the others. As in the zero-sum case, where each player tries to obtain his maximum payment under conflicting interests, describing the best behavior is a problem. There are many possibilities for formulating the meaning of optimality, one of which is the Nash equilibrium (and its refinements). In a zero-sum game, the Nash equilibrium is consistent with the optimality principle of the zero-sum game.
Suppose x = (x_1, ..., x_{i−1}, x_i, x_{i+1}, ..., x_n) is an arbitrary situation in the game Γ and x_i is a strategy of player i. If the strategy x_i of player i is changed to x_i′, the situation x changes to (x_1, ..., x_{i−1}, x_i′, x_{i+1}, ..., x_n), which we write as (x‖x_i′). Obviously, if x_i equals x_i′, then (x‖x_i′) = x.
Definition 3.6 We call a situation x* = (x_1*, ..., x_i*, ..., x_n*) a Nash equilibrium if for all x_i ∈ X_i and i = 1, 2, ..., n the following inequality holds:

H_i(x*) ≥ H_i(x*‖x_i)     (3.5)
By the definition of Nash equilibrium, no player i alone is interested in deviating from the strategy x_i* constituting the Nash equilibrium x*: as long as the remaining players keep the strategies constituting the Nash equilibrium x*, replacing the behavior x_i* by x_i can only reduce his payment. Thus, if all players agree in advance to choose the strategies constituting the equilibrium situation, an individual deviation from this agreement is unprofitable for the deviator.

However, an important feature of the Nash equilibrium is that a deviation of two or more players may increase the payment of one of the deviators. Suppose S ⊂ N is a subset (coalition) of the set of all players and x = (x_1, x_2, ..., x_n) is a situation in the game Γ. If coalition S changes its strategies x_i, i ∈ S, to x_i′ ∈ X_i, i ∈ S, the situation x turns into (x‖x_S′); here (x‖x_S′) means that coalition S switches to the strategies x_S′ while the other players still follow x. If x* is a Nash equilibrium, then (3.5) does not in general imply

H_i(x*) ≥ H_i(x*‖x_S′)     (3.6)

By requiring inequalities of type (3.6) to hold we can strengthen the concept of Nash equilibrium.
Definition 3.7 We call a situation x* a strong Nash equilibrium if for an arbitrary coalition S ⊂ N and x_S ∈ ∏_{i∈S} X_i the following inequality is satisfied:

∑_{i∈S} H_i(x*) ≥ ∑_{i∈S} H_i(x*‖x_S)     (3.7)
Of course, an arbitrary strong Nash equilibrium is a Nash equilibrium.
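Definitions 3.6 and 3.7 can be checked by brute force in small finite games. The sketch below is a minimal Python illustration (the two-player coordination game used here is an invented example): it enumerates all pure-strategy situations and tests conditions (3.5) and (3.7) directly.

from itertools import product, combinations

# A finite 2-player game: situations are tuples (x_1, x_2) of pure strategies.
strategies = [[0, 1], [0, 1]]                      # strategy sets X_1, X_2
def H(i, x):                                       # payoffs of a simple coordination game
    table = {(0, 0): (2, 2), (0, 1): (0, 0),
             (1, 0): (0, 0), (1, 1): (1, 1)}
    return table[x][i]

players = range(len(strategies))

def is_nash(x):
    # condition (3.5): no single player gains by changing his own strategy
    return all(H(i, x) >= H(i, x[:i] + (xi,) + x[i+1:])
               for i in players for xi in strategies[i])

def is_strong_nash(x):
    # condition (3.7): no coalition S can raise its total payoff by a joint deviation
    for r in range(1, len(strategies) + 1):
        for S in combinations(players, r):
            for dev in product(*(strategies[i] for i in S)):
                y = list(x)
                for i, d in zip(S, dev):
                    y[i] = d
                if sum(H(i, tuple(y)) for i in S) > sum(H(i, x) for i in S):
                    return False
    return True

for x in product(*strategies):
    print(x, "Nash:", is_nash(x), "strong Nash:", is_strong_nash(x))
# (0,0) is both; (1,1) is a Nash equilibrium but not strong, since coalition {1,2} prefers (0,0).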
3.6 Multistage game
In order to define the finite multistage n-person game with complete information, we need some basic knowledge of graph theory.

Suppose X is a finite set and f is a rule assigning to each element x ∈ X an element f(x) ∈ X. Then f is called a single-valued mapping from X to X, or a function defined on X with values in X. A set-valued mapping F from X to X is a rule assigning to each element x ∈ X some subset F_x ⊂ X (including the case F_x = ∅). For brevity, in what follows set-valued mappings are also simply called mappings.
Suppose F is a mapping from X to X and A ⊂ X. The image of the set A is the set

F_A = ∪_{x∈A} F_x
By the definition above we set F(∅) = ∅. One can prove that if A_i ⊂ X, i = 1, 2, ..., n, then

F(∪_{i=1}^{n} A_i) = ∪_{i=1}^{n} F_{A_i},    F(∩_{i=1}^{n} A_i) ⊂ ∩_{i=1}^{n} F_{A_i}
Define F^2, F^3, ..., F^k, ... in the following way:

F_x^2 = F(F_x), F_x^3 = F(F_x^2), ..., F_x^k = F(F_x^{k−1}), ...

The mapping F̂ from X to X is called the transitive closure of the mapping F if

F̂_x = {x} ∪ F_x ∪ F_x^2 ∪ ··· ∪ F_x^k ∪ ···
The mapping F^{−1} is the inverse mapping of F, defined by

F_y^{−1} = {x | y ∈ F_x}

Similarly to F_x^k, we can define the mappings (F^{−1})_y^k:

(F^{−1})_y^2 = F^{−1}((F^{−1})_y), (F^{−1})_y^3 = F^{−1}((F^{−1})_y^2), ..., (F^{−1})_y^k = F^{−1}((F^{−1})_y^{k−1})

If B ⊂ X, then we set

F^{−1}(B) = {x | F_x ∩ B ≠ ∅}
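The set-valued mapping F and its transitive closure F̂ are easy to compute directly. The following minimal Python sketch (the example mapping is an arbitrary illustration added here) stores F as a dictionary of sets and computes the image F_A and the closure F̂_x by iterating F until no new elements appear.

# A set-valued mapping F from X to X, stored as a dictionary node -> set of nodes.
F = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}   # example mapping on X = {0, 1, 2, 3}

def image(F, A):
    """F_A = union of F_x over x in A."""
    out = set()
    for x in A:
        out |= F[x]
    return out

def transitive_closure(F, x):
    """F-hat_x = {x} union F_x union F_x^2 union ... (iterate until stable)."""
    closure = {x}
    frontier = {x}
    while frontier:
        frontier = image(F, frontier) - closure
        closure |= frontier
    return closure

print(image(F, {0, 1}))            # {1, 2, 3}
print(transitive_closure(F, 0))    # {0, 1, 2, 3}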
Definition 3.8 We call a pair (X, F) a graph if X is a finite set and F is a mapping from the set X to the set of subsets of X.
Denote the graph (X, F) by G. The elements of the set X are represented by points on the plane. A pair of points x and y (y ∈ F_x) is connected by a directed continuous curve (from x to y). Each element of the set X is called a node of the graph, and a pair (x, y) with y ∈ F_x is called an arc of the graph. For an arc p = (x, y) we call the nodes x and y the boundary nodes of the arc; x is its initial node and y its end node.

Denote the set of arcs of the graph by P. The set of arcs of the graph G = (X, F) determines the mapping F and, conversely, the mapping F determines the set P. So the graph G can be described either as G = (X, F) or as G = (X, P).
We call a sequence of arcs p = (p_1, p_2, ..., p_k, ...) a path in the graph G = (X, F) if the end node of each arc coincides with the initial node of the subsequent arc. The length of a path p = (p_1, p_2, ..., p_k, ...) is the number l(p) of arcs in it, so the number of nodes contained in the path is l(p) + 1; for an infinite path l(p) = ∞.

A set consisting of two elements x, y ∈ X is called an edge of G = (X, F) if (x, y) ∈ P or (y, x) ∈ P. Unlike arcs, edges have no direction. We denote edges by p, q and, similarly, the set of all edges by P. A sequence of edges p = (p_1, p_2, ..., p_k, ...) is called a chain if one boundary node of each edge p_k is a boundary node of p_{k−1} and the other is a boundary node of p_{k+1}.

A cycle is a finite chain that starts and ends at the same node. A graph is called connected if any two nodes of the graph can be linked by a chain.
By definition, a tree (or tree graph) is a finite connected graph that has at least two nodes and no cycles. In an arbitrary tree there exists exactly one node x_0 with F̂_{x_0} = X. We call the node x_0 the root of the graph G.

Figure 3.1 shows a tree with root x_0; the nodes x ∈ X are drawn as points and the arcs as arrows.
Figure 3.1: Tree
Suppose z ∈ X. Denote by G_z = (X_z, F_z) the subgraph of the tree graph G = (X, F), where X_z = F̂_z and F_z x = F_x ∩ X_z. In figure 3.2 the dotted line encircles the subgraph starting from node z. In a tree graph, for all x ∈ X_z the set F_x coincides with F_z x, i.e. the mapping F_z is the restriction of F to X_z. Hence we also denote the subgraph of the tree graph by G_z = (X_z, F).
Figure 3.2: Subgraph of tree
Now we define a multistage game with complete information on a tree graph. Suppose G = (X, F) is a tree graph. Divide the node set X into n + 1 sets X_1, ..., X_n, X_{n+1} with ∪_{i=1}^{n+1} X_i = X and X_k ∩ X_l = ∅ for k ≠ l; here for x ∈ X_{n+1} we have F_x = ∅. We call the set X_i, i = 1, 2, ..., n, the set of personal positions of player i, and X_{n+1} the set of terminal states. On the terminal state set X_{n+1} define n functions H_1(x), H_2(x), ..., H_n(x), x ∈ X_{n+1}; the function H_i(x), i = 1, 2, ..., n, is called the payoff of player i.
The game progresses in the following manner. Players are indexed by the set N = {1, 2, ..., n}. Suppose x_0 ∈ X_{i_1}; then player i_1 moves at node x_0 and chooses x_1 ∈ F_{x_0}. If x_1 ∈ X_{i_2}, then player i_2 moves at node x_1 and chooses x_2 ∈ F_{x_1}. In this way, if at stage k the node x_{k−1} ∈ X_{i_k} occurs, then player i_k moves and chooses the next node from the set F_{x_{k−1}}. Once a terminal node x_l ∈ X_{n+1} is reached, for which F_{x_l} = ∅, the game ends. The sequential choices realize a unique node sequence x_0, ..., x_k, ..., x_l, which determines a path from the root x_0 to a terminal node of the graph G. Because of the tree structure of G, every path uniquely determines the terminal state x_l it reaches and, conversely, the terminal state x_l uniquely determines the path. At the node x_l each player i gets the payoff H_i(x_l).

Suppose that player i knows the node x when making his choice at x ∈ X_i. Because of the tree structure of G, he can then reconstruct all the preceding states, and we say that the players have perfect information.
Definition 3.9 We call a single-valued mapping that assigns to each node x ∈ X_i a node y ∈ F_x a strategy of player i.

Denote by U_i the set of all possible strategies of player i. At an arbitrary node x of player i's personal positions X_i, the strategy of player i uniquely determines the choice of the next state.
A vector u = (u_1, ..., u_i, ..., u_n), where u_i ∈ U_i, is called a situation of the game, and U = ∏_{i=1}^{n} U_i is called the set of situations. Each situation u = (u_1, ..., u_i, ..., u_n) uniquely determines a path and therefore the payoffs of the players. Indeed, if x_0 ∈ X_{i_1}, then under the situation u = (u_1, ..., u_i, ..., u_n) the next state is determined uniquely by the rule u_{i_1}(x_0) = x_1. If x_1 ∈ X_{i_2}, the next state x_2 is determined uniquely by u_{i_2}(x_1) = x_2. If x_{k−1} ∈ X_{i_k}, then x_k is determined uniquely by x_k = u_{i_k}(x_{k−1}), and so on.

If the situation u = (u_1, ..., u_i, ..., u_n) corresponds in this sense to a path x_0, x_1, ..., x_l, then we can introduce the payoff function K_i of player i. The value of the payoff function in the situation u equals the value of H_i at the terminal state of the path x_0, x_1, ..., x_l corresponding to u = (u_1, ..., u_i, ..., u_n), namely

K_i(u_1, ..., u_i, ..., u_n) = H_i(x_l), i = 1, 2, ..., n
The function K_i, i = 1, 2, ..., n, is defined on the set of situations U = ∏_{i=1}^{n} U_i. After constructing the strategy sets U_i and the payoff functions K_i, i = 1, 2, ..., n, we obtain a game in normal form

Γ = {N, {U_i}_{i∈N}, {K_i}_{i∈N}}

Here N = {1, ..., i, ..., n} is the set of players, U_i is the strategy set of player i, and K_i is the payoff function of player i, i = 1, 2, ..., n.
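The relation K_i(u) = H_i(x_l) can be made concrete with a small Python sketch (the tree, the assignment of nodes to players, and the terminal payoffs below are an invented example added here): given a strategy profile, it follows the unique path from the root to a terminal node and returns the resulting payoffs.

# A finite multistage game on a tree: children F_x, owner of each non-terminal node,
# payoffs H_i at terminal nodes, and a strategy u_i = mapping from own nodes to a child.

children = {"x0": ["a", "b"], "a": ["t1", "t2"], "b": ["t3"]}
owner    = {"x0": 1, "a": 2, "b": 2}                      # player moving at each node
payoff   = {"t1": (3, 1), "t2": (0, 2), "t3": (1, 1)}     # (H_1, H_2) at terminal nodes

def play(strategies, root="x0"):
    """Follow the situation u = (u_1, ..., u_n) from the root to a terminal node."""
    node, path = root, [root]
    while node in children and children[node]:            # F_x nonempty -> not terminal
        node = strategies[owner[node]][node]               # next node chosen by the mover
        path.append(node)
    return path, payoff[node]                              # K_i(u) = H_i(x_l)

u = {1: {"x0": "a"}, 2: {"a": "t2", "b": "t3"}}            # one strategy per player
print(play(u))                                             # (['x0', 'a', 't2'], (0, 2))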
To study strategic behavior in Γ further, we need the concept of a subgame, i.e. a game on a subgraph of the game graph G.

Suppose z ∈ X and consider the subgraph G_z = (X_z, F), which corresponds to a subgame Γ_z in the following way. In the subgame Γ_z the set of personal positions of a player is determined by the rule Y_i^z = X_i ∩ X_z, i = 1, 2, ..., n, and the terminal state set is Y_{n+1}^z = X_{n+1} ∩ X_z. The payoff H_i^z(x) of player i in the subgame is

H_i^z(x) = H_i(x), x ∈ Y_{n+1}^z, i = 1, 2, ..., n

In the subgame Γ_z the strategy u_i^z of player i is defined as the restriction of player i's strategy in the game Γ to the set Y_i^z, namely

u_i^z(x) = u_i(x), x ∈ Y_i^z = X_i ∩ X_z, i = 1, 2, ..., n

The strategy set of player i in the subgame is denoted by U_i^z. Then each subgraph corresponds to a subgame in normal form

Γ_z = (N, {U_i^z}, {K_i^z})

where the payoff function K_i^z, i = 1, 2, ..., n, is defined on the Cartesian product U^z = ∏_{i=1}^{n} U_i^z.
4 Stationary contracts in the two-person case

4.1 The game model
In the game introduced above, players know all information about previous stages, and within the current stage they move one after another. Now we introduce a slightly different game, in which players only know what happened before the current stage: they have no idea what their opponents are doing in the current stage, since they move simultaneously.

Consider an infinitely repeated two-player game Γ with perfect information. Players are indexed by i, j ∈ {1, 2}. Each stage game is played in a period t, and we denote it by Γ_t. Every period t consists of two substages: a payment stage and a play stage. In the payment stage the players make money transfers to each other, and in the play stage they choose actions simultaneously.
Denote the continuous payoff function of the stage game played in the play stage by

g(a) = (g_1(a_1, a_2), g_2(a_1, a_2)) : A_1 × A_2 → R × R

where A_i is the compact action space of player i, a = (a_1, a_2) is the action profile of the stage game, and g_i(a_1, a_2) is the payoff of player i. Denote by A = A_1 × A_2 the set of all action profiles. The sum of the payoffs of the two players from an action profile a = (a_1, a_2) is denoted by

G(a) = g_1(a) + g_2(a)

The maximal (cheating) payoffs of players 1 and 2, i.e. the best payoff a player can obtain by a unilateral deviation from a, are given by

c_1(a) = max_{a_1'} g_1(a_1', a_2),    c_2(a) = max_{a_2'} g_2(a_1, a_2')
Each period starts with a payment stage, in which each player makes a money transfer to the other player. We assume that each player has enough money, so the situation in which money runs out is not considered. Denote by p_{ij} the money transfer from player i to player j. The players' net payments are then

p_1 = p_{12} − p_{21},    p_2 = p_{21} − p_{12}

Obviously p_1 = −p_2, and we describe the net payments by the vector p = (p_1, p_2). Thus player i's payoff in a period with net payments p = (p_1, p_2) and action profile a equals g_i(a) − p_i.
Given a path of the game starting from time τ,

(p^τ, a^τ, p^{τ+1}, a^{τ+1}, ...)

the average discounted continuation payoff of player i is

(1 − δ) ∑_{t=τ}^{∞} δ^{t−τ} (g_i(a^t) − p_i^t)

Given a path (a^τ, p^{τ+1}, a^{τ+1}, ...) that starts from a play stage, the corresponding payoff is

(1 − δ) ∑_{t=τ}^{∞} δ^{t−τ} (g_i(a^t) − δ p_i^{t+1})
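As a numerical illustration (added here; the path values and δ are assumed, and the infinite sum is truncated to the length of the given path), the average discounted continuation payoff can be evaluated as follows.

# Truncated average discounted continuation payoff of one player:
# (1 - delta) * sum_{t=tau}^{T} delta^(t - tau) * (g_i(a^t) - p_i^t).

def avg_discounted_payoff(stage_payoffs, net_payments, delta):
    return (1 - delta) * sum(
        delta ** t * (g - p)
        for t, (g, p) in enumerate(zip(stage_payoffs, net_payments))
    )

g_path = [2.0, 2.0, 2.0, 2.0, 2.0]       # g_i(a^t) along the path, starting at t = tau
p_path = [0.5, 0.0, 0.0, 0.0, 0.0]       # net payments p_i^t
print(avg_discounted_payoff(g_path, p_path, delta=0.9))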
Denote by u_i the strategy of player i. As before, u_i tells player i what to do at each stage. To choose decisions correctly, a strategy u_i must depend on the history of the game before the given stage.

A history ending before stage k ∈ {pay, play} of period t is the list of all payments and actions made by the players before stage k. Denote by h^k a history that ends before stage k.
Each decision at each position should be made by the player taking the current history into account. Accordingly, we can describe u_i, the strategy of player i, as follows:

Definition 4.1 The strategy of player i is a function of the history before any stage of the game which tells player i to choose an appropriate behavior at the coming stage:

u_i(h^k) = p_i,  k = pay
u_i(h^k) = a_i,  a_i ∈ A_i,  k = play
We write u_i|h for the strategy of player i following the history h. Hence g_i(u_i|h) denotes player i's average discounted payoff from following the strategy u_i|h, and g(u|h) = (g_1(u_1|h), g_2(u_2|h)) is the vector of continuation payoffs, where u = (u_1, u_2). Similarly, G(u|h) = g_1(u_1|h) + g_2(u_2|h) denotes the joint continuation payoff.

By SPE we mean the set of subgame perfect (continuation) equilibria that start at stage k. If u is a subgame perfect equilibrium, we call g(u) a subgame perfect payoff. The set of subgame perfect payoffs at stage k is denoted by

G^k_{SPE} = {g(u) : u ∈ SPE}

Note that we restrict the analysis to pure strategies and assume that the stage game has a Nash equilibrium in A, which is a strong requirement.
4.2 Stationary contracts
We define a stationary strategy profile by a triple of action profiles (a^e, a^1, a^2), also called an action plan, and a payment plan (p^0, p^e, F^1, F^2, f^1, f^2). The play proceeds as follows.

At the beginning of the game (period 0), players receive up-front payments p^0 = (p^0_1, p^0_2).

If both players make the expected payment in the payment stage, the equilibrium action profile a^e = (a^e_1, a^e_2) is played in the next play stage.

Whenever player i unilaterally deviates from a prescribed action, he must pay a fine F^i_i > 0 to player j in the subsequent payment stage.

If player i deviates from a prescribed payment, the punishment action profile a^i = (a^i_1, a^i_2) is played in the next play stage and the payments f^i = (f^i_1, f^i_2) are made in the subsequent payment stage.

The progress of a stationary strategy profile is shown in figure 4.1.
Figure 4.1: Path
If player i unilaterally deviates from the prescribed path, player i's punishment path follows. Thus a stationary strategy profile consists of the equilibrium path (p^0, a^e, p^e, a^e, p^e, ...) and two types of punishment paths for each player i, depending on whether the deviation happened in a play stage or in a payment stage: (F^i, a^e, p^e, a^e, p^e, ...) or (a^i, f^i, a^e, p^e, a^e, p^e, ...).

Any unilateral deviation from the equilibrium path is punished by the same continuation equilibrium, which is the worst possible subgame perfect equilibrium for the deviating player. The punishment paths have the following structure: the action profile a^i must have a cheating payoff c_i(a^i) low enough to deter a deviation by player i. The payment f^i_i is used as a lower fine that adjusts the punishment so that neither the punished player i nor the punishing player j has an incentive to deviate from the punishment profile a^i.

There are two different punishment paths because a punishment can begin either in the play stage or in the payment stage.
As mentioned in the earlier paper [Abreu, 1988] [1], the punished player can receive the same payoff on both kinds of punishment paths if we define the lower fine f^i_i as

f^i_i = F^i_i − (u^i_i − g_i(a^i)) / δ     (4.1)

Fixing the lower fine in this way does not restrict the ability of stationary strategy profiles to characterize optimal subgame perfect equilibria. Similarly, we assume that an action plan always satisfies the following two conditions:

1. G(a^e) ≥ G(a^i), i = 1, 2
2. c_i(a^i) < c_i(a^e), i = 1, 2
Definition 4.2 A stationary strategy profile that forms a subgame perfect equilibrium is called a stationary contract.

Next we introduce conditions under which a stationary strategy profile is subgame perfect. The continuation payoffs of the two players on the equilibrium path are denoted by

u^e_1 = g_1(a^e) − δ p^e_1,    u^e_2 = g_2(a^e) − δ p^e_2

The continuation payoffs of the two players when player i is punished are given by

u^i_1 = −(1 − δ) F^i_1 + u^e_1,    u^i_2 = −(1 − δ) F^i_2 + u^e_2
To prove that a stationary contract is a subgame perfect equilibrium, we need to check that neither player has a profitable deviation. If player i complies with his punishment, his payoff is u^i_i regardless of the stage in which the punishment starts. If he deviates once and complies afterwards, his payoff is u^i_i or (1 − δ) c_i(a^i) + δ u^i_i, depending on the stage in which the punishment starts. So player i does not deviate from his punishment if

u^i_i ≥ c_i(a^i)
Now consider player j during the punishment of player i. First, player j must not deviate from the punishment profile a^i; second, he must pay the reward when f^i_j > 0. It is easy to check that both conditions are fulfilled if and only if

(1 − δ) G(a^i) + δ G(a^e) − u^i_i ≥ (1 − δ) c_j(a^i) + δ u^j_j

The left side is the payoff of player j when player i is punished in the play stage; the right side is his payoff if he cheats in the play stage.
If the game proceeds along the equilibrium path, the actions a^e and payments p^e are sustained if and only if for players 1 and 2

u^e_1 ≥ (1 − δ) c_1(a^e) + δ u^1_1,    u^e_2 ≥ (1 − δ) c_2(a^e) + δ u^2_2

An up-front payment p^0 is subgame perfect if for players 1 and 2 the following conditions hold:

−(1 − δ) p^0_1 + u^e_1 ≥ u^1_1,    −(1 − δ) p^0_2 + u^e_2 ≥ u^2_2
Theorem 4.1 There exists a stationary contract with action plan (a^e, a^1, a^2) if and only if

G(a^e) ≥ (1 − δ)(c_1(a^e) + c_2(a^e)) + δ(c_1(a^1) + c_2(a^2))     (4.2)

and for players 1 and 2

(1 − δ) G(a^1) + δ G(a^e) − c_1(a^1) ≥ (1 − δ) c_2(a^1) + δ c_2(a^2)
(1 − δ) G(a^2) + δ G(a^e) − c_2(a^2) ≥ (1 − δ) c_1(a^2) + δ c_1(a^1)     (4.3)

Condition (4.2) ensures that neither player 1 nor player 2 wants to deviate from the equilibrium action profile a^e: the joint payoff G(a^e) from following the equilibrium path must exceed the joint payoff from cheating now and being punished in the future. Condition (4.3) ensures that when player i is punished, no player wants to deviate from the punishment of player i.
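Conditions (4.2) and (4.3), as reconstructed above, are easy to verify numerically for a concrete stage game. The following Python sketch is a minimal illustration under assumptions: the prisoner's-dilemma-type payoffs, the action plan, and the values of δ are invented, and the cheating payoff c_i(a) is computed as the best unilateral deviation, as defined in section 4.1.

# Stage game payoffs g_i(a1, a2) of a prisoner's-dilemma-like game; C = 0, D = 1.
g = {(0, 0): (2, 2), (0, 1): (-1, 3), (1, 0): (3, -1), (1, 1): (0, 0)}
A1, A2 = [0, 1], [0, 1]

def G(a):                     # joint payoff
    return sum(g[a])

def c(i, a):                  # cheating payoff: best unilateral deviation of player i
    if i == 1:
        return max(g[(b, a[1])][0] for b in A1)
    return max(g[(a[0], b)][1] for b in A2)

def stationary_contract_exists(ae, a1, a2, delta):
    # condition (4.2)
    cond_42 = G(ae) >= (1 - delta) * (c(1, ae) + c(2, ae)) + delta * (c(1, a1) + c(2, a2))
    # condition (4.3), one inequality per punished player
    cond_43_1 = (1 - delta) * G(a1) + delta * G(ae) - c(1, a1) >= (1 - delta) * c(2, a1) + delta * c(2, a2)
    cond_43_2 = (1 - delta) * G(a2) + delta * G(ae) - c(2, a2) >= (1 - delta) * c(1, a2) + delta * c(1, a1)
    return cond_42 and cond_43_1 and cond_43_2

ae, a1, a2 = (0, 0), (1, 1), (1, 1)       # cooperate on path, mutual defection as punishment
for delta in (0.2, 0.5, 0.9):
    print(delta, stationary_contract_exists(ae, a1, a2, delta))
# In this example the contract exists only for delta large enough (here delta >= 1/3).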
5 N-person game model

5.1 Description of the game
Consider an n-person infinite-stage game Γ. Players are indexed by i ∈ N = {1, 2, ..., n}, and in each stage they choose their behaviors simultaneously. In the first stage of the game they select actions a^1 = (a^1_1, a^1_2, ..., a^1_n) simultaneously and independently of each other, and in the second stage each player makes transfer payments to all the other players, which we denote as a whole by p^2 = (p^2_1, p^2_2, ..., p^2_n). Similar behaviors are repeated in the later stages: in odd stages simultaneous actions a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_n), k = 1, 2, ..., are taken, and in even stages behaviors in the form of payments p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_n), k = 1, 2, ..., are realized. The sequence of behaviors in the game is

a^1, p^2, a^3, p^4, ..., a^{2k−1}, p^{2k}, ...  (k = 1, 2, ...)

Here a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_i, ..., a^{2k−1}_n), where a^{2k−1}_i is the behavior of player i at stage 2k − 1, a^{2k−1}_i ∈ A_i, and A_i is the (infinite) behavior space of player i. We denote the payoff of player i at stage 2k − 1 by g^{2k−1}_i(a^{2k−1}); then the joint payoff of all players at stage 2k − 1 is

G^{2k−1}(a^{2k−1}) = ∑_{i∈N} g^{2k−1}_i(a^{2k−1})
Denote by p^{2k}_i the net transfer of player i at stage 2k, and by p^{2k}_{ij} the transfer from player i to player j at stage 2k. Then

p^{2k}_i = ∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji}

and

p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_i, ..., p^{2k}_n)

It should be noted that there is a constraint on p^{2k}_{ij}:

0 ≤ ∑_{j=1}^{n} p^{2k}_{ij} ≤ g^{2k−1}_i(a^{2k−1})

which means that player i cannot give the others more than he received in the previous stage.
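A minimal Python sketch (the transfer matrix and the previous-stage payoffs are assumed example values added here) shows how the net transfers p^{2k}_i are obtained from a transfer matrix {p^{2k}_{ij}} and how the constraint above is checked.

# Transfer matrix P[i][j] = amount sent by player i to player j in an even stage;
# g_prev[i] = payoff of player i in the preceding odd stage.

P = [[0.0, 1.0, 0.5],
     [0.2, 0.0, 0.3],
     [0.0, 0.0, 0.0]]
g_prev = [2.0, 1.0, 0.5]

n = len(P)
net = [sum(P[i]) - sum(P[j][i] for j in range(n)) for i in range(n)]
print("net transfers p_i:", net)
print("sum of net transfers:", sum(net))          # always zero across players

# feasibility: a player cannot give away more than he earned in the previous stage
feasible = all(0 <= sum(P[i]) <= g_prev[i] for i in range(n))
print("constraint satisfied:", feasible)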
5.2 Tree and history
The n-person infinite-stage game Γ can be described more clearly by an infinite directed graph. We construct an infinite tree graph T. Denote the set of all nodes of this tree by X, divided into sets X^0, X^1, X^2, ..., X^{2k−1}, X^{2k}, ... (k = 1, 2, ...), where X^{2k−1} (X^{2k}) is the set of possible positions that can be realized at stage 2k − 1 (respectively 2k). The nodes x^0, x^1, x^2, ..., x^{2k−1}, x^{2k}, ... (k = 1, 2, ...) are the positions actually formed in the game at the respective stages, x^{2k−1} ∈ X^{2k−1}, x^{2k} ∈ X^{2k}.

The number of arcs that have a node x^i as their initial node is called the outdegree of x^i, and similarly the number of arcs that have x^i as their final node is called the indegree of x^i.

From figure 5.1 we can see that the indegree of each node is one, but a node can have a larger outdegree: each node connected with x^{2k−1} other than x^{2k−2} can possibly become the next position x^{2k}. The most important point is that every position is reached by the simultaneous behaviors of all players in that stage. That is, x^{2k−1} corresponds to a^{2k−1} and x^{2k} corresponds to p^{2k}. In particular, X^0 = {x^0} consists of the initial root.
Figure 5.1: Tree
Hence the game Γ progresses in the following manner. At x^0, the beginning of the game, all players take actions a^1 = (a^1_1, a^1_2, ..., a^1_i, ..., a^1_n) together and arrive at a node x^1 ∈ X^1. Then the payments

p^2 = (p^2_1, p^2_2, ..., p^2_i, ..., p^2_n)

leading to x^2 ∈ X^2 are made, where p^2_i = ∑_{j≠i} p^2_{ij} − ∑_{j≠i} p^2_{ji}.

In the same way, in odd stages players take actions

a^{2k−1} = (a^{2k−1}_1, a^{2k−1}_2, ..., a^{2k−1}_i, ..., a^{2k−1}_n)

leading to x^{2k−1}, and in even stages they transfer money by payments

p^{2k} = (p^{2k}_1, p^{2k}_2, ..., p^{2k}_i, ..., p^{2k}_n)

that lead to x^{2k}.
It is important to understand how players make decisions and thus choose the next position. First of all, let us see what a strategy is in this game. We have already given the definition of a strategy in a multistage game in definition 3.9; here it is very similar.

Player i has a strategy u_i, following which he knows what to do in every possible position of the game. So a strategy u_i not only contains the decisions actually made in the game but also tells player i how to choose the next behavior at every possible node of the tree. To make correct decisions, a strategy u_i must depend on the history of the game before the given stage.

A history before stage 2k − 1 (2k) is the list of all actions and payments that have occurred before this stage. Let h^{2k−1} (h^{2k}) be the history before stage 2k − 1 (respectively 2k). A history corresponds to a path: each node except x^0 comes with a unique path along which the position can be reached starting from the root x^0.

As in the example in figure 5.2, the history of x^5 is the bold path.
Figure 5.2: History
Each decision at each position should be made by the player taking the current history into account. Accordingly, we can describe u_i, the strategy of player i, as follows:

Definition 5.1 The strategy of player i is a function of the history before any stage of the game which tells player i to choose an appropriate behavior at the coming stage:

u_i(h^{2k−1}) = a^{2k−1}_i,    u_i(h^{2k}) = p^{2k}_i     (5.1)

Then (u_1(h^{2k−1}), u_2(h^{2k−1}), ..., u_n(h^{2k−1})) forms the behavior a^{2k−1} at stage 2k − 1 and leads to x^{2k−1}. Similarly, (u_1(h^{2k}), u_2(h^{2k}), ..., u_n(h^{2k})) forms the behavior p^{2k} at stage 2k and leads to x^{2k}.
5.3 Cooperative trajectory
If players follow their strategies, they can make wise next moves that bring them the most benefit.

Definition 5.2 We call ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) a cooperative trajectory if the following condition is satisfied:

∑_{k=1}^{∞} ∑_{i∈N} [δ^{2k−2} g^{2k−1}_i(ã^{2k−1}) + δ^{2k−1} p̃^{2k}_i] = max_{a^{2k−1}_i, p^{2k}_i} ∑_{k=1}^{∞} ∑_{i∈N} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) + δ^{2k−1} p^{2k}_i]     (5.2)

where

ã^{2k−1} = (ã^{2k−1}_1, ã^{2k−1}_2, ..., ã^{2k−1}_i, ..., ã^{2k−1}_n),    p̃^{2k} = (p̃^{2k}_1, p̃^{2k}_2, ..., p̃^{2k}_i, ..., p̃^{2k}_n)
Actually, the cooperative trajectory formed in the game is a path that maximizes the collective interest, which requires the conscious cooperation of all players in the game Γ. On the other hand, the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) is also generated by the strategies of all players.

If x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}, x̃^{2k}, ... (k = 1, 2, ...) are the nodes forming the cooperative path, then we can write the corresponding histories as

h̃^{2k−1} = (x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}),    h̃^{2k} = (x̃^0, x̃^1, x̃^2, ..., x̃^{2k−1}, x̃^{2k})  (k = 1, 2, ...)

or

h̃^{2k−1} = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}),    h̃^{2k} = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k})  (k = 1, 2, ...)

In figure 5.3 the bold line is the cooperative trajectory.
Figure 5.3: Cooperative trajectory
Since Γ is an n-person infinite-stage game, we denote its subgames by Γ^{2k−1} and Γ^{2k}. Subgames are functions of the histories before the respective stages: Γ^{2k−1} = Γ(h^{2k−1}) is the subgame starting from stage 2k − 1, while Γ^{2k} = Γ(h^{2k}) is the subgame starting from stage 2k. Obviously Γ^1 = Γ(h^1) = Γ. We also denote each stage game by γ^{2k−1} = γ(h^{2k−1}) and γ^{2k} = γ(h^{2k}).

Denoting the cumulative payoff of player i from stage l on by C^l_i, we can give the payoff of player i in the two types of stage games and his cumulative payoff in the two types of subgames.

In the stage game γ^{2k−1} the payoff of player i is g^{2k−1}_i(a^{2k−1}), and in the stage game γ^{2k} the payoff is

−p^{2k}_i = −(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji})
Hence the cumulative payoff in the subgame Γ^{2τ−1} (τ = 1, 2, ...) is

C^{2τ−1}_i = ∑_{k=τ}^{∞} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) − δ^{2k−1}(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji})]
          = ∑_{k=τ}^{∞} [δ^{2k−2} g^{2k−1}_i(a^{2k−1}) − δ^{2k−1} p^{2k}_i]     (5.3)

and the cumulative payoff in the subgame Γ^{2τ} (τ = 1, 2, ...) is

C^{2τ}_i = ∑_{k=τ}^{∞} [−δ^{2k−1}(∑_{j≠i} p^{2k}_{ij} − ∑_{j≠i} p^{2k}_{ji}) + δ^{2k} g^{2k+1}_i(a^{2k+1})]
        = ∑_{k=τ}^{∞} [δ^{2k−1}(−p^{2k}_i) + δ^{2k} g^{2k+1}_i(a^{2k+1})]     (5.4)
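For illustration (an added sketch with assumed stage payoffs, net transfers, and δ; the infinite series is truncated), the cumulative payoff (5.3) starting from stage 1 can be evaluated as follows.

# Truncated cumulative payoff (5.3) of player i starting from stage 1 (tau = 1):
# C_i = sum_k [ delta^(2k-2) * g_i^(2k-1) - delta^(2k-1) * p_i^(2k) ].

def cumulative_payoff(stage_payoffs, net_transfers, delta):
    total = 0.0
    for k, (g, p) in enumerate(zip(stage_payoffs, net_transfers), start=1):
        total += delta ** (2 * k - 2) * g - delta ** (2 * k - 1) * p
    return total

g_i = [3.0] * 50          # g_i^(2k-1) along a trajectory, k = 1, ..., 50
p_i = [0.5] * 50          # net transfers p_i^(2k)
print(cumulative_payoff(g_i, p_i, delta=0.9))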
5.4 Core
So far we have discussed the normal form of our game and the cooperative trajectory, which can be called a "relational contract". As noted in the introduction, performance of the relational contract is closely tied to personal interests: when players find that their interests are not being met, or that they can seemingly gain more elsewhere, they may betray the original terms of the contract. In this case the other participants will take appropriate countermeasures, usually called punishment, to safeguard their own interests and to suppress the gains from betrayal.

In our case the contract is cooperation, and we now consider deviations from the cooperative trajectory in the game.
If a coalition S (S ⊂ N) deviates from the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...), the coalition N\S immediately plays against it starting from the next stage and keeps punishing until the end, which forms a zero-sum game Γ̄(S, N\S, K).

Suppose a saddle point in pure strategies exists in the stage game γ̄^{2k−1}(S, N\S):

K(a^{2k−1}_S, a^{*2k−1}_{N\S}) ≤ K(a^{*2k−1}_S, a^{*2k−1}_{N\S}),
K(a^{*2k−1}_S, a^{2k−1}_{N\S}) ≥ K(a^{*2k−1}_S, a^{*2k−1}_{N\S})

where γ̄^{2k−1}(S, N\S) is a stage game of the zero-sum game Γ̄(S, N\S, K).

Definition 5.3 We call K(a^{*2k−1}_S, a^{*2k−1}_{N\S}) the value of the characteristic function of the stage game γ̄^{2k−1}(S, N\S):

v^{2k−1}(S) = K(a^{*2k−1}_S, a^{*2k−1}_{N\S}),  k = 1, 2, ...
In particular, when S = N we have v^{2k−1}(S) = v^{2k−1}(N), where v^{2k−1}(N) is the maximal joint payoff of all players.

Although the players of coalition S gain more by deviating in one stage, due to the punishment by coalition N\S the coalition S cannot get more than v^{2k−1}(S), k = 1, 2, ..., in each later odd stage.

Denote by D(S, l) the joint payoff of coalition S if it deviates at stage l, and by G(S, l) the joint payoff of coalition S if it cooperates with the others at stage l.

Assumption 5.1 The players can only obtain a finite benefit from deviating at any stage l (l = 1, 2, ...), i.e.

sup_{l} |D(S, l) − G(S, l)| = M,  0 < M < +∞,  S ⊂ N     (5.5)
Suppose that at stage 2k − 1 player i receives the share α^{2k−1}_i of the maximal joint payoff v^{2k−1}(N).

Definition 5.4 We call a vector α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) an imputation of the stage game γ̄^{2k−1}(S, N\S) if the following conditions are satisfied:

α^{2k−1}_i ≥ v^{2k−1}({i}), i ∈ N     (5.6)

∑_{i=1}^{n} α^{2k−1}_i = v^{2k−1}(N)     (5.7)

Here v^{2k−1}({i}) is the value of the characteristic function for the single-element coalition S = {i} in the stage game γ̄^{2k−1}(S, N\S).

Condition (5.6) is called the individual rationality condition: each player gets at least as much as he can earn acting alone, regardless of cooperation with the other players. Condition (5.7) must also be satisfied: if ∑_{i=1}^{n} α^{2k−1}_i < v^{2k−1}(N), there exists an imputation α'^{2k−1} that gives every player i ∈ N more than the share α^{2k−1}_i; and if ∑_{i=1}^{n} α^{2k−1}_i > v^{2k−1}(N), the players in N would be sharing an unattainable payoff, which makes α^{2k−1} unrealizable. So α^{2k−1} is feasible only when condition (5.7) holds. Condition (5.7) is also called the collective rationality condition.
Definition 5.5 We say that the imputation α^{2k−1} dominates β^{2k−1} via coalition S (written α^{2k−1} ≻_S β^{2k−1}) if

α^{2k−1}_i > β^{2k−1}_i, i ∈ S     (5.8)

α^{2k−1}(S) ≤ v^{2k−1}(S)     (5.9)

Condition (5.8) states that for every member of coalition S the imputation α^{2k−1} is better than β^{2k−1}. Condition (5.9) ensures that the imputation α^{2k−1} is attainable for coalition S (namely, coalition S can actually provide the share α^{2k−1}_i for every player i ∈ S).

Definition 5.6 We say that the imputation α^{2k−1} dominates β^{2k−1} if there exists a coalition S such that α^{2k−1} ≻_S β^{2k−1}. We denote this by α^{2k−1} ≻ β^{2k−1}.

Definition 5.7 The set of imputations that are not dominated in a cooperative game is called the core of the game.
Suppose the core of the subgame Γ^{2k−1} is non-empty and denote it by C(Γ^{2k−1}). A suitable imputation should be chosen for every player according to the rules of the initial relational contract.
Theorem 5.1 A necessary and sufficient condition for an imputation α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) to belong to the core of the game is

v^{2k−1}(S) ≤ α^{2k−1}(S) = ∑_{i∈S} α^{2k−1}_i,  S ⊂ N, k = 1, 2, ...     (5.10)
Proof. First consider sufficiency. Suppose an imputation α^{2k−1} (k = 1, 2, ...) satisfies condition (5.10); we need to show that α^{2k−1} belongs to the core. If α^{2k−1} does not belong to the core, there exists an imputation β^{2k−1} with β^{2k−1} ≻_S α^{2k−1}, i.e. β^{2k−1}(S) > α^{2k−1}(S) and β^{2k−1}(S) ≤ v^{2k−1}(S), which contradicts condition (5.10).

Now we prove necessity. For any imputation α^{2k−1} that does not satisfy condition (5.10), there exists a coalition S with α^{2k−1}(S) < v^{2k−1}(S). Take

β^{2k−1}_i = α^{2k−1}_i + (v^{2k−1}(S) − α^{2k−1}(S)) / |S|,  i ∈ S

β^{2k−1}_i = (v^{2k−1}(N) − v^{2k−1}(S)) / (|N| − |S|),  i ∉ S

Here |S| denotes the number of players in coalition S. Obviously β^{2k−1}(N) = v^{2k−1}(N), β^{2k−1}_i ≥ 0, and β^{2k−1} ≻_S α^{2k−1}. So α^{2k−1} does not belong to the core. This proves the theorem.
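Condition (5.10) can be verified by enumerating all coalitions. The following Python sketch is a minimal illustration added here; the three-player characteristic function and the imputations are invented example data.

from itertools import combinations

# Characteristic function of a 3-player stage game, v(S) keyed by frozensets.
v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 2,
     frozenset({1, 2, 3}): 6}
players = [1, 2, 3]

def in_core(alpha):
    """Check condition (5.10): v(S) <= sum_{i in S} alpha_i for every coalition S."""
    if abs(sum(alpha.values()) - v[frozenset(players)]) > 1e-9:   # efficiency (5.7)
        return False
    for r in range(1, len(players) + 1):
        for S in combinations(players, r):
            if v[frozenset(S)] > sum(alpha[i] for i in S) + 1e-9:
                return False
    return True

print(in_core({1: 3, 2: 2, 3: 1}))   # True
print(in_core({1: 5, 2: 1, 3: 0}))   # False: coalition {2, 3} gets 1 < v({2,3}) = 2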
However, here we need a stricter condition: for S ⊂ N we require

∑_{i∈S} α^{2k−1}_i > v^{2k−1}(S),  S ⊂ N, k = 1, 2, ...     (5.11)

Combining this with the definition of the cooperative trajectory, we have

∑_{i∈N} α^{2k−1}_i = G^{2k−1}(ã^{2k−1}) = max_a ∑_{i∈N} g^{2k−1}_i(a) = ∑_{i∈N} g^{2k−1}_i(ã^{2k−1})     (5.12)
At stage 2k player i transfers p^{2k}_{ij} ≥ 0 to player j, so we obtain a transfer matrix P^{2k} = {p^{2k}_{ij}}. Here we have

δ p̃^{2k}_i = g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i     (5.13)

where g^{2k−1}_i(ã^{2k−1}) is the payoff of player i at stage 2k − 1 if he follows the cooperative trajectory, and α^{2k−1}_i is the part due to player i from the imputation α^{2k−1}, which can be understood simply as how much he keeps after transferring money to the other players. That is, the amount p̃^{2k}_i is fixed by g^{2k−1}_i(ã^{2k−1}) and α^{2k−1}. So the transfers p^{2k}_{ij} and p^{2k}_{ji} between every two players must produce the net transfer p̃^{2k}_i; namely, according to (5.13) the elements of the transfer matrix P^{2k} must be defined as any solution of

g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i = δ (∑_{j:j≠i} p^{2k}_{ij} − ∑_{j:j≠i} p^{2k}_{ji})
It is easy to see that such a matrix solution always exists; for example, one can take

P^{2k} =
0                 1                 ···   1                     1
1                 0                 ···   1                     1
⋮                 ⋮                       ⋮                     ⋮
1                 1                 ···   0                     1
1 − δp̃^{2k}_1     1 − δp̃^{2k}_2     ···   1 − δp̃^{2k}_{n−1}     0
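The existence of a transfer matrix solving these equations can also be illustrated constructively. The Python sketch below is an added example: it routes all transfers through the last player rather than using the particular matrix displayed above, and the stage payoffs, the imputation, and δ are assumed values.

# Build a transfer matrix P whose row sums minus column sums equal the net targets
# target_i = (g_i - alpha_i) / delta, by routing all transfers through the last player.

def transfer_matrix(g, alpha, delta):
    n = len(g)
    target = [(g[i] - alpha[i]) / delta for i in range(n)]   # required net transfer of i
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        if target[i] >= 0:
            P[i][n - 1] = target[i]          # player i sends his surplus to player n
        else:
            P[n - 1][i] = -target[i]         # player n sends the shortfall to player i
    return P

g = [5.0, 2.0, 1.0]            # g_i^(2k-1) on the cooperative trajectory (assumed)
alpha = [3.0, 3.0, 2.0]        # imputation alpha^(2k-1), same total as g (assumed)
P = transfer_matrix(g, alpha, delta=0.9)

n = len(g)
net = [sum(P[i]) - sum(P[j][i] for j in range(n)) for i in range(n)]
print(net)                      # equals [(g_i - alpha_i)/delta for each player i]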
5.5 Strong Nash Equilibrium
By definition 3.7 we already know the strong Nash equilibrium in the general case. Given an n-person non-zero-sum game Γ, a strong Nash equilibrium ũ = (ũ_1, ũ_2, ..., ũ_n) is an n-tuple of strategies for which the following condition holds:

∑_{i∈S} g_i(ũ) ≥ ∑_{i∈S} g_i(ũ_{N\S}, u_S)

Here g_i(ũ) is the payoff of player i if coalition S follows the strategies ũ = (ũ_1, ũ_2, ..., ũ_n) together with the other players, and g_i(ũ_{N\S}, u_S) is the payoff of player i if coalition S plays the strategies u_S instead while the other players keep following ũ. In our setting we still use ũ = (ũ_1, ũ_2, ..., ũ_i, ..., ũ_n) to describe the strong Nash equilibrium, where ũ_i is the strong Nash equilibrium strategy of player i. The cooperative trajectory can be derived from

ũ_i(h̃^{2k−1}) = ã^{2k−1}_i,    ũ_i(h̃^{2k}) = p̃^{2k}_i

where h̃^{2k−1} (h̃^{2k}) is the history before ã^{2k−1} (respectively p̃^{2k}) on the cooperative trajectory.

Player i takes the behavior ã^{2k−1}_i (the transfer p̃^{2k}_i) by following the strategy ũ_i. The decisions of all players in the stage then form ã^{2k−1} (p̃^{2k}), which builds up the cooperative trajectory ã = (ã^1, p̃^2, ã^3, p̃^4, ..., ã^{2k−1}, p̃^{2k}, ...) step by step. Once the players find that some coalition S has deviated from the cooperative trajectory determined by the relational contract, they start the zero-sum game Γ̄(S, N\S, K), and the punishment strategies tell them how to respond so that the payoff of coalition S is kept at no more than v^{2k−1}(S) in each subsequent odd stage.
The process can be seen more clearly in the flow chart in figure 5.4.

Now we can derive the condition under which the strategy profile {ã, G} constitutes a strong Nash equilibrium. We have the following theorem.

Theorem 5.2 The cumulative payoff of player i of the form

C^1_i = ∑_{k=1}^{∞} [δ^{2k−2} g^{2k−1}_i(ã^{2k−1}) − δ^{2k−1} p̃^{2k}_i]

can be attained in a strong Nash equilibrium if

p̃^{2k}_i = δ^{−1} [g^{2k−1}_i(ã^{2k−1}) − α^{2k−1}_i]

and

α^{2k−1} = (α^{2k−1}_1, α^{2k−1}_2, ..., α^{2k−1}_n) ∈ C(Γ^{2k−1})

Figure 5.4: Flow chart of process
5.6 Proof
The game consists of two kinds of stages, action stages and payment stages, so the situation differs depending on the stage in which coalition S chooses to deviate. First consider a deviation in a payment stage.

Suppose coalition S deviates from the strategies {ã, G} at stage 2l; from stage 2l on, the transfers from S to the other players are zero (p^{2l}_{ij} = 0, i ∈ S, j ∈ N\S). Then from the next stage until the end of the game the coalition N\S plays against S in the zero-sum game Γ̄(S, N\S, K). The coalition N\S not only responds accordingly in the later odd stages but also stops transferring money to coalition S in the subsequent even stages.

What the players of S can guarantee in each later odd stage is only v^{2k+1}(S), k ≥ l, the value of the characteristic function of the stage game γ̄^{2k+1}_S, where γ̄^{2k+1}_S is the stage game of the zero-sum game Γ̄(S, N\S, K).
Deviating in the payment stage 2l means that coalition S receives

D(S, 2l) = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ji}

without paying the expected transfer p̄ = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij} ≥ 0 to the coalition N\S. The maximum total payoff of coalition S from stage 2l on, if it deviates in the payment stage 2l, is therefore

δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
On the other hand, the payoff of the players of S at stage 2l when moving along the cooperative trajectory equals

G(S, 2l) = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ji} − ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij}

According to (5.4) and (5.13), the payoff of coalition S from stage 2l on without deviating is

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i)

where α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) ∈ C(Γ^{2k+1}).
To prove the theorem we need to find δ ∈ (0, 1) satisfying

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

So we need

δ ∑_{k=l}^{∞} δ^{2k−2l} (∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)) ≥ D(S, 2l) − G(S, 2l)

that is,

δ ∑_{k=l}^{∞} δ^{2k−2l} (∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)) ≥ p̄

Here α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) belongs to the core C(Γ^{2k+1}), so ∑_{i∈S} α^{2k+1}_i > v^{2k+1}(S), S ⊂ N, k = 1, 2, ..., according to (5.11).

Denote L = inf_k [∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)] > 0. Then it is sufficient that

δ / (1 − δ^2) ≥ p̄ / L

Because p̄ ≥ 0, L > 0, and lim_{δ→1−} δ/(1 − δ^2) = +∞, we can always find a δ close to 1 (δ → 1 − 0) that satisfies

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
Now consider the case in which coalition S deviates in the action stage 2l − 1.

Instead of ã^{2l−1}_S ⊂ ã^{2l−1}, the coalition S takes the action a'^{2l−1}_S. The coalition S then gets

∑_{i∈S} [g^{2l−1}_i(ã^{2l−1}_{N\S}, a'^{2l−1}_S)]

Denote

D(S, 2l − 1) = max_{a'^{2l−1}_S} ∑_{i∈S} [g^{2l−1}_i(ã^{2l−1}_{N\S}, a'^{2l−1}_S)]

From stage 2l on, coalition S does not pay the expected transfer p̄ = ∑_{i∈S} ∑_{j∈N\S} p^{2l}_{ij} ≥ 0 to coalition N\S. Of course, no transfers come from coalition N\S either, and from stage 2l + 1 on the coalition N\S plays against the deviating coalition S until the end. Then the total payoff of coalition S after a deviation at stage 2l − 1 does not exceed

δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
According to (5.3) and (5.13), the payoff of the players of S from stage 2l − 1 on without deviating is

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i)

where α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) ∈ C(Γ^{2k+1}).
To finish the proof we need to find δ ∈ (0, 1) satisfying

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

Since α^{2k+1} = (α^{2k+1}_1, α^{2k+1}_2, ..., α^{2k+1}_n) belongs to the core C(Γ^{2k+1}), we have ∑_{i∈S} α^{2k+1}_i > v^{2k+1}(S), S ⊂ N, k = 1, 2, ..., according to (5.11).

Denote L = inf_k [∑_{i∈S} α^{2k+1}_i − v^{2k+1}(S)] > 0. Hence it is sufficient that

δ^2 ∑_{k=l}^{∞} δ^{2k−2l} L ≥ D(S, 2l − 1) − G(S, 2l − 1)

Denote M = sup_l [D(S, 2l − 1) − G(S, 2l − 1)], 0 < M < +∞, according to (5.5). Then we need

δ^2 / (1 − δ^2) ≥ M / L

that is,

1 > δ ≥ √(M / (L + M))
So there always exists a δ ∈ (√(M/(L + M)), 1) satisfying

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)
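Before summarizing, here is a small numerical illustration (added here; the values of p̄, L and M are assumed) of the two thresholds on δ obtained above.

import math

# Thresholds on the discount factor from the proof:
#  payment-stage deviation:  delta / (1 - delta^2) >= p_bar / L
#  action-stage deviation:   delta^2 / (1 - delta^2) >= M / L, i.e. delta >= sqrt(M / (L + M))

p_bar, L, M = 2.0, 0.5, 3.0

def payment_stage_ok(delta):
    return delta / (1 - delta ** 2) >= p_bar / L

action_threshold = math.sqrt(M / (L + M))
print("action-stage threshold on delta:", action_threshold)

for delta in (0.8, 0.95, 0.99):
    print(delta, payment_stage_ok(delta), delta >= action_threshold)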
In summary, we can always find a δ close to 1 (δ → 1 − 0) that satisfies the following two inequalities:

δ^{2l−1} G(S, 2l) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−1} D(S, 2l) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

δ^{2l−2} G(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} (δ^{2k−2l} ∑_{i∈S} α^{2k+1}_i) ≥ δ^{2l−2} D(S, 2l − 1) + δ^{2l} ∑_{k=l}^{∞} δ^{2k−2l} v^{2k+1}(S)

These two inequalities mean that, whether it deviates in an action stage or in a payment stage, the coalition S cannot gain more than by following the cooperative trajectory. The theorem is proved.
6 Conclusion
In section 4 we introduced the two-person game model and derived the conditions for a stationary contract. In section 5 we improved on that model and obtained stronger results for an n-person infinite-stage game. Compared with the old approach, our model has three advantages.

First, we give the form of the strong Nash equilibrium and prove its existence. A strong Nash equilibrium is stable against deviations of any coalition.

Second, the new model does not require the existence of a Nash equilibrium in each stage game, whereas in the discussion of section 4 the assumption that the stage game has a Nash equilibrium was a strong requirement.

Third, the old model of section 4 only treats stationary contracts in the two-person case, while in our new model an n-person game is constructed and analyzed, which is clearly more general.

However, there is also a disadvantage. In the proof we do not consider deviations from the punishment itself, which would make the problem more complicated. We may study this case in further research.
7 References
[1] Abreu D. On the Theory of Infinitely Repeated Games with Discounting. Econometrica, 1988: 383-396.
[2] Ayoshin D., Tanaka T. The Core and the Dominance Core in Multichoice Multistage Games with Coalitions in a Matrix Form. Proceedings of NACA98 (International Conference on Nonlinear Analysis and Convex Analysis), 1998.
[3] Baker G., Gibbons R., Murphy K. J. Relational Contracts and the Theory of the Firm. Quarterly Journal of Economics, 2002, 117: 39-84.
[4] Kaitala V., Pohjola M., Tahvonen O. Transboundary Air Pollution between Finland and the USSR: A Dynamic Acid Rain Game. Lecture Notes in Control and Information Sciences, Vol. 157 (Proceedings of the Fourth International Symposium on Differential Games and Applications). Springer, 1991: 183-192.
[5] Karlin S. Mathematical Methods and Theory in Games, Programming and Economics. London: Pergamon Press, 1959.
[6] Christofides N. Graph Theory. Academic Press, 1978.
[7] Petrosyan L. A., Gao Hongwei. Dynamic Cooperative Games. China: Scientific Press, 2009.
[8] Petrosyan L. A. Differential Games of Pursuit. Singapore: World Scientific, 1993.
[9] Petrosyan L. A., Ayoshin D. A. Values of Dynamic Games with Partial Cooperation. Proceedings of the Institute of Mathematics and Mechanics, Ekaterinburg, 2000, 6(1-2): 160-172.
[10] Petrosyan L. A., Ayoshin D. A., Tanaka T. Construction of a Time Consistent Core in Multichoice Multistage Games. Decision Theory and Its Related Fields, RIMS Kokyuroku, 1998, 1043: 198-206.
[11] Petrosyan L. A. The Regularization of NB-scheme in Differential Games. Dynamics and Control, 1995, 5: 31-35.
[12] Petrosyan L. A., Zaccour G. Time-consistent Shapley Value Allocation of Pollution Cost Reduction. Journal of Economic Dynamics & Control, 2003, 27: 381-398.
[13] Goldlücke S., Kranz S. Renegotiation-proof Relational Contracts. Games and Economic Behavior, 2013, 80: 157-178.
[14] Strotz R. H. Myopia and Inconsistency in Dynamic Utility Maximization. Review of Economic Studies, 1955, 23: 165-180.
[15] Yeung D. W. K., Petrosyan L. A. Cooperative Stochastic Differential Games. New York: Springer-Verlag, 2006.
[16] Yeung D. W. K., Petrosyan L. A., Lee M. Dynamic Cooperation: A Paradigm on the Cutting Edge of Game Theory. Beijing: China Market Press, 2007.