Lecture #11 - people.vcu.edu
November 20, 2002
Lecture #11
Reading: Tonight we will discuss chapter
10. For next time we will discuss some
special topics in chapters 11 and 12.
Homework: For next time: Chapter 10, 1,
2, 3, 4, 6, 7, 9, 10, 12, 13, 14, 15,
As usual, I will collect the bolded problems.
To collect: Extra Handout Sheet problems
#1 to #4.
Keys: Handout Sheet for chapter 9, problems #1, #2, #3, #4.
Chapter 9: #1, #2, #5, #7 (skip Stackelberg), #9, #11, #13.
Review
Chapter 9. Oligopoly
A. Sweezy Oligopoly
B. Cournot Competition
C. Bertrand Competition
D. Contestable Markets
Chapter 10. An Introduction to Game Theory
A. Simultaneous Move One-Shot Games
Definitions: Strategy, Outcome, Payoff, Normal Form
Solutions: Dominant Strategy, Secure Strategy, Nash Equilibrium
Preview
A.(cont.) Applications of One-Shot Games
B. Infinitely Repeated Games
C. Finitely Repeated Games (we will stop here tonight)
D. Multi-Stage Games
Lecture
2. Applications of One-Shot Games. A variety of managerial problems can be insightfully
analyzed as one-shot games. Here we consider four types of applications.
a. Dilemmas.
A Pricing Game. Consider the Bertrand Pricing game for a pair of duopolists,
firm A and firm B. Suppose, however, that the sellers are confined to just two choices, a
high price, and a low price.
If both firms charge the high price, each receives profits of 10. However, if firm
A charges the high price, firm B can undercut firm A by charging the low price, and take
the entire market. In this case profits for firm B increase to 50, and profits for firm A fall
to -10. Suppose firm A faces symmetric incentives vis-a-vis firm B. Finally, if they both
charge the low price each firm earns economic profits of 0.
The normal form representation of this game is game 2.
GAME 2 (payoffs: Firm A, Firm B)

                      Firm B
             Low Price    High Price
Firm A
  Low Price     0, 0       50, -10
  High Price  -10, 50      10, 10
Analysis: Notice that (Low Price, Low Price) is the unique Nash equilibrium. This
shouldn’t be surprising, since charging the low price is a dominant strategy for both
firms.
Notice also that both players would earn more if (High Price, High Price) were
played. Formally, (High Price, High Price) Pareto dominates (Low Price, Low Price).
That the Nash equilibrium is Pareto dominated by another outcome creates a dilemma.
Given the incentives in this game, the only way for them to play (High Price,
High Price) is for them to meet and agree on this outcome. Such meetings, however, are
illegal. Moreover, even if legal, there is the problem that each player has an incentive to
cheat on any agreement.
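A game this small can be checked mechanically. As a minimal sketch (my own illustration of the definitions above, not part of the lecture), the brute-force routine below enumerates every cell of game 2 and keeps those in which neither firm gains from a unilateral deviation:

```python
from itertools import product

def pure_nash(payoffs):
    """Return the pure-strategy Nash equilibria of a two-player game.

    `payoffs` maps (row_choice, col_choice) -> (row_payoff, col_payoff).
    A cell is an equilibrium when neither player can do strictly better
    by deviating unilaterally.
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_r, u_c = payoffs[(r, c)]
        if (all(payoffs[(r2, c)][0] <= u_r for r2 in rows)
                and all(payoffs[(r, c2)][1] <= u_c for c2 in cols)):
            equilibria.append((r, c))
    return equilibria

# Game 2: rows are firm A's prices, columns firm B's; payoffs are (A, B).
game2 = {
    ("Low", "Low"): (0, 0),     ("Low", "High"): (50, -10),
    ("High", "Low"): (-10, 50), ("High", "High"): (10, 10),
}
print(pure_nash(game2))  # [('Low', 'Low')]
```

The same routine applied to the coordination game below would report both of its equilibria, and applied to the monitoring game it would return an empty list, confirming that no pure-strategy equilibrium exists there.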
Advertising. Related incentives arise in other circumstances. Consider, for example the
decision to advertise. Suppose the relevant industry produces a known product in a
saturated market, so the only consequences of advertising are to take market share away
from others. (Examples include cigarettes and breakfast cereals): All firms would prefer
that no one advertise, thus increasing profits for all by saving advertising expenditures.
But if no one advertises, anyone can increase profits substantially by defecting,
advertising, and taking the entire market.
Product Quality. Similarly, oligopolists would prefer that everyone produce inexpensive,
low quality products. However, anyone who produces high quality would take the whole
market. So they all end up producing high quality.
b. Coordination Decisions. It’s not always the case that firms have competing interests,
as in the previous example. Important strategic situations arise even when firms have
similar interests.
For example, consider the competing protocols for high definition TV developed
by the Japanese and in the U.S. Suppose that the U.S. protocol is cheaper for broadcasters
to install, while the Japanese protocol is cheaper to place in TV sets. But in any case
HDTV manufacturers and broadcasters make profits only if they both use the same
protocol. In strategic form, this situation is represented as game 3.
GAME 3 (payoffs: Broadcasters, TV Mfgrs.)

                                  TV Mfgrs.
                        U.S. Protocol   Japanese Protocol
Broadcasters
  U.S. Protocol            100, 80            0, 0
  Japanese Protocol          0, 0            80, 100
Observe that there are two Nash equilibria to this game. In fact, broadcasters and
manufacturers have differing preferences as to which is selected. This is an example of a
coordination game.
Possibly this problem can be solved by meetings or by the government imposing a
standard. The more important point is that this is a problem of largely shared
interests.
c. Monitoring Employees. A third interesting type of problem arises in the principal-agent
context discussed in chapter 6. Consider the problem of monitoring, a potentially
effective (if negative) method of motivating effort.
Assume that employees would prefer to “shirk” if not caught, but will work hard
if monitored, in order to keep their jobs (working hard when no monitor appears yields a
benefit of -1 unit). The manager, on the other hand, finds monitoring costly, and would
like to monitor only in the case that it was necessary. The incentives in this situation can
be summarized as game 4.
GAME 4 (payoffs: Manager, Worker)

                         Worker
                    Work        Shirk
Manager
  Monitor          -1, 1        1, -1
  Don't Monitor     1, -1      -1, 1
Benefits are denominated in terms of single “unit” benefits and losses. The manager loses
1 unit if (s)he monitors and finds the employee working, or if (s)he fails to monitor and
later discovers that the employee was shirking (i.e., on the principal diagonal). Positive
unit payoffs are realized in the cases that (a) employees work without monitoring, and (b)
shirking is detected via monitoring (i.e., off the principal diagonal).
The employee has the reverse incentives. (S)he receives a one unit payoff if (s)he
shirks and gets away with it, or if (s)he works when the monitor appears (i.e., on the
principal diagonal). The worker loses a unit if caught shirking, or if working when the
monitor doesn’t come by (i.e., off the principal diagonal).
Consideration of the normal form game reveals that there is no pure strategy
equilibrium. In any outcome, one of the players can improve their earnings through a
deviation. Rather, the equilibrium is in mixed strategies. We won’t formally develop the
mixing equilibrium in this case, but it involves playing each outcome probabilistically.
The lesson from the analysis of this game is that to use monitoring to improve
performance in this type of circumstance, the monitoring must be probabilistic. (It’s a
little like a multiple choice test. If the professor selected a particular response with
inordinate frequency, students could guess the correct answer without understanding the
material.)
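Although the notes do not develop the mixing equilibrium, it follows from an indifference condition: the manager must monitor just often enough that the worker earns the same from working and from shirking. A sketch (my own illustration, with game 4's worker payoffs hard-coded):

```python
from fractions import Fraction

# Game 4 worker payoffs: +1 on the principal diagonal, -1 off it.
u_work_if_monitor, u_work_if_not = 1, -1
u_shirk_if_monitor, u_shirk_if_not = -1, 1

# Let p = Pr(manager monitors).  Indifference requires
#   p*u_work_if_monitor + (1-p)*u_work_if_not
#     == p*u_shirk_if_monitor + (1-p)*u_shirk_if_not.
# Rearranged into the linear form a*p = b:
a = (u_work_if_monitor - u_work_if_not) - (u_shirk_if_monitor - u_shirk_if_not)
b = u_shirk_if_not - u_work_if_not
p = Fraction(b, a)
print(p)  # 1/2: monitor half the time, at random
```

By the symmetry of the payoffs, the same calculation for the worker gives a work probability of 1/2 as well, which is the probabilistic monitoring the paragraph above describes.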
d. Nash Bargaining. A final application of one-shot games involves
bargaining over a common prize. Suppose the prize is $100 in surplus, and union
members and management are bargaining over the distribution of the prize. For
simplicity, restrict the number of choices to three: $0, $50 and $100. In the game both
management and the union write a number on a piece of paper. If the amounts written
sum to no more than $100, each player gets the amount they wrote. If the amounts
written sum to more than $100, both receive nothing.
Incentives in this situation are summarized in game 5.
GAME 5 (payoffs: Management, Union)

                        Union
             $0         $50        $100
Management
  $0        $0, $0     $0, $50    $0, $100
  $50       $50, $0    $50, $50   $0, $0
  $100      $100, $0   $0, $0     $0, $0
Consideration of payoffs reveals that there are three equilibria in pure strategies: ($0,
$100), ($50, $50) and ($100, 0). Given multiple equilibria, there are coordination
problems. Which should you select? First, you can probably rule out $0, since a play of
$0 is weakly dominated by $50 or $100. (That is, you would never do worse by picking
$100 or $50, than by picking $0, and you might do better.) Second, you probably could
rule out $100, since realizing a payoff would require that your opponent play a dominated
strategy.
$50 has two persuasive advantages. First, you have two chances of realizing a
payoff. Second, it is a natural focal point.
Application question: Suppose $1 is to be divided between two players, under the
conditions described above, but with offer possibilities restricted to penny
increments. Is there an equilibrium to this game? What is it?
C. Infinitely Repeated Games. As a second class of games, we switch from the
assumption that a game is played only once, to the other extreme: Suppose that a game is
repeated an infinite number of times.
Formally, an infinitely repeated game is a game that is played over and over again
forever and in which players receive payoffs during each play of the game.
As we will see, the principal insight associated with infinite repetition is that the unique
equilibrium in the one-shot dilemma games can be improved upon. In fact, under
appropriate conditions, plays of the Pareto-dominant outcome can be part of an
equilibrium strategy. The most immediate application of this result is understanding how
firms can successfully collude, given the incentives to cheat that arise in the one-shot
game.
1. Theory. Since the game is repeated infinitely, strategy choices will involve weighing
earnings realized at different points in time. Due to the time value of money, this
requires a brief review of discounting.
a. Review of present value. Recall that the present value of profits πt earned in any
period t in the future is πt divided by one plus the interest rate raised to the power t.
Thus, the value of a stream of profits earned in each of T time periods is:

P.V.Firm = π0/(1+i)^0 + π1/(1+i)^1 + π2/(1+i)^2 + ... + πT/(1+i)^T = Σ(t=0 to T) πt/(1+i)^t

If earnings are the same each period (πt = π for all t), and if the horizon is infinite (T =
∞), the above expression is the sum of an infinite series. As long as i > 0, this sum may be
simplified as

P.V.Firm = π(1+i)/i
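The closed form can be sanity-checked numerically; a quick sketch (the profit and interest figures are illustrative, not from the text):

```python
# Present value of a constant profit stream pi received in periods 0, 1, 2, ...
pi_, i = 10.0, 0.25

# Truncating the infinite series far out approximates the exact sum.
truncated = sum(pi_ / (1 + i) ** t for t in range(500))
closed_form = pi_ * (1 + i) / i  # the simplified formula above

print(closed_form)                          # 50.0
print(abs(truncated - closed_form) < 1e-6)  # True
```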
b. Supporting Collusion with Trigger Strategies. Infinite repetition creates additional
equilibria, because players can support desirable outcomes with threats to punish cheating
by playing the Nash equilibrium for the one-shot game (sometimes called the stage
game). These contingent strategy, or trigger strategy outcomes are most easily explained
in terms of an example. Consider again the simultaneous-move Bertrand pricing game
shown as game 2.
GAME 2 (payoffs: Firm A, Firm B)

                      Firm B
             Low Price    High Price
Firm A
  Low Price     0, 0       50, -10
  High Price  -10, 50      10, 10
Suppose this game is repeated infinitely. Now, consider the following trigger strategy.
Suppose each player does the following: In period 1, play the High Price. Then continue
playing High Price until the opponent defects and plays Low Price. Following the
defection, play Low Price forever.
Consider now payoffs in the repeated game. If player A cheats in the first period, earnings
are $50 this period, plus $0 (in the Bertrand equilibrium) forever after, or

PV(Cheat, Firm A) = $50 + 0 + 0 + 0 + ... = $50

On the other hand, if player A cooperates, the player earns $10 each period forever, or

PV(Coop, Firm A) = $10 + $10/(1+i) + $10/(1+i)^2 + ... = $10(1+i)/i

Cooperation will be an equilibrium as long as

PV(Cheat, Firm A) = $50 ≤ $10(1+i)/i = PV(Coop, Firm A)

Solving, it follows that cooperation is optimal as long as

i ≤ 1/4

or 25%.
This result is general, and may be summarized as follows:

Consider any one-shot dilemma game. Define πCoop, πCheat, and πN as stage-game
earnings for the Pareto-dominant, defection, and stage-game Nash outcomes, respectively.
The Pareto-dominant outcome is an equilibrium for the infinitely repeated game,
supported by the trigger strategy of responding to any defection with a play of the Nash
outcome forever, if

πCheat - πCoop ≤ (1/i)(πCoop - πN)

This result should be intuitive. Cooperation is an equilibrium as long as the gains from
cheating today (the l.h.s.) are less than the discounted benefits of continued cooperation
in the future (the r.h.s.).
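The inequality rearranges to give the critical interest rate directly: cooperation survives whenever i ≤ (πCoop − πN)/(πCheat − πCoop). A small sketch (the function name is mine), checked against game 2:

```python
def critical_rate(cheat, coop, nash):
    """Largest interest rate at which the trigger strategy sustains
    cooperation: rearranging cheat - coop <= (1/i)*(coop - nash)."""
    return (coop - nash) / (cheat - coop)

# Game 2: cheating pays 50 once, cooperation pays 10, stage-game Nash pays 0.
print(critical_rate(cheat=50, coop=10, nash=0))  # 0.25
```

This reproduces the 25% cutoff derived above.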
Example: Consider game 2 once again. Suppose that the interest rate is 40%.
- What are firm A’s profits from cheating on a collusive agreement?
- What are firm A’s profits if it does not cheat on the agreement?
- Does an equilibrium result where each firm charges a high price each period?
2. Factors Affecting Collusion in Pricing Games. Many business relationships occur
indefinitely into the future. The above analysis suggests that, in stark contrast to the
predictions of the one-shot game, collusion should, in reality, be straightforward in these
games. This is not necessarily the case, however, as a number of factors impede the
establishment and maintenance of a cartel.
a) Number of firms. The preceding analysis considered the case of just two firms. The
costs of monitoring increase as the number of firms increases (each seller has to
independently monitor the actions of each other seller).
b) Firm size. Monitoring problems also increase as the number of sales outlets owned or
used by a firm increases.
c) Product/cost heterogeneities. Forming and maintaining a collusive arrangement also
becomes problematic as the number of dimensions that must be agreed upon and
monitored increases. Thus, firms with multiple products, or firms producing under
different cost conditions, will find trigger-strategy equilibria more difficult to implement.
(Recall, collusion in the form of explicit discussions is illegal.)
d) Punishment mechanism. As the number of players increases, not only is it more
difficult to detect defections, but the punishment of defection is also more difficult.
Consider, for example, a market with four sellers, firm A, firm B, firm C and firm D.
Suppose that A suspects B of cheating on an agreement to raise prices. “Punishment” by
A is a lowering of the market price. Not only does the punishment of B hurt C and D as
well, but C and D may interpret the actions of A as a defection!
In some instances these problems can be surmounted, if firms have specialized
areas of trade or product niches. In the airline industry, for example, each firm operates
out of a “hub.” Competitors could punish defections by lowering the price of travel in and
out of the defector’s hub.
3. Infinitely Repeated Games and Product Quality. The payoff-improving characteristics
of repetition arise in contexts other than collusion between firms. Consider the problem
of producing and selling a good with quality characteristics that are indeterminate prior to
purchase.
Incentives for the stage game reveal a serious incentive problem: firms would
make the most money selling low-quality products. Consumers, however, wouldn’t
purchase the products if they knew that they were of low quality. And producing high
quality can be risky for the firms: the firms would lose money from producing high-quality
products that didn’t sell.
These incentives are summarized as game 6.
GAME 6 (payoffs: Consumer, Firm)

                      Firm
               Low-Quality   High-Quality
Consumer
  Don't buy       0, 0          0, -10
  Buy           -10, 10         1, 1
The firm loses nothing if low-quality products go unsold, and earns 10 if low quality
products sell. However, high quality products generate a profit of only 1 if sold, and
generate a loss of 10 if they don’t sell.
Consumers lose nothing from not buying, but lose 10 from purchasing a low-quality unit.
Consumers also gain 1 from purchasing a unit that turns out to be high quality.
For the stage game, the unique Nash equilibrium is (Don’t buy, Low-quality product).
However, if the game is repeated, consumers can induce high quality by “penalizing” the
firm for producing low quality units, by playing the stage-game Nash outcome (don’t
buy). Given a sufficiently low discount rate, this is an equilibrium.
Question: What rate of interest is necessary to sustain (Buy, High-quality product) as an
equilibrium?
Solution: The game is not symmetric, and only the seller has an incentive to defect. So
we restrict attention to the seller.
The gains from defection are 10 + 0 + 0 + ... = 10
The gains from cooperation are 1 + 1/(1+i) + 1/(1+i)^2 + ... = (1+i)/i
Thus,

PV(LowQ, Firm) = $10 ≤ $(1+i)/i = PV(HighQ, Firm)

Solving, i ≤ 1/9. Note that you could solve this alternatively by using the formula
presented above. Stage-game gains from defection are 10; stage-game gains from
cooperation are 1; stage-game gains in the stage-game N.E. are 0. Thus

πLowQ - πHighQ ≤ (1/i)(πHighQ - πN)
10 - 1 ≤ (1/i)(1 - 0),

which again gives i ≤ 1/9.
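The threshold can also be confirmed by comparing the two payoff streams directly on either side of i = 1/9 (a numerical check of my own, not from the notes):

```python
def pv(first, later, i, horizon=5000):
    """PV of earning `first` in period 0 and `later` every period after."""
    return first + sum(later / (1 + i) ** t for t in range(1, horizon))

# Game 6 seller: defect earns 10 once (then 0); cooperate earns 1 forever.
for i in (0.10, 0.125):          # one rate below 1/9, one above
    sustained = pv(1, 1, i) >= pv(10, 0, i)
    print(i, sustained)
# 0.1 True    (i < 1/9: high quality is sustainable)
# 0.125 False (i > 1/9: the firm prefers to cheat)
```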
The practical insight of this analysis is that it does not pay to deliver low quality if you
are a “going concern.” In fact, the warranty (and replacement) of occasionally defective
products is hardly “good will.” Rather, it is essential to staying in business!
D. Finitely Repeated Games. So far we have confined attention to games of two extreme
varieties: those that are played once, and those that are repeated infinitely. We now
consider the intermediate case of games that are repeated, but only a finite number of
times. With finite repetition, an important distinction arises that pertains to the game’s
endpoint: the endpoint may be known, or unknown.
1. Repeated Games with an Uncertain Final Period. In some instances, the terminal point
is not known. For example, the government may ban cigarette advertising in the
relatively near term, or a new innovation is likely to make obsolete an existing product,
etc. In these cases, analysis of “cooperative” trigger-strategy equilibria is very similar to
that in an infinitely repeated game, with an expression for the probability of termination
substituting for the expression generated above for the interest rate. To see this, consider
again game 2.
GAME 2 (payoffs: Firm A, Firm B)

                      Firm B
             Low Price    High Price
Firm A
  Low Price     0, 0       50, -10
  High Price  -10, 50      10, 10
This time, however, suppose there is a probability θ that the market will disappear
prior to the next period (say, due to a new generation of the product). Consider
conditions under which the (High Price, High Price) outcome can be supported by
reversion to the (Low Price, Low Price) Nash equilibrium for the stage game.

The gains from defection are 50 + 0 + 0 + ... = 50
The gains from cooperation are 10 + (1-θ)10 + (1-θ)^2·10 + ... = 10/θ

Thus cooperation can be sustained as an equilibrium for the repeated game if

10/θ ≥ 50, or θ ≤ 1/5.

Notice that this is exactly the same rule as was generated before, except that
1/θ takes the place of (1+i)/i; players discount the future not because of the interest rate,
but because of the uncertainty that there is a future.
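The cutoff for θ follows the same rearrangement as the interest-rate case. A quick check with game 2's numbers (the helper name is mine):

```python
def expected_coop_value(per_period, theta, horizon=5000):
    """Expected value of cooperating when the market survives each
    period with probability (1 - theta)."""
    return sum(per_period * (1 - theta) ** t for t in range(horizon))

theta_max = 10 / 50  # cooperation payoff / cheating payoff
print(theta_max)  # 0.2

# At the cutoff, cooperating is worth exactly the one-shot gain to cheating.
print(round(expected_coop_value(10, 0.2), 6))  # 50.0
```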
Example: Suppose that two cigarette manufacturers repeatedly play the following
simultaneous-move billboard advertising game. If both advertise, each earns profits of
$0. If neither advertises, each earns profits of $10 million. If one advertises and the
other does not, the firm that advertises earns $20 million and the other firm loses $1
million. If there is a 10% chance that the government will ban cigarette sales in any
given year, can the firms “collude” by agreeing not to advertise? (Work in class)
2. Repeated Games with a Known Final Period. Now suppose that the game is repeated
some known finite number of times. Consider again the pricing game (Game 2), but
assume that the game is played just two times. Is it possible that a “cooperative”
equilibrium involving the (High Price, High Price) outcome is sustainable? The answer
is no. To see this, consider the second period. In this final stage, the game is essentially
a one-shot game, and the only equilibrium is the Nash equilibrium for the stage game.
Now, given the outcome of the last stage, consider behavior in the first stage.
Again the only equilibrium outcome is the single period outcome for the one-shot stage,
since there can be no opportunity to punish defections in the final period.
This same reasoning applies to any game that is finitely repeated: Even if the
game is repeated 1000 times, it will “unravel” from the last period backward as long as
there is a certain terminal period.
Applications of the End-of-Period Problem. This end-of-period problem has a
number of applications to managerial decision-making. We review two.
a. Resignations and Quits. Incentives for workers to shirk are usually overcome when a
game is indefinitely repeated (the costs of being caught shirking exceed any benefits of
shirking). However, consider what happens when an employee decides to quit. In this
circumstance, the costs of being caught shirking are much lower than before, and shirking
might be expected.
What could a manager do to circumvent this problem?
i) Fire immediately all workers who indicate an intention to resign (Note, this
creates a problem for management: Workers who desire to quit will just not show up one
day!)
ii) Try to emphasize the long-term elements of employment to the worker (E.g.,
stress how important a good recommendation is to getting a good job, and indicate your
willingness to write a letter, etc.) This is a much better solution.
b. “Snake Oil” Salesman. In TV Westerns this vendor periodically appears, selling an
elixir with miraculous benefits. The townspeople are invariably taken in by the claims,
and purchase crates of useless (and possibly harmful) medicine.
Notice that this problem arises because the consumers have no “punishment”
option. The itinerant salesman has no reputation to lose, and therefore the townspeople
can do little more than run all of them out of town. Notice that a legitimate permanent
pharmacist would qualify claims made for medicines and their appropriate uses, since (s)he
is interested in continued business.
We see problems like this in our world. Ever bought a Cartier watch from a
sidewalk vendor?
E. Multistage Games. These are an alternative type of finitely repeated game. They differ
from the others we have analyzed to date in that the timing of moves is critically
important.
1. Theory
These games are best analyzed in an alternative structure known as an extensive form
representation.
Extensive-form game: A representation of a game that summarizes the players, the
information available to them at each stage, the strategies available to them, the sequence
of moves, and the payoffs resulting from alternative strategies.
Consider the following example, with payoffs listed as (A, B):

A plays Up:
    B plays Up   → (10, 15)
    B plays Down → (5, 5)
A plays Down:
    B plays Up   → (0, 0)
    B plays Down → (6, 20)
The points at which A and B move (drawn as circles in the diagram) are called decision
nodes. At each decision node, a player must make a decision.
As in the simultaneous move game, payoffs depend on the interaction of choices
by the two players. For example, the payoff to a down selection by B is either 5 or 20,
depending on whether A plays up or down.
The critical difference of the sequential game from the simultaneous move game
is that player A must make a decision prior to a choice by B. Thus, B’s choices are
conditional on A’s initial choice. The reverse is not true for A (although the payoffs for
A’s actions depend on B’s choices).
For example, B’s strategy might be: choose Down if A chooses Up, and choose Down if
A chooses Down.
In light of this strategy, A has the choice between earning 5 by playing Up and 6 by
playing Down. A plays Down, and the resulting outcome (6, 20) is a Nash equilibrium,
since neither player would unilaterally deviate from their strategy.
There is another Nash equilibrium to this game.
B: Play UP if A plays UP, Play DOWN if A plays DOWN
A: Play UP.
Although these are both Nash equilibria, they are distinguishable. Consider the first
equilibrium in a bit more detail. Suppose A decided to deviate and play Up. Would it be
rational for B to play the announced “DOWN” strategy? No. The threat to play DOWN
in response to A’s UP is not credible.
The technical distinction between the two equilibria is that the second equilibrium is
subgame perfect.
Subgame Perfect Equilibrium: A condition describing a set of strategies that constitutes a
Nash equilibrium and allows no player to improve his own payoff at any stage of the
game by changing strategies.
Thus, a subgame perfect equilibrium is one that only involves credible threats.
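Subgame perfection is exactly what backward induction computes: solve every last-mover decision first, then let earlier movers optimize against those solved continuations. As a sketch (the tree encoding and function are my own illustration), applied to the extensive-form example above:

```python
def backward_induction(node):
    """Solve a perfect-information game tree by backward induction.

    A leaf is a payoff tuple; an internal node is (player, {action: child}).
    Returns (payoffs reached under sequentially rational play, action
    chosen at this node, or None at a leaf).
    """
    if not isinstance(node[1], dict):   # a payoff tuple: nothing to decide
        return node, None
    player, children = node
    # The mover picks the action whose *solved* continuation pays them most;
    # this is why only credible threats survive.
    best = max(children,
               key=lambda a: backward_induction(children[a])[0][player])
    return backward_induction(children[best])[0], best

# The tree above: A (player 0) moves first, then B (player 1); payoffs (A, B).
tree = (0, {
    "Up":   (1, {"Up": (10, 15), "Down": (5, 5)}),
    "Down": (1, {"Up": (0, 0),   "Down": (6, 20)}),
})
print(backward_induction(tree))  # ((10, 15), 'Up')
```

The routine picks out the second Nash equilibrium, the subgame perfect one: B's threat to play Down after A's Up never survives backward induction, because once A has played Up, Down pays B only 5 against 15.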
Subgame perfection is an important solution concept, and it has many naturally occurring
analogues. Consider the example in the text:
A father tells his teenage daughter that if she is not home by midnight, he’ll burn
the house to the ground (and her belongings as well). An equilibrium where the daughter
returns by midnight obviously arises. But is it subgame perfect? Will the father really
burn the house down? Probably not!
2. Applications of Multistage Games. Games where the sequence of moves is important
(and consequently where the credibility of threats should be considered) have a number
of useful applications. We consider three.
a. The entry game. Consider the case of a firm considering entering a market that is
presently serviced by a monopolist. The monopolist is currently earning profits of $10
million per year. After the entry decision, the incumbent decides whether to compete
aggressively (HARD) or accommodate the entrant (SOFT). In the case that entry is met
aggressively, the incumbent earns $1 million, while the entrant loses $1 million. In the
case of accommodation, both earn $5 million. If the entrant stays out, the monopolist
continues to earn $10 million.
The extensive form structure of this game is as follows (A is the potential entrant;
payoffs are (A, B) in $ millions):

A plays In:
    B plays Hard → (-1, 1)
    B plays Soft → (5, 5)
A plays Out → (0, 10)
What are the NE for the game?
a) B could threaten to play HARD. In this case A would stay out.
b) B could play SOFT. In this case entry would be an equilibrium.
Notice that the threat implied in equilibrium (a) is not credible. Thus, only the latter
equilibrium is subgame perfect. (But consider the optimal choice by A if this game is
repeated!)
b. Innovation. Consider a second industrial situation. This time, you are trying to decide
whether or not to introduce a new product. Suppose both you and a rival are currently
earning profits of $1 million per year. If you introduce the product, the rival could clone
your development, allowing the rival to earn $20 million. You would lose $5 million,
since you incurred the product development costs. Finally, if the rival doesn’t clone, you
earn $100 million, and the rival earns nothing. Should you introduce the new product?
The extensive form (payoffs are (A, B) in $ millions):

A plays Introduce:
    B plays Clone       → (-5, 20)
    B plays Don't Clone → (100, 0)
A plays Don't introduce → (1, 1)
Notice that in this case there are again two Nash equilibria:
NE 1: B clones if A introduces; A doesn’t introduce. Equilibrium payoffs are thus (1, 1).
NE 2: B doesn’t clone if A introduces; A introduces. Equilibrium payoffs are thus (100, 0).
Notice, however, that in this case B’s “don’t clone” response is not credible. Only NE 1 is
subgame perfect.
Notice also that this circumstance is fairly typical of inventions: unless developers can
recover their development costs, new products will not be introduced. This is the reason
that patents are granted.
c. Sequential Bargaining. As a final application consider negotiations between labor and
management over the distribution of $100. For simplicity, restrict the possible divisions
to 3 (M, L): (99,1), (50,50) or (1, 99). Suppose that management gets to move first, with
the union responding. Suppose also that the surplus $100 vanishes if an agreement isn’t
reached after one sequence of moves, so in the case of a rejection of an offer by the
union, both parties get nothing.
The extensive form is illustrated as follows (offers are the union’s share; payoffs are
(Management, Labor)):

M offers $1:
    U Accepts → ($99, $1)
    U Rejects → ($0, $0)
M offers $50:
    U Accepts → ($50, $50)
    U Rejects → ($0, $0)
M offers $99:
    U Accepts → ($1, $99)
    U Rejects → ($0, $0)
Now, consider the position of the union. They may argue that they should receive
$99, and back this up with a threat to reject any other offer. In this case the strategy
pair ($99 offered, Accept) is a NE.
Notice, however, that this is not subgame perfect. In fact, the only subgame perfect
equilibrium is for management to offer $1: the union accepts, since $1 is better than the
$0 it receives from rejecting. There is a big advantage in a sequential move game to
having the first move (consider, for example, what would happen if the sequence of
moves was reversed).