Game Theory of Human Cooperation and Morality
ECON 113
Unit 1
Introduction
- Human life is competitive
- Job market, romance, toys, etc.
- Leads to social conflict
- Darwin provided a framework for competition
- Individuals inherit different traits, and there are limited resources
- Some traits are better for capturing resources - leads to evolution through natural selection
- Human life is also cooperative
- People work together, shop together, line up together, drive together, etc.
- Humans are more cooperative than chimpanzees, who fight over food and kill strangers
- Chimpanzees are the closest genetic relatives to humans; common ancestor was 6-7 million years ago
- Cooperation is puzzling
- Helping others can give others advantages and reduce your own chance of success
- Despite there being many opportunities for people to be selfish, they don’t take them
- Seems difficult to explain
- Morality refers to the standards used to influence and judge decisions
- Acts as a duty that guards against selfishness
- Constrains individual decision making
- Can have cooperation with and without morality
- Morality includes giving to others at a direct cost to yourself
- On the other hand, cooperation is working together with others and providing resources to gain the return from everyone’s work
- Humans are the only (observed) species with morality
- Other species have forms of cooperation; ant, chimpanzees
- Human parents teach children morals, allowed for by speech and advanced cognitive abilities
- Utilitarian consequentialism: The right action is the one that produces the most good for the most people
- Deontological ethics: Actions should conform with behaviors that serve as universal laws
- Darwin showed that all humans blush - suggests that shame is universal and morality came about long ago in humankind’s evolutionary past
- Morality seems to be contradictory to natural selection
- Selfishness increases your chance of survival, so selfish people should have an advantage
- The puzzle of morality is that humans are moral despite natural selection seemingly going against morality
- Millions of years ago, we used to live in a dominance hierarchy
- Dominance hierarchy: Individuals are ranked highest to lowest; higher-ranked people use their strength to control access to food and mating opportunities, and lower-ranked people must wait
- Hinders cooperation since long-term research and production are disincentivized due to bullies being able to take away from lower-ranked individuals
- Homo Sapiens lived in a reversed dominance hierarchy
- Bullies were policed and suppressed by group members (without formal police and courts)
- Homo Sapiens were more cooperative than modern apes; food sharing, hunting game, supporting children and the injured
- New cooperation is possible due to protection; can hunt more nutritious game and males are more likely to invest in their children due to more monogamy
- Other Homo species had the reversed dominance hierarchy
- Hominin species evolved to become bipedal, less conflict (for mates), stone tools, weapons
Course Goals
- Study of proximate causes and ultimate causes
- Proximate causes are immediately responsible for some behavior or event
- Individual level, e.g. an individual's preferences and beliefs
- Ultimate causes are deeper forces that explain proximate causes
- Population level, result of evolutionary processes
- Is it instrumentally rational to be cooperative?
- Instrumental: Acting intentionally to achieve certain outcomes
- Only outcomes matter
- Why are humans so cooperative compared to our closest cousin species?
- Two key proximate forces: Cooperative dispositions and morality and advanced cognitive abilities
- Humans are able to predict others’ actions, reflect on their decisions, and discuss morals
- Leads to suppression of bullies in hunter-gatherer groups and law in modern times
- How did human cooperation and morality evolve?
- Evolutionary pressures led to changes along the hominin line
- Advances in cooperation and morality allowed for selection of cognitive ability and morality
- Game theory provides tools to study cooperation and morality
- Classical game theory explains proximate causes, evolutionary game theory explains ultimate causes
- Will study rational deliberation, reciprocity in repreated interaction, assortative matching, in-group favoritism, group selection
- Proximate question: Why are humans so cooperative?
- Ultimate question: Why did humans become so cooperative?
10 Precepts
- Humans manifest unique and puzzling forms of cooperation and morality
- Cooperation and morality can be studied using game theory
- Selfish motivations generate social dilemmas in which individual actions undermine social welfare
- Homo Economicus can achieve self-enforcing cooperation using rewards and threats with a sufficient chance of future interaction
- Homo Economicus is the representation of a selfish, rational human
- Self-enforcing cooperation in repeated interaction among Homo Economicus extends to settings with indirect reciprocity, exclusion, and imperfect monitoring
- Human behavior is better represented by Homo Normist than Homo Economicus because it balances conditional rule following and selfishness
- Homo Normist is a conditional norm follower; they will follow rules if others will
- Norm following cannot be explained by consequentialism
- Norm followers must care about rules, not just outcomes
- Evolution can select for norm-following preferences and conditionally-cooperative norms
- Positive assortative matching generates a strong evolutionary advantage for cooperators
- Evolutionary selection favors norms of in-group favoritism and highly-cooperative groups in group competition
Games and Notation
- Basic game: Prisoner’s Dilemma
- Highlights Puzzle of Cooperation due to players being disincentivized to cooperate; Nash equilibrium is Pareto inefficient
- Games are strategic situations: multi-person, interactive-decision setting where one individual’s actions affect another’s well-being
- Normal (Strategic) Form: Describes a game using three things
- A set of individuals (players)
- A set of actions (strategies) for each player
- Player’s preference over outcomes resulting from the choices
- A game matrix captures all three parts
- Limited in scope; impossible to represent games with infinite choices
- The Nash equilibrium occurs when all actors have a best response in the same cell of the game matrix
- Normal-form games are one-shot games where players choose their strategy and follow through with it
- Notation for a normal form game
- Set of players: $I = \{1, 2, \ldots, n\}$
- Player $i$ chooses a strategy $s_i$ from a set of strategies $S_i$
- Each player $i$ has a utility $u_i(s)\in \mathbb{R}$ for each ${s = (s_1, s_2, \ldots, s_n)}$ from ${S=S_1 \times \ldots \times S_n}$
- Shorthand for $s$: $s = (s_i, s_{-i})$, where $-i$ represents all players other than $i$
- Game theory assumes that individuals act rationally
- Well-defined goals, well-defined utility function
- Rational agency is a methodological assumption used to study human behavior
- There are rational preferences (consistent choices) and rational beliefs (evidence-based beliefs)
- Behavior can be instrumental or non-instrumental, selfish or unselfish
- Homo Economicus is the main model for agents; they are instrumental and selfish, or intentional, consequential, and selfish
- Some actors might not be Homo Economicus, but they will always be assumed to be rational
- Rational actors maximize utility, where utility is defined as some individual material payoff
- Hard to achieve cooperation with Homo Economicus; if cooperation is achieved, then it can be achieved in many other situations
- Pure strategies are strategies that are selected at the beginning and followed without deviation; nonrandom
- Mixed strategies are strategies in which players “mix” over pure strategies, choosing pure strategies with random probabilities
- Denoted as $\sigma_i$; if $S_i = \{s_{i1}, \ldots, s_{im}\}$, then $\sigma_i = (p_{i1},\ldots,p_{im})$, where $p_{ij}$ is the probability that player $i$ chooses strategy $s_{ij}$
- $\sum_j p_{ij} = 1$
- Pure strategies can be thought of as a special case of mixed strategies where all probabilities (except for one) are set to 0
- Von Neumann-Morgenstern preferences state that rational players maximize their expected utility
- The best mixed strategy maximizes the expected utility; assumes that players randomize independently of one another
- Expected utilities require cardinal utilities
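Expected utility under a mixed-strategy profile follows directly from the probabilities over outcomes. A minimal sketch, using illustrative Matching Pennies payoffs (the values are assumptions, not taken from the notes):

```python
# Expected utility of a mixed-strategy profile in a 2x2 game.
# Payoffs below are illustrative Matching Pennies values for the row player.
U1 = [[1, -1],
      [-1, 1]]

def expected_utility(payoffs, p, q):
    """Row player's expected utility when the row player plays row 0 with
    probability p and the column player plays column 0 with probability q."""
    probs = [p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)]
    vals = [payoffs[0][0], payoffs[0][1], payoffs[1][0], payoffs[1][1]]
    return sum(pr * v for pr, v in zip(probs, vals))

# At the mixed NE of Matching Pennies (p = q = 1/2) the expected utility is 0.
print(expected_utility(U1, 0.5, 0.5))  # → 0.0
```

Note the cardinality requirement: rescaling the payoffs nonlinearly would change which mixed strategy maximizes this sum.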
- Solutions in games are special strategies
- Predicts what will be played
- Empirically valid; played in reality
- Mathematically salient; has special properties
- Defined by a solution concept that has mathematical criteria
- There are different solution concepts used for different games
- The best response (BR) for individual $i$ is defined as $s_i^*$ such that $u_i(s_i^*, s_{-i})\geq u_i(s_i', s_{-i})\ \forall s_i'\in S_i \backslash s_i^*$
- Solutions in classical game theory assume that individuals play best responses
- BRs can vary based on other players’ strategies
- A strategy strictly dominates another strategy if the utility for the dominating strategy is always better than the utility for the dominated strategy
- A strictly dominant strategy is a strategy $s_i$ such that $u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})$ for all $s_i' \neq s_i$ and all $s_{-i}$
- A strictly dominated strategy is a strategy $s_i$ for which some $s_i'$ satisfies $u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})$ for all $s_{-i}$
- A dominant-strategy solution is when all players have a dominant strategy
- Iterative Elimination of Dominated Strategies (IEDS) is a strategy to find solutions to games
- Find dominated strategies
- Remove them from the pool of options, as they should never be chosen
- Repeat steps 1 and 2 until there are no more dominated strategies
- A game is dominance solvable if IEDS yields a single strategy profile
- IEDS requires Common Knowledge of the Game (CKG): all players know the structure of the game, and all players know that all other players know the structure of the game
- CKG is always assumed
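The IEDS steps above can be sketched in code: repeatedly find a strictly dominated strategy for either player and remove it. The Prisoner's Dilemma payoffs here are illustrative assumptions:

```python
# Iterative Elimination of strictly Dominated Strategies for a 2-player game.
# A[i][j] = row player's payoff, B[i][j] = column player's payoff.

def dominated(payoffs, rows, cols, for_row):
    """Return a surviving strategy strictly dominated by another, or None."""
    own = rows if for_row else cols
    other = cols if for_row else rows
    for s in own:
        for t in own:
            if t == s:
                continue
            if all((payoffs[t][o] if for_row else payoffs[o][t]) >
                   (payoffs[s][o] if for_row else payoffs[o][s]) for o in other):
                return s
    return None

def ieds(A, B):
    rows, cols = set(range(len(A))), set(range(len(A[0])))
    while True:
        r = dominated(A, rows, cols, for_row=True)
        if r is not None:
            rows.discard(r)     # step 2: remove the dominated row
            continue
        c = dominated(B, rows, cols, for_row=False)
        if c is not None:
            cols.discard(c)     # step 2: remove the dominated column
            continue
        return sorted(rows), sorted(cols)   # no more dominated strategies

# Prisoner's Dilemma (illustrative payoffs): strategy 0 = Cooperate, 1 = Defect.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
print(ieds(A, B))   # → ([1], [1]): the game is dominance solvable
```

Because a single strategy profile survives, this PD is dominance solvable in the sense defined above.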
- Common Knowledge of Rationality: Each player knows that other players choose best responses, and all players know that all others know that others choose best responses
- Without CKR, players are unsure if dominated strategies will be eliminated
- CKR and CKG are higher-order beliefs
- A strategy profile $s^*$ is a (pure) Nash Equilibrium (NE) if $\forall i\in I$ and $\forall s'_i\in S_i\backslash s^*_i$: $u_i(s^*_i, s^*_{-i})\geq u_i(s'_i, s^*_{-i})$
- A NE is a strategy profile with mutual BRs; everyone is playing a best response
- Games can have 0 to many NE, and some games can have non-pure NE with mixed strategies
- A dominant-strategy solution must be a NE, but not every NE is a dominant-strategy solution
- NE properties
- Each player is a rational actor choosing a BR
- NE can be reached using IEDS
- Players in a NE do not have deep regret
- Different processes can lead to NE
- Nash proved that every finite game has at least one NE, either mixed or pure
- NE has predictive power
- NE shortcomings
- Might only have mixed NE
- Must choose one NE; equilibrium selection
- Instrumental players must have correct beliefs about others’ strategies
- This means that, experimentally, people may fail to reach a NE
- A mixed-strategy profile $\sigma^*$ is a Mixed Nash Equilibrium iff $\forall i\in I$ and $\forall \sigma'_i \in \Delta(S_i)\backslash \sigma_i^*$: $u_i(\sigma^*_i, \sigma^*_{-i}) \geq u_i(\sigma'_i, \sigma^*_{-i})$
- Players should only mix over two strategies if they give the same expected utility; use this fact to find the mixed NE
- A mixed strategy can strictly dominate a pure strategy, allowing for IEDS
Social Dilemmas
- Social dilemmas are social interactions (games) in which individual incentives result in an inferior social outcome for the players (Pareto inefficient)
- Games like the Pure Coordination Game have no conflicts of interest for either player; both gain utility simultaneously
- Games like the Matching Pennies Games have a pure conflict of interest; for one player to gain utility, the other has to lose utility
- Games like the Prisoner’s Dilemma have both coordination and conflict; they represent social dilemmas
- Can also be thought of as a mixed-motive game; players have an incentive to coordinate and move towards an efficient optimum, but once there, they also have an incentive to shirk
Public Good Game
- Set of players $I = \{1, 2, \ldots, n\}$
- Each player chooses $s_i \in S_i \text{ s.t. } S_i = [0, \bar{s}]$
- Each player has a utility function $u_i = m\sum_{j\in I}s_j + (\bar{s}-s_i)$
- Can be thought of as the marginal returns to the total contributions plus the player’s remaining budget
- Can be rewritten as $u_i = m\sum_{j\neq i}s_j + \bar{s} - (1-m) s_i$
- Each player’s best response is to set $s^*_i = 0$, as an increase to $s_i$ decreases $u_i$
- Total social utility in the NE is $U^* = n\bar{s}$
- The social optimum comes from each player choosing $s_i=s$, making the total social utility $U = \sum (m\sum s + (\bar{s} - s)) = n(mn-1)s + n\bar{s}$
- $U$ is increasing in $s$ if $n(mn-1)>0 \rightarrow m>\frac{1}{n}$
- If $m>\frac{1}{n}$, then the social optimal strategy is to contribute $\bar{s}$, making the social optimal utility $mn^2\bar{s}$, much larger than the NE social utility
- Main issue is the free-rider problem; the individual marginal return is lower than the individual marginal cost, so no one is incentivized to contribute
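The free-rider gap can be checked numerically with the utility function above; the parameter values below are illustrative assumptions chosen so that $m > \frac{1}{n}$:

```python
# Total utility in the Linear Public Good Game: NE (all contribute 0)
# versus social optimum (all contribute s_bar). Parameters are illustrative.

def total_utility(contributions, m, s_bar):
    total = sum(contributions)
    return sum(m * total + (s_bar - s_i) for s_i in contributions)

n, m, s_bar = 4, 0.5, 10          # m = 0.5 > 1/n = 0.25, so full contribution is optimal
U_ne = total_utility([0] * n, m, s_bar)
U_opt = total_utility([s_bar] * n, m, s_bar)
print(U_ne)    # → 40.0  (n * s_bar)
print(U_opt)   # → 80.0  (m * n^2 * s_bar)
```

Even though total welfare doubles at the optimum, each individual's marginal return $m = 0.5$ is below the marginal cost $1$, so no one contributes in the NE.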
Tragedy of the Commons
- Has two players and a common property of size $y>0$
- Each $i\in I$ chooses how much to consume today (denoted by $c_i\in [0, \frac{y}{2}]$) and then split what is leftover tomorrow to maximize their utility
- $u_i = \ln c_i + \ln(\frac{y-c_i-c_j}{2})$
- The first order condition implies that the BR function is $c_i^* = \frac{y-c_j}{2}$
- In the NE, both players consume $\frac{y}{3}$ today and $\frac{y}{6}$ tomorrow
- Socially optimal strategy is to consume $\hat{c}_i = \frac{y}{4}$ today; both players consume $\frac{y}{4}$ today and $\frac{y}{4}$ tomorrow
- Better than the NE total social utility
- Main issue is the Tragedy of the Commons; individuals are incentivized to over-consume today because everyone else will, hurting their utility tomorrow
- Gets worse as more players are added
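The inefficiency of the commons NE can be verified numerically with the log-utility specification above; the endowment $y$ is an illustrative assumption:

```python
# Two-player Tragedy of the Commons: compare a player's utility at the
# NE (c = y/3 each today) with the social optimum (c = y/4 each).
import math

def utility(c_i, c_j, y):
    # ln(consumption today) + ln(equal share of what is left tomorrow)
    return math.log(c_i) + math.log((y - c_i - c_j) / 2)

y = 12
u_ne = utility(y / 3, y / 3, y)    # consume y/3 today, y/6 tomorrow
u_opt = utility(y / 4, y / 4, y)   # consume y/4 today, y/4 tomorrow
print(u_ne < u_opt)                # → True: the NE is Pareto inefficient
```

With $y = 12$ the NE utility is $\ln 8$ while the optimum gives $\ln 9$, so both players are strictly better off restraining consumption, yet neither will unilaterally.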
Contests
- Two individuals $i\in \{1,2\}$ and efforts $e_i\in \mathbb{R}^+$
- Probability of winning $p_i(e_i, e_j) = \begin{cases}\frac{e_i}{e_i + e_j}, & e_i + e_j > 0\\ \frac{1}{2}, & e_i + e_j = 0\end{cases}$
- Prize value $v>0$
- Utility $u_i = p_i(e_i, e_j)v - e_i$
- BR function: $e_i^* = \sqrt{ve_j} - e_j$
- NE becomes $e_i^* = \frac{v}{4}$
- The NE utility for $i$ is $\frac{v}{4}$; the contest intensifies as the prize increases
- The socially optimal solution is to choose $\hat{e}_i = 0$, as the utility for both players will be $\frac{v}{2}$
- Another way to see this is by looking at the prize vs. effort exerted; total prize is $v$, but the contestants exert a total of $\frac{v}{2}$ effort, so they effectively “pay” for half of the prize
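The claim that $e_i^* = \frac{v}{4}$ is a mutual best response can be checked numerically by scanning over deviations; the prize value is an illustrative assumption:

```python
# Tullock contest: verify numerically that e_i = v/4 is a best response
# when the opponent plays v/4, by scanning a grid of deviations.

def utility(e_i, e_j, v):
    p = e_i / (e_i + e_j) if e_i + e_j > 0 else 0.5
    return p * v - e_i

v = 100
e_star = v / 4
best = max(utility(e, e_star, v) for e in [i * 0.01 for i in range(10001)])
# No grid point does better than playing e_star = 25, which yields v/4 = 25.
print(abs(best - utility(e_star, e_star, v)) < 1e-6)   # → True
```

In the NE each player earns only $\frac{v}{4}$: half the prize is dissipated as effort, which is the "paying for half the prize" observation above.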
Chicken Game
- Two players decide whether or not to fight one another, can either be a chicken or a fighter
- If both players fight, they both suffer large damage; if neither fights, then they both survive with no loss
- The two pure NE are for one player to fight and another to be a chicken
- Can be thought of as a type of contest where you can either give full effort or none
- Leads to the chicken game also being a coordination game whereas the contest is not
Stag Hunt Game
- Also known as the Assurance Game
- Basic context: multiple people are hunting a stag, but everyone needs to be focused to kill the stag; if one player goes off to kill a hare, then no one gets the stag (stag gives more utility than hare)
- Coordinating to kill the stag is Pareto optimal, but you can only choose to kill the stag if you have assurance, or trust, that the others will also kill the stag
- Players will only kill the stag if the payoff for the stag is vastly greater
- Assuming player 2 is playing a mixed strategy, player 1 will choose to kill the stag if and only if $q > \frac{1}{x}$, where $q$ is the chance player 2 will choose $S$ and $x$ is the payoff of $S$
- In the mixed NE, each player chooses $S$ with probability $\frac{1}{x}$
- Risk-averse players will play a maxmin strategy (maximizing the minimum payoff), so they will always choose $H$
- Assurance games are social dilemmas due to the lack of assurance
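The $q > \frac{1}{x}$ threshold can be sketched directly, assuming the simple normalization implied above: hunting the stag pays $x$ only when both hunt it, while the hare always pays 1 (these payoffs are assumptions for illustration):

```python
# Stag Hunt threshold: player 1 prefers S iff q * x > 1, i.e. q > 1/x,
# where q is the probability that player 2 plays S. Payoffs are illustrative:
# stag pays x only if both hunt it; hare is a sure payoff of 1.

def prefers_stag(q, x):
    eu_stag = q * x    # stag succeeds only if the other player also plays S
    eu_hare = 1        # hare is a sure thing
    return eu_stag > eu_hare

x = 4
print(prefers_stag(0.3, x))  # → True  (0.3 > 1/4)
print(prefers_stag(0.2, x))  # → False (0.2 < 1/4)
```

This makes the role of assurance concrete: even with the Pareto-superior stag payoff, a player needs enough confidence in the other's cooperation before $S$ becomes a best response.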
Incentivizing Cooperation
- To achieve cooperation, the game itself must be changed (strategy set, players, utility functions)
- Changing the set of players: change the players’ preference (Homo Economicus to Homo Altruist), add players (like police), exclude players
- Changing strategy set: repeat the same interaction, introduce new ways of interacting
- Changing utilities is implied when doing the above actions
- Typically, we will keep Homo Economicus and change other assumptions
- Reciprocity: Responding to another player’s action with the same kind of action
- Requires timing of choices, knowledge of others’ choices, etc.
- We will use an extensive-form game to model this
- Reciprocity relies on history-dependent strategies where players remember what happened earlier in the game
- Trigger strategy: A strategy where one player instantly punishes the other in response to a non-cooperative behavior
- History-dependent strategies rely on monitoring where players observe other players’ actions
- Better monitoring = more accurate strategies
- External third party actors can induce cooperation, typically by using trigger strategies (think of police)
- Self-enforcing cooperation is cooperation without external enforcement; the Puzzle of Cooperation asks why people cooperate without these externalities
- Punishing can be thought of as a public good game since punishment requires resources
- Known as the Second-order Public Good Problem since external enforcement is a public good
- This means that external enforcement also requires self-enforcing cooperation due to second-order public good problems
Unit 2
Extensive Form Games
- Sequential games can provide advantages to players arbitrarily based on move order
- Chicken Game has first-mover advantage, Matching Pennies has last-mover advantage, PD has no advantage
- Dictator Game: Only one player has an option, meaning that they have a first-mover advantage
- A game tree represents sequential actions
- Starts at a decision node, and each decision node has branches that represent possible choices
- Terminal nodes have payoff profiles
- Trees are acyclic
- Extensive Form Games include more information than normal form ones
- Set of players
- Order of moves/decisions
- What each player’s possible decisions are at each decision node
- What each player knows about prior moves when making a choice
- Utilities dependent on moves
- Probability distributions over random events
- The extensive form is a complete representation of a game; the normal form is a (useful) abstraction of it
- An information set represents what a player knows when making a move
- Represented in a game tree using a circle/oval/rectangle; whatever is outside of the circle is outside of the information set
- Depicted as $\mathcal{I}_i = \{\{a\}, \{b, c\}, \cdots \}$
- A game with perfect information means that all information sets have one decision node
- Implication: simultaneous-move games cannot have perfect information
- A pure strategy in the extensive form is a complete plan of moves, where one move is planned for each information set
- A pure strategy is chosen at the start of the game and followed throughout, regardless of if an information set is unreachable or not
- A mixed strategy is a randomization of pure strategies, just as in normal-form games
- Sequential rationality: Players look ahead and predict future moves, assuming that BRs will be chosen at all future decision nodes
- Sequential rationality is depicted using backward induction (BI) in perfect information games
- Kuhn’s Theorem: Every finite extensive-form game of perfect information has a BI solution
- This means that every BI solution corresponds to a NE surviving iterated elimination of weakly dominated strategies
- BI requires common knowledge of the game, common knowledge of rationality, and a large cognitive load
- Players might not use BI if these requirements are not met
- Subgames are studied to solve games with imperfect information
- Subgames start from a single decision node, contain all of the later decision nodes and information sets, and maintain the original information sets
- Can be thought of as an extensive-form game that exists in another extensive-form game
- In a perfect-information game, each decision node starts a subgame; games with imperfect information have fewer subgames than decision nodes
- A Subgame Perfect Equilibrium (SPE) is an equilibrium where each subgame has an NE
- SPEs can be found in a game with imperfect information
- A BI solution is a SPE solution of a perfect-information game
- Captures sequential rationality
Dictator and Ultimatum Game
- Dictator Setup: Player 1 chooses $s_1 \in [0, \bar{s}]$ to give to player 2
- Using Homo Economicus preferences, $u_1 = \bar{s} - s_1$ and $u_2 = s_1$
- Player 1 will always choose $s_1^* = 0$
- Ultimatum Setup: Player 1 chooses $s_1 \in [0, \bar{s}]$ to offer player 2, and player 2 chooses to accept or reject the offer; rejection gives both players 0
- Player 2 will accept any offer since accepting is better than or the same as rejecting, so player 1 will only choose to share 0
- These games show that Homo Economicus will not share any money in the Dictator Game and will accept any offer in the Ultimatum Game
- No/very little money is shared in both
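The Ultimatum Game SPE can be computed by backward induction over a discretized offer space. The budget value is an illustrative assumption:

```python
# Backward induction in the Ultimatum Game with Homo Economicus players
# and discrete integer offers from a budget s_bar (illustrative value).

def spe_offer(s_bar):
    best_offer, best_u1 = None, None
    for offer in range(s_bar + 1):
        # Player 2's move, solved last: accepting pays `offer`, rejecting pays 0,
        # so a selfish player 2 accepts every offer.
        accepts = offer >= 0
        u1 = (s_bar - offer) if accepts else 0
        if best_u1 is None or u1 > best_u1:
            best_offer, best_u1 = offer, u1
    return best_offer, best_u1

print(spe_offer(10))   # → (0, 10): player 1 keeps the whole budget
```

This is the Homo Economicus benchmark that the experimental results in Unit 3 contradict: real responders reject low offers, and real proposers offer more than 0.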
Repeated Games
- Many interactions are repeated, and threats of punishments/bad responses in future periods dissuade short-run selfishness
- Repeated games are a subset of extensive-form games
- Involves a stage game (which is an extensive-form game in and of itself) that is played multiple times
- Players observe + remember the moves from each stage
- There are $T$ stages, and the utility is the sum of the utilities at each stage (possibly with a discount)
- Not easily represented with a game tree; too large
- Better represented with the stage game and the number of repetitions $T$
- Repeated games can be infinite $(T=\infty)$ or finite
- Finitely repeated games with a unique stage-game NE must have a unique SPE, as backward induction can be used
- Implies that cooperation is not possible since players (Homo Economicus) will try to be selfish at the last stage (and consequently all stages before that)
- If the stage game does not have a unique NE, there will be multiple history-independent SPE
- Not of note; everything is independent, and no reward/punishment strategies are employed
- History-dependent strategies base future behavior on the past
- One shot deviation principle: A strategy is a BR (and SPE) if there is no benefit to deviating in any singular stage of the game
- The SPEs from history-dependent strategies might not follow the stage game NEs and also can outperform history-independent strategies
- Trigger strategies involve punishing non-cooperative behavior by threatening to be non-cooperative in all future periods
- Will also reward cooperative behavior with future cooperation
- Incentivizes sacrificing short-term gains for long-term gains
- Requires multiple NE in stage game and rewards/punishment greater than defection gain
Infinitely Repeated Games
- In these games, $T=\infty$
- Does not mean the game will actually go on forever; players simply don’t know when the game will end
- Payoffs are discounted for future stages since payoff in the future is not realized and thus has less importance than present payoffs
- The discount factor is denoted as $0 < \delta < 1$; overall utility is $U_i = \sum^\infty_{t=0}\delta^tu_{it}$
- Closed-form sum for a constant per-period payoff $v$: $U_i = \frac{1-\delta^T}{1-\delta}v \rightarrow \frac{1}{1-\delta}v$ as $T\rightarrow\infty$
- Discount factor is assumed to be the same across all players
- A one-shot deviation from a strategy $s_i$ is a strategy $s_i'$ where you only deviate in one period
- The One-Shot Deviation Principle states that a strategy profile is an SPE iff there are no profitable one-shot deviations
- Similar to finite games, history-independent SPE will not induce cooperation since they are only comprised of (selfish) NE
- Must use history-dependent strategies to get cooperation
- Grim Trigger Strategy: Keep cooperating until the other player does not cooperate; then, only defect
- Key properties: initial and potentially infinite cooperation, infinite punishment after defection
- Two phases: cooperation phase and punishment phase, can calculate benefit (and necessary $\delta$ for cooperation) from deviating in the cooperation phase
- Employs direct reciprocity to sustain cooperation; works when there is a low punishment payoff and a high discount factor
- The Grim Trigger strategy is known as a Nash-Reversion strategy because it has a history-dependent phase (cooperation) and a history-independent phase (punishment)
- Nash reversion ensures an SPE since the history-independent period will be guaranteed to be an SPE
- Intuitive; cooperate, and then punish by not cooperating using NE actions
- May require large discount factors, but will always exist
- By adding infinite periods, non-cooperative games (like PD) can be turned into coordination games with a cooperation equilibrium
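The two phases of Grim Trigger can be seen in a small simulation; the PD payoffs (mutual cooperation 3, temptation 5, sucker 0, mutual defection 1) are illustrative assumptions:

```python
# Simulate Grim Trigger in a repeated PD: cooperate until the opponent
# defects, then defect forever. Payoffs below are illustrative.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def grim(my_history, opp_history):
    return 'D' if 'D' in opp_history else 'C'   # punishment phase is permanent

def always_defect(my_history, opp_history):
    return 'D'

def play(strat1, strat2, rounds):
    h1, h2, u1, u2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strat1(h1, h2), strat2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        u1, u2 = u1 + p1, u2 + p2
        h1.append(a1); h2.append(a2)
    return u1, u2

print(play(grim, grim, 10))           # → (30, 30): cooperation phase never ends
print(play(grim, always_defect, 10))  # → (9, 14): one exploitation, then punishment
```

Against another Grim player the cooperation phase lasts forever; against a defector, the exploiter gains once and then collects only the punishment payoff.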
- A minmax punishment punishes a player for not cooperating by guaranteeing the defector receives their minmax utility; i.e. the minimum utility possible assuming the defector is maximizing their utility
- More extreme than Nash reversion, as the minmax punishment might not be a NE
- Requires a lower discount factor in general due to the harsher punishment, leading to more cooperation
- Might not be a SPE because the minmax punishment might not be a NE
- Can be a NE without being a SPE
- The punisher may want to deviate back to the stage-game NE if the payoff is higher, making the minmax punishment less credible
- Indirect reciprocity occurs when a third party reciprocates based on the interactions between two other players
- Example: Infinitely-Repeated Random-Matching PD, where there are $n\geq 4$ players who are randomly matched with each other
- Direct Grim strategy: Play C when you are paired with someone for the first time, and keep playing C if they played C; play D otherwise
- Requires a discount factor of $\delta \geq \frac{n-1}{n}$ since you are unlikely to play the same person multiple times
- Indirect Grim strategy: Play C at first, and play C if your partner played C in all prior rounds; play D otherwise
- Requires a discount factor of only $\frac{1}{2}$ instead
- Relies on perfect observation and memory, but can still work with imperfections
- Humans rely on communication (gossip) to share information
- Cooperation can be sustained by excluding others
- Rivalrous: A good can only be consumed by one person
- Excludable: A person can be prevented from consuming a good
- Club goods are non-rivalrous, excludable goods that are produced by clubs (families, gyms, church, etc.)
- The threat of being kicked out of a club can incentivize cooperation and leave the punishers even better off
- Requires a relatively low cost compared to the number of members in the club
- If cost is too high, then punishers will go back to the standard Grim strategy
- Cooperation can still be sustained even with imperfect monitoring such that not all actions are observed
- Example: The probabilistic Convex Public Good game, where $n’$ of the $n$ players are selected at random and monitored
- Probability of being observed: $m = \frac{n’}{n}$
- Discount factor $\delta \geq \frac{1}{1+m}$; this bound approaches $\frac{1}{2}$ as $m\rightarrow 1$ and $1$ as $m\rightarrow 0$
- Monitoring gets worse if the group gets bigger
- Exclusion also works with imperfect monitoring
- Folk Theorem: With sufficiently large discount factors, any individually rational payoff profile can be sustained in equilibrium, so cooperation is possible
- To prove Folk theorem, there are three parts
- Identify harshest punishment (minmax)
- Identify what payoffs are possible that are better than the harshest punishments (include mixtures)
- Show that the latter can be sustained with high enough discount factors
- Individually rational profiles give each player at least their minmax payoff
- Coordination on some point $x$, where a player will get $z$ from defecting and $y$ in all subsequent periods, requires a discount factor $\delta\geq\frac{z-x}{z-y}$
- This can be rewritten as $\frac{\delta}{1-\delta}(x-y) \geq z-x$ which represents the future sum of discounted rewards vs. the defection benefit today
- As $\delta\rightarrow 0$, LHS becomes 0, so no cooperation
- As $\delta\rightarrow 1$, LHS becomes infinity
- There must exist a “sweet spot” $\delta^* \equiv \frac{z-x}{z-y} \in [0,1]$ to sustain cooperation, proving Folk Theorem
- Does not guarantee coordination
- Self-enforcing cooperation has various requirements
- Infinite repetition
- Sufficient detection of defectors (monitoring)
- Strong punishment for defectors (minmax, Nash reversion)
- Punishments that are credible and not too costly for the punisher
- Large future rewards from cooperation (high discount factors)
- If any of these requirements are not met, then Homo Economicus will not cooperate
- Because there are various trigger strategies, a failure to coordinate on the cooperation-sustaining strategy will result in non-cooperation
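The critical discount factor from the derivation above can be checked directly in both of its equivalent forms; the payoff values $x = 3$, $z = 5$, $y = 1$ are illustrative assumptions:

```python
# Critical discount factor from the Folk Theorem argument: cooperation on
# payoff x, one-shot deviation payoff z, punishment payoff y thereafter.
# Cooperation is sustainable iff delta >= (z - x) / (z - y).

def critical_delta(x, z, y):
    return (z - x) / (z - y)

def cooperation_sustainable(x, z, y, delta):
    # Equivalent form: discounted future reward stream vs. today's deviation gain.
    return delta / (1 - delta) * (x - y) >= z - x

x, z, y = 3, 5, 1                             # illustrative PD-style payoffs
print(critical_delta(x, z, y))                # → 0.5
print(cooperation_sustainable(x, z, y, 0.6))  # → True
print(cooperation_sustainable(x, z, y, 0.4))  # → False
```

Below $\delta^* = 0.5$ the one-period deviation gain $z - x$ outweighs all discounted future rewards, so even a patient-looking trigger strategy cannot hold cooperation together.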
Unit 3
Experimental Economics
- Experimental economics is the practice of designing and carrying out experiments to test economic theories of behavior
- Requires a large sample of subjects, random assignment, control, and replication
- Controlled experiments are ones where the experimenter assigns treatments; uncontrolled experiments are observational with unassigned treatments
- Economics experiments use simple design, consistent instructions, neutral language, monetary incentive (for utility functions), no deception
Experimental Results
- One-shot
- Ridinger and McBride (2024): 49% of UCI students choose to cooperate in the one-shot PD game (typically 40-50% for American college students)
- Ensminger (2004): Kenyan men choose to contribute in the Linear Public Good Game, but HE does not contribute anything (typically 40-60% for American college students)
- Forsythe et al (1994): Many American college students share in the Dictator Game, and they share even more with hypothetical money
- These results show that experimental subjects will cooperate and give at higher rates than Homo Economicus
- Sequential
- Forsythe et al (1994): Offers are larger in the Ultimatum Game than in the Dictator Game because recipients will reject lower offers despite it always being better to accept
- Oosterbeek et al (2004): Subjects around the world reject lower offers
- In sequential PD, more first movers cooperate than in one-shot simultaneous PD, and a plurality of second movers are conditional cooperators
- These results show that second movers are willing to pay a cost to reward a first-mover cooperator and/or punish a selfish first-mover
- Ridinger and McBride (2024): First movers will cooperate if they expect conditional cooperation
- These results also show that subjects will cooperate at higher levels to gain a reward or avoid a punishment
- Repeated
- Dal Bo (2006): In PD, subjects are more likely to cooperate when there are more rounds, either in indefinite horizon or finite horizon
- Matches intuition of the Folk Theorem, but also shows that subjects don't perform backwards induction (always defect) like Homo Economicus
- Lugovskyy et al (2017): In the Linear Public Good Game, most people will cooperate in early rounds and then start to contribute less regardless of whether or not the game is finite or indefinite
- Cooperation tends to decline within repeated matches; declines faster in finite horizon games
- Contests
- Chowdhury et al (2012): Subjects will either exert excessive or zero effort in a contest, contrasting HE who will only exert a medium amount of effort to avoid significant loss
- Ranges from socially efficient to very socially inefficient
- Efforts stay at the same level over time
- This shows that real humans are even less cooperative than Homo Economicus in contests
- Differences in strategies can be ascribed to culture, gender, religion, etc
- Shows that there is a lot of variation (heterogeneity) in individuals
- General lessons
- Humans care about money
- Humans care about more than just money because they cooperate at a higher rate than Homo Economicus
- Values differ across settings, i.e. Public Good Game vs. Contests
- Human cooperativeness depends on expected rewards and punishments
- Humans differ in how they trade off selfish and prosocial values
- Humans are often conditionally cooperative
Social Norms and Rule Following
- Social norm: A behavioral rule in a specific community that defines acceptable and unacceptable behavior
- Can either be prescriptive (do a certain action) or proscriptive (don’t do an action); defines moral restraint
- Social norms are deontic: individuals follow the rule as opposed to the consequence
- People following social norms are NOT Homo Economicus because Homo Economicus is a consequentialist
- Norm violation has both an external and internal punishment
- Punishment can be conditional; if others are violating the norm, the punishment won’t feel as bad
- Preference is to follow the norm, not the norm itself
- Norms can be formal (intentionally designed) or informal
- Norms can be conventional (small punishment, e.g. trends) or moral (unconditionally followed)
- Highlights the values of a community
- Socialization: The process of learning and adopting the norms of a community
- Socialization comes from rewards and punishments from following or not following norms, leading to an individual internalizing and conforming to the norms
- Primary socialization is the development of the ability to learn about norms and judge actions
- Secondary socialization is the learning of norms of a specific community
- Cooperation rates will increase when people have internalized norms
- No other species has the capacity to internalize norms
- Kimbrough and Vostroknutov (2016) used a traffic simulation to show that people are unwilling to disobey the rules to get a monetary benefit
- Groups of rule followers were better able to sustain cooperation
- McBride and Ridinger (2021) showed that rule following is conditional on others' compliance
- People who conformed to rules also gave the most in the dictator game
Homo Normist
- Norm compliance can be included in a utility function
- Define acceptable and unacceptable strategies: $A_i \subseteq S_i$
- Meaningful norm: $\emptyset \subset A_i \subset S_i$
- Utilities are more than material; i.e. norms are included in the utility function
- Material payoff: $y_i(s_i, s_{-i}) \in \mathbb{R}$
- Homo Normist preferences: $u_i(s_i, s_{-i}) \neq y_i(s_i, s_{-i})$
- Has internalized penalty $u_i(s_i, s_{-i}) = y_i(s_i, s_{-i}) - \rho_i k_i$ if $s_i \notin A_i$
- $\rho$ is the community reference of acceptable behavior; $k$ is the individual's norm salience (internalized guilt from violating the norm)
- Define community compliance reference
- $\rho$ can be the compliance of other players (internal), rate of compliance in the larger community (external), and/or personal belief about compliance (subjective)
- Depends on the setting
- Homo Normist is both deontic and consequentialist; they compare the best acceptable strategy and the best unacceptable strategy
- Leads to a tradeoff between rule following and maximizing material payoff
- Homo Normist is supported by experimental evidence and is rational (because their choices are consistent)
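The Homo Normist utility function above can be sketched for the one-shot PD. The payoffs (3, 0, 5, 1) and the salience values are illustrative assumptions; the internal community reference is modeled as $\rho = 1$ when the other player complies and $0$ otherwise:

```python
# Sketch of Homo Normist preferences in the one-shot PD. Payoffs and the
# salience parameter k are illustrative assumptions, not from the lectures.

C, D = "C", "D"
Y = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def utility(own, other, k):
    """Material payoff minus the internalized penalty rho*k for defecting.
    Internal community reference: rho = 1 if the other player complies
    (cooperates), 0 otherwise."""
    y = Y[(own, other)][0]
    rho = 1 if other == C else 0
    return y - rho * k if own == D else y

def pure_nash(k1, k2):
    """Return all pure-strategy NE of the transformed game."""
    eq = []
    for s1 in (C, D):
        for s2 in (C, D):
            best1 = all(utility(s1, s2, k1) >= utility(a, s2, k1) for a in (C, D))
            best2 = all(utility(s2, s1, k2) >= utility(a, s1, k2) for a in (C, D))
            if best1 and best2:
                eq.append((s1, s2))
    return eq

print(pure_nash(0, 0))  # two Homo Economicus: only (D, D)
print(pure_nash(3, 3))  # two Normists with high salience: (C, C) and (D, D)
print(pure_nash(3, 0))  # one Normist, one HE: cooperation disappears
```

With two high-salience Normists, the PD becomes a coordination game with both (C, C) and (D, D) as equilibria; a single Homo Economicus is enough to restore (D, D) as the only equilibrium.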
Issues and Considerations
- If people were not consequentialist, then norm following would not be irrational
- Functionalist fallacy: The conclusion that the outcome of an action is the reason for the action
- Humans have a preference for following the rules and being Homo Normist as opposed to Homo Economicus or Homo Deontist (preference-selection problem)
- There are many types of norms, but humans mostly select cooperation norms (norm-selection problem)
- Both of the above problems are answered by evolutionary selection
- Rule followers can often receive better payoffs than the selfish Homo Economicus; the instrumentalist might become a rule follower to maximize their payoffs
- Rule consequentialism is a philosophy where one follows and makes rules that maximize outcomes
- Instrumentalists might be rule consequentialists at first to be better off, but they are motivated to break the rules after a cooperative equilibrium is reached; immoral
- Does not fully explain why people value rule following
- Positive analysis examines actual phenomena (what is) while normative analysis focuses on values (what ought to be)
- Evolutionary analysis uses positive analysis to understand how people came to follow rules
- Cannot use experiments to verify analysis; can only create plausible and compelling analysis + conclusions
- Evolutionary analysis is inherently dynamic due to stochastic elements and might not reach an equilibrium
- There is no one reasonable norm, as many seem equally good
- The problem is worst in asymmetric cooperation games, where a norm can make some players worse off
- Communication does not aid this issue if some outcomes are better for others
- Norms can be measured through efficiency (Pareto improvements), welfare, rights, or harm
- A Pareto superior (efficient) norm $x$ provides utilities such that $u_i(x)\geq u_i(y) \forall i$ and $\exists i, u_i(x)> u_i(y)$
- Inefficient norms can make players worse off; known as winner-and-loser norms
- A welfare norm $x$ is one where $\sum_i u_i(x) > \sum_i u_i(y)$
- Norms can start as good and go to bad
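The two comparison criteria above can be made concrete with a small check. The utility numbers here are made up for illustration:

```python
# Minimal check of the two norm-comparison criteria from the notes:
# Pareto superiority and the welfare criterion.

def pareto_superior(ux, uy):
    """Norm x is Pareto superior to y: no one worse off, someone strictly better."""
    return all(a >= b for a, b in zip(ux, uy)) and any(a > b for a, b in zip(ux, uy))

def welfare_superior(ux, uy):
    """Norm x beats y on total welfare: the sum of utilities is strictly larger."""
    return sum(ux) > sum(uy)

u_x = (4, 4, 3)   # utilities under norm x
u_y = (2, 4, 3)   # utilities under norm y
u_z = (9, 1, 1)   # a "winner-and-loser" norm: high total, uneven split

print(pareto_superior(u_x, u_y))   # True
print(pareto_superior(u_z, u_y))   # False: players 2 and 3 are worse off
print(welfare_superior(u_z, u_y))  # True: welfare can rank what Pareto cannot
```

The winner-and-loser norm shows why the two criteria can disagree: it raises total welfare while failing the Pareto test.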
- Rule-following requires added complexity and parameters
- Individual beliefs can update until an equilibrium is reached; still creates a NE
One-Shot Homo Normist Games
- In the Dictator Game, different norms can lead to the dictator sharing
- The community reference for player 1 does not come from player 2 since player 2 has no action; it comes from people outside of the game
- Assuming that the total amount of money is 1, equality norm: $A_i = \{0.5\}$; full-sharing norm: $A_i = \{1\}$
- With norms that require more sharing, players must have a higher norm salience or a greater community reference
- The most preferred acceptable action will always be chosen out of the acceptable actions
- In the PD Game, the community reference comes from the other player (internal) because it is symmetric
- Depending on the values of $k$ and $\rho$, the one-shot PD Game can become a Coordination Game
- With common knowledge of the norm, both players will know the other player wants to follow the norm, allowing for both cooperation and coordination
- If one player is Homo Economicus or if one player has a low norm salience, then cooperation disappears
- With an external community reference, cooperation can become a NE
- In the Ultimatum Game, both players will have norm salience, leading to cooperation
- If $A_2 = \text{Accept if } s_1 > 0.5$, then they will accept if $s_1 > \min(0.5, \rho_2k_2)$
- Player 1 will thus offer $s^*_1 = \min(0.5, \rho_2k_2)$ so they can maximize their payoff
- This shows that the threat of punishment by a norm-following player can induce sharing from Homo Economicus
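The Ultimatum Game logic above reduces to a one-line threshold. The interpretation here (accepting an unfair offer below 0.5 costs the responder the penalty $\rho_2 k_2$) follows the notes; the parameter values are illustrative assumptions:

```python
# Sketch of the Ultimatum Game with a norm-following responder: a selfish
# proposer offers exactly the acceptance threshold min(0.5, rho2*k2).
# Parameter values are illustrative.

def acceptance_threshold(rho2, k2):
    """Smallest offer the norm-following responder accepts.
    Accepting s1 < 0.5 incurs the penalty rho2*k2, so accept iff s1 >= rho2*k2;
    offers of 0.5 or more are always acceptable."""
    return min(0.5, rho2 * k2)

def optimal_offer(rho2, k2):
    """A Homo Economicus proposer offers the bare minimum that is accepted."""
    return acceptance_threshold(rho2, k2)

print(optimal_offer(1.0, 0.3))  # 0.3: weak norm salience -> a low offer is accepted
print(optimal_offer(1.0, 0.8))  # 0.5: strong norm forces an equal split
print(optimal_offer(0.0, 0.8))  # 0.0: no community reference, no sharing
```

This captures the punchline that the mere threat of norm-driven rejection induces sharing even from a purely selfish proposer.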
- In the Sequential PD game, cooperation can still be sustained with player 1 being Homo Economicus
- With norm-following preferences and an internal community reference, player 2 will likely conditionally cooperate, forcing player 1 to cooperate due to the unique SPE being $(C, CD)$
- With an external community reference, Homo Economicus will defect as player 1 because player 2 becomes an unconditional cooperator
- In general, conditional cooperation is better for cooperation
- In the Linear Public Good Game, the socially efficient outcome of donating all income to the public good can only be achieved when norm saliences are large
- Using an internal community reference, the norm salience needed for cooperation decreases as compliance increases
- This turns the game into a coordination game where the equilibria are to donate everything or donate nothing
- With large groups of people, people have heterogeneous preferences, and one person's defection can undermine the group's cooperation
- Increasing the size of the group increases the required norm salience
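A rough sketch of the group-size claim above: with endowment 1, a fixed public-good multiplier $M$ split over $n$ players (marginal per-capita return $M/n$), and the norm "contribute everything," full contribution is an equilibrium when the deviation gain $1 - M/n$ does not exceed the penalty $\rho k$. These functional forms are assumptions chosen to match the qualitative claims in the notes:

```python
# Hedged sketch of when full contribution is an equilibrium in the Linear
# Public Good Game with Homo Normist players. Endowment 1, per-capita
# return M/n, norm "contribute everything", and the internal reference rho
# (share of compliant others) are all illustrative assumptions.

def required_salience(n, M, rho):
    """Minimum norm salience k so that contributing everything beats
    free riding: deviation gain 1 - M/n must not exceed the penalty rho*k."""
    mpcr = M / n                 # marginal per-capita return from the pot
    deviation_gain = 1 - mpcr    # material gain from keeping the endowment
    return deviation_gain / rho

k4 = required_salience(n=4, M=2, rho=1.0)
k40 = required_salience(n=40, M=2, rho=1.0)
print(round(k4, 2))  # 0.5
print(k40 > k4)      # True: larger groups need a higher salience
print(required_salience(n=4, M=2, rho=0.5) > k4)  # True: less compliance raises the bar
```

This reproduces both bullets: required salience rises with group size, and (with an internal reference) falls as compliance increases.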
General Takeaways
- The community reference (internal or external) depends on the setting
- Norms can prescribe costly rewards and punishments
- Internalized norms can make cooperation an equilibrium in a one-shot game
- Normist preferences can create a coordination incentive
- The exact collection of preferences can have a large effect on the equilibrium (i.e. two Homo Normist vs. one Homo Normist and one Homo Economicus)
- Norms of conditional cooperation are especially effective
- Acceptable actions do not have to be NE in the base material game
Repeated Homo Normist Games
- Finitely-repeated PD gives an example of history-dependent norms; if norm salience is high enough, Normists can achieve a $(C,C)$ SPE
- One Homo Economicus can undermine cooperation, shows damage of selfish types
- Tit-for-Tat players can induce HE to cooperate due to TFT’s reciprocity, leading HE to play $C$ until the very end
- Highlights decreasing rates of cooperation
- Real-life agents will hold beliefs about other players, and these beliefs can evolve over time after they see other players’ actions
- Begins with a prior (belief) that updates; player $i$’s belief at time $t$ is notated as $\beta_{it}$ and is a probability distribution over other players’ strategies
- Updating and prior beliefs are heterogeneous across players
- Myopic updating assumes that whatever players played in the previous period, they will play again this period; short-sighted
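Myopic updating can be written in one line; the helper names and dictionary representation of $\beta_{it}$ are assumptions for illustration, with an empirical-frequency rule shown as a smoother alternative:

```python
# Sketch of belief updating over an opponent's strategy. beta_{i,t} is
# represented as a dict mapping actions to probabilities; function names
# are illustrative assumptions.

def myopic_update(history):
    """Myopic belief: a degenerate distribution on the most recent action."""
    last = history[-1]
    return {last: 1.0}

def frequency_update(history):
    """A smoother alternative: belief equals the empirical frequency of
    past actions (fictitious-play style)."""
    return {a: history.count(a) / len(history) for a in set(history)}

print(myopic_update(["C", "C", "D"]))  # {'D': 1.0}: expects defection again
```

The myopic rule throws away all but the last observation, which is what makes it short-sighted; the frequency rule retains the whole history.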
Unit 4
Evolutionary Game Theory
- Evolutionary game theory is used to study the emergence of behaviors; behaviors “reproduce” and propagate to later generations
- Assumptions
- The population $n$ is large
- The strategy set is inherited or learned (from a parent)
- Payoffs represent reproductive fitness which is the ability to reproduce and have more children
- Players have same payoff/fitness functions
- The distribution of strategies is denoted as a population-level mixed strategy
- $S = (s_1, s_2, \ldots, s_m)$, and $\sigma_{it}$ is the proportion of individuals that play strategy $s_i$ at time $t$; $\sum_i \sigma_{it} = 1$
- Players can be matched into small groups (PD) or with the entire population (Public Good game)
- Expected fitness: $u(s_i, \sigma_t) = \sum_j \sigma_{jt}u(s_i, s_j)$
- Types of populations
- Monomorphic: All individuals have the same trait (and play the same strategy)
- Polymorphic: Individuals have different traits, and there are multiple strategies
- Evolutionary Stable Strategy (ESS): A strategy $\sigma^*$ satisfying two requirements: (1) $\sigma^*$ is a best response to itself (a NE), and (2) any mutant $\sigma'$ that does equally well against $\sigma^*$ does strictly worse against itself than $\sigma^*$ does against it
- Intuition: An ESS is a NE that prevents mutant strategies from invading since the mutant does worse against itself
- All strict NE are ESS, but weak and/or mixed NE might not be an ESS
- The PD game has one strict NE, so defecting is the ESS
- The Assurance Game has two pure NE (hunt stag or hunt hare), and both are ESS, but the mixed NE is not an ESS since it violates condition 2
- If enough players mutate at once, then they can change the ESS
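The two ESS conditions can be checked numerically for symmetric 2x2 games. The payoff numbers for the PD and the Assurance (Stag Hunt) game are illustrative assumptions:

```python
# Numerical check of the ESS conditions: sigma* is an ESS against mutant
# sigma if u(sigma*, sigma*) > u(sigma, sigma*), or the two tie and
# u(sigma*, sigma) > u(sigma, sigma). Payoff numbers are illustrative.

def u(p, q, A):
    """Expected payoff of mixed strategy p against q (2x2 payoff matrix A)."""
    return sum(p[i] * A[i][j] * q[j] for i in range(2) for j in range(2))

def is_ess(star, A, mutants, eps=1e-9):
    """Check sigma* against a list of candidate mutant strategies."""
    for m in mutants:
        if m == star:
            continue
        a, b = u(star, star, A), u(m, star, A)
        if a > b + eps:
            continue                      # condition 1: strictly better vs itself
        if abs(a - b) <= eps and u(star, m, A) > u(m, m, A) + eps:
            continue                      # condition 2: beats the mutant on itself
        return False
    return True

pd = [[3, 0], [5, 1]]                     # PD: Defect is a strict NE
print(is_ess((0.0, 1.0), pd, [(1.0, 0.0), (0.0, 1.0)]))  # True: Defect is the ESS

STAG, HARE = (1.0, 0.0), (0.0, 1.0)
assurance = [[4, 0], [3, 3]]              # Assurance (Stag Hunt) payoffs
mixed = (0.75, 0.25)                      # the mixed NE of this game
mutants = [STAG, HARE, mixed]
print(is_ess(STAG, assurance, mutants))   # True: strict NE -> ESS
print(is_ess(HARE, assurance, mutants))   # True
print(is_ess(mixed, assurance, mutants))  # False: fails condition 2
```

Against a pure Stag mutant, the mixed NE ties on condition 1 but loses condition 2 ($u(\sigma^*, S) = 3.75 < u(S, S) = 4$), which is exactly why it is not an ESS.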
- Replicator Dynamic (RD): A model that represents the dynamics of a population; individuals "replicate" themselves at the end of each period based on the payoff they received
- Let $n_{it}$ be the number of replicators with type (strategy) $s_i$; the total population is $N_t = \sum n_{it}$, and the proportion of type $s_i$ is $x_{it}$, with $x_t$ being the distribution of types in the population
- Reproduction: $n_{it+1} = u(s_i, x_t)n_{it}$, $x_{it+1} = \frac{u(s_i, x_t)}{\bar{u}(x_t)}x_{it}$; proportion of a type depends on its average fitness compared to the population’s average fitness
- Base model has selection, but not mutation
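The reproduction equation above can be simulated directly for the one-shot PD. The stage payoffs (3, 0, 5, 1) and the baseline fitness of 1 are illustrative assumptions:

```python
# Discrete replicator dynamic for the one-shot PD: each type's share grows
# in proportion to its fitness relative to the population average.

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def step(x_c):
    """One replicator step on the share x_c of cooperators."""
    x_d = 1 - x_c
    f_c = 1 + PAYOFF[("C", "C")] * x_c + PAYOFF[("C", "D")] * x_d
    f_d = 1 + PAYOFF[("D", "C")] * x_c + PAYOFF[("D", "D")] * x_d
    mean = x_c * f_c + x_d * f_d
    return x_c * f_c / mean

x = 0.9                      # start with 90% cooperators
for t in range(200):
    x = step(x)
print(round(x, 6))           # 0.0: defection takes over; All D is the fixed point
```

Even from a 90% cooperative start, defectors are always fitter ($f_d - f_c = 1 + x_c > 0$), so the system converges to the monomorphic All D fixed point.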
- Fixed Point: A point in a RD where the system does not change (and will thus never change again); $x_{it+1} = x_{it}$
- Monomorphic distributions are fixed points
- A fixed point is asymptotically stable if, after any movement away from the point, the system moves back to the point
- Unstable: system moves farther away after shock; Stable: system moves partly back after shock
- Evolutionary Equilibrium (EE): A fixed point that is asymptotically stable under RD
- Considers a wider range of small invasions as opposed to one single invader
Case Study: PD and Tit-For-Tat
- Environment: A finitely-repeated PD game with RD
- With two strategies (All D or All C), the unique NE is to defect in all periods, so All D is the ESS and EE
- Adding TFT into the strategy set adds another NE
- Not an ESS since it fails the second condition, but is an EE due to its asymptotic stability
- TFT does well against itself, doesn’t get punished by all defect, and can cooperate with all cooperate, making it stable
- Phase diagram with 3 players can be represented with a triangle; corners are monomorphic fixed points, edge points have two types, interior points have three types
- Edges will either go towards a fixed point, go towards two fixed points, or consist entirely of fixed points
- If going towards both corners (or inwards), then there exists a fixed point somewhere in the middle of the edge; the All D and TFT edge has a fixed point at $d=\frac{1}{2}$
- Interior points can be analyzed by seeing which strategy grows faster and when
- All D is a dominant strategy in most initial distributions, as TFT can only dominate if there are few All D types and lots of All C types
- Since there are an infinite number of strategies, we should include any that can be an EE and not include any that are strictly dominated
- Weakly dominated strategies should be included if plausible and interesting
- Cognitively advanced: A strategy that is able to analyze what type they are playing against and update their actions accordingly
- In infinitely repeated PD, cognitively advanced types get fitter due to their ability to cooperate
- Compare this to the one-shot PD where cooperation cannot arise due to the lack of complex strategies
- Evolution favors the cognitively advanced, allowing for norm followers to evolve
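The case study above can be simulated with a three-type replicator dynamic. The match length (10 rounds), stage payoffs (3, 0, 5, 1), and initial distributions are illustrative assumptions:

```python
# Replicator dynamic for the finitely repeated PD (10 rounds) with three
# strategies: All D, All C, and Tit-for-Tat.

T = 10  # rounds per match

def match(a, b):
    """Total payoff to strategy a against strategy b over T rounds."""
    stage = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    last_a, last_b = "C", "C"   # TFT opens with cooperation
    total = 0
    for t in range(T):
        move_a = "D" if a == "AllD" else ("C" if a == "AllC" else last_b)
        move_b = "D" if b == "AllD" else ("C" if b == "AllC" else last_a)
        total += stage[(move_a, move_b)]
        last_a, last_b = move_a, move_b
    return total

STRATS = ["AllD", "AllC", "TFT"]
PAY = {(a, b): match(a, b) for a in STRATS for b in STRATS}

def evolve(x, steps=1000):
    for _ in range(steps):
        fit = {s: sum(PAY[(s, o)] * x[o] for o in STRATS) for s in STRATS}
        mean = sum(x[s] * fit[s] for s in STRATS)
        x = {s: x[s] * fit[s] / mean for s in STRATS}
    return x

coop_start = evolve({"AllD": 0.1, "AllC": 0.3, "TFT": 0.6})
selfish_start = evolve({"AllD": 0.9, "AllC": 0.05, "TFT": 0.05})
print(coop_start["AllD"] < 1e-4)     # True: TFT + All C crowd defectors out
print(selfish_start["AllD"] > 0.99)  # True: from a defector-heavy start, All D wins
```

The two runs illustrate the phase-diagram logic: which corner the system reaches depends on the initial distribution, and TFT only prevails when All D starts rare.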
Non-Uniform Matching
- The standard RD model uses uniform matching; probability of one type interacting with another depends on the proportion of the other type in the population
- Non-uniform matching occurs more in real life; people will match more often with people who are close to them (family, neighbors, etc.)
- Positive assortative matching: Individuals of the same type are more likely to be matched; dating partners, college admissions, etc.
- Negative assortative matching: Individuals of the same type are less likely to be matched
- Either type of matching can be good
- Fixed matching: Matching due to circumstances that don't involve choice, like family or location
- Non-uniform matching allows for more cooperation due to higher chance of cooperators meeting
- In a RD with PD, cooperation is more likely with (strong) positive assortative matching
- Setup: The chance that a cooperator meets another cooperator is higher than the base rate; C-C pairings rise from $c^2N$ to $acN$ with $c < a < 1$
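Positive assortment can be added to the replicator dynamic with a single parameter. Here $r$ is an assortment parameter, an assumption for illustration: a cooperator meets another cooperator with probability $r + (1-r)x$ instead of the population share $x$, and the PD payoffs (3, 0, 5, 1) are also illustrative:

```python
# Replicator dynamic for the PD with positive assortative matching.

def step(x, r):
    """One replicator step on the cooperator share x, assortment r in [0, 1]."""
    p_cc = r + (1 - r) * x        # cooperator's chance of meeting a cooperator
    p_dc = (1 - r) * x            # defector's chance of meeting a cooperator
    f_c = 1 + 3 * p_cc + 0 * (1 - p_cc)
    f_d = 1 + 5 * p_dc + 1 * (1 - p_dc)
    mean = x * f_c + (1 - x) * f_d
    return x * f_c / mean

for r in (0.0, 0.8):
    x = 0.5
    for _ in range(300):
        x = step(x, r)
    print(r, round(x, 3))
# r=0.0 (uniform matching): cooperation dies out -> 0.0
# r=0.8 (strong assortment): cooperation takes over -> 1.0
```

With $r = 0.8$, cooperators are fitter at every population share ($f_c - f_d = 1.4 - 0.2x > 0$), so strong assortment flips the outcome of the uniform-matching PD.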
- Individuals can avoid interacting with others, i.e. cooperator avoids defector and exits the interaction
- Setup: Add another strategy to PD, $A$, such that $u(D,D) < u(A,A) < u(C,C)$
- There are evolutionary pressures to choose a strategy of avoidance
- Fixed networks can be represented by circle networks (graph) where each player interacts only with their network neighbors
- Offspring are born in the same node
- Population is fixed and interactions are local; use Imitation Dynamic (ID) where the offspring mimics the type whose average within the neighborhood was the highest
- In a circle network that plays PD with clustered types, the D types will never switch since they are guaranteed to stay the same at the boundary; entire network won’t change if benefit from cooperating is high enough
- If no clumping, C types will get dominated by D types due to lack of cooperation
- With stronger clumping, C types can continue as C types with a lower benefit from cooperation, and D types can actually change
- Full conversion of D types is difficult because if D types are surrounded by C types, they will do very well
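The circle-network Imitation Dynamic can be sketched with a donation-game PD: cooperators pay cost $c$ to give each neighbor benefit $b$. All parameter values and the keep-your-type-on-ties rule are assumptions for illustration:

```python
# Imitation dynamic on a circle network playing a donation-game PD. Each
# period, every player adopts the type with the highest average payoff in
# its neighborhood (itself plus its two neighbors), keeping its own type
# on ties.

def payoffs(types, b, c):
    n = len(types)
    out = []
    for i in range(n):
        p = 0.0
        for j in (i - 1, (i + 1) % n):    # negative index wraps the circle
            if types[i] == "C":
                p -= c                     # pay the cost toward each neighbor
            if types[j] == "C":
                p += b                     # receive benefit from a cooperating neighbor
        out.append(p)
    return out

def step(types, b, c):
    n, pay = len(types), payoffs(types, b, c)
    new = []
    for i in range(n):
        hood = [i - 1, i, (i + 1) % n]
        avg = {}
        for t in ("C", "D"):
            members = [j for j in hood if types[j] == t]
            if members:
                avg[t] = sum(pay[j] for j in members) / len(members)
        other = "D" if types[i] == "C" else "C"
        # switch only if the other type's neighborhood average is strictly higher
        if other in avg and avg[other] > avg[types[i]]:
            new.append(other)
        else:
            new.append(types[i])
    return new

cluster = ["C"] * 6 + ["D"] * 6            # a clump of cooperators among defectors
for b in (2, 4):
    types = cluster[:]
    for _ in range(20):
        types = step(types, b, 1)
    print(b, "".join(types))
# b=2: cooperators are eroded from the edges -> DDDDDDDDDDDD
# b=4: the clustered configuration is frozen -> CCCCCCDDDDDD
```

With a low benefit the cluster is eaten away one boundary player per step; with a high enough benefit the boundary comparisons balance and the whole network stops changing, matching the clustering bullets above.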
In-Group Favoritism
- Prosociality: Showing kindness to others
- Reciprocity is selective prosociality - be nice to people who are nice to you
- In-group favoritism: Act prosocially toward members of your social group and not prosocially to those outside of the group
- Chen and Li (2009) generated groups based on which artist people liked; people in these artificial groups shared more money with people from their own group
- Grimm et al. (2017) found evidence that members in the Dictator Game will give more money to people in their own major
- Also showed that in-group favoritism is a type of social norm; people expect that others will show favoritism
- Adding a Grouper type to one-shot PD adds an ESS into the game
- Type of conditional cooperation: cooperate with other groupers
- All Groupers is the only EE due to its high fitness; shows that evolution favors in-group favoritism
- A Reciprocator type (cooperate with cooperators and defect with defectors) does worse than Grouper types, so in-group favoritism is better than general reciprocity in evolution
- R is an EE, but not an ESS
- With two groups, G1 and G2 are both ESS and EE; society evolves to become a monomorphic society
- Highlights how evolution favors dominance by a single group, biases against diversity
- These findings provide an incentive to match with people in your group, leading to segregated societies based on groups
- Evolutionary selection can be tied to groups; more fit groups are more likely to be selected for natural selection
- Criteria can include competition, defensive ability, adaptability, conflict-resolution, child-rearing practices, transmitting norms, etc
- Groups can eliminate other groups to control resources, but this causes free-rider problems where individuals in a group do not want to exert effort to eliminate enemy groups
- When groups compete for resources, base models show that the share of defectors inside of a group will grow over time, leading to a decrease in overall cooperation over time
- Adding elimination into group models allows you to show that cooperation can increase over time under the RD
- Population of groups will converge to having a rate of cooperation equal to the rate of cooperation in the most cooperative group
- With random winners via a contest winner function and a randomly generated imperfect copy of the winning group, cooperation is expected to increase and shows that in-group cooperation is selected over defection
- Despite individual selection where the cooperation rate goes down from $c_g$ to $\delta c_g$, average cooperation can still increase if individual-level selection ($\delta$) is weaker than group-level selection (difference between $c_i$ and $c_j$)
- High frequency of conflict is one way to increase group-level selection
- Both individual-level and group-level selection can bolster the rate of cooperation
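A toy version of the group-selection argument above can be simulated. Replacing the least cooperative group with a copy of the most cooperative one is a deterministic simplification of the contest-winner mechanism; the decay factor, group count, and initial rates are illustrative assumptions:

```python
import random

# Toy model of group-level selection (seeded for reproducibility). Each
# period, individual-level selection shrinks every group's cooperation
# rate by delta; with conflict switched on, the least cooperative group
# is then replaced by a copy of the most cooperative one (a deterministic
# simplification of the inter-group contest).

def simulate(conflict, periods=100, delta=0.999, n_groups=20, seed=7):
    rng = random.Random(seed)
    groups = [rng.uniform(0.2, 0.8) for _ in range(n_groups)]  # cooperation rates
    for _ in range(periods):
        groups = [c * delta for c in groups]     # individual-level selection
        if conflict:
            loser = groups.index(min(groups))
            winner = groups.index(max(groups))
            groups[loser] = groups[winner]       # winning group's norms spread
    return sum(groups) / len(groups)

avg_peace = simulate(conflict=False)
avg_war = simulate(conflict=True)
print(avg_war > avg_peace)  # True: group selection offsets individual selection
```

With conflict, the population converges toward the most cooperative group's rate, so average cooperation ends higher than under within-group decay alone; this is the sense in which group-level selection can outweigh individual-level selection.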
- Inter-group cooperation is possible using a trade-or-fight model; one would trade when destruction is high and fighting prize is low (optional: returns from trade are high), and one would fight when the opposite is true
- Technological change, specialization, and more deadly weapons can lead to inter-group cooperation due to high returns to trade and higher chances of destruction