Generous Tit For Tat: A Winning Strategy - Forbes

For those who need it, a short refresher on the prisoner’s dilemma: Two guys are caught just as they are about to rob a bank and separated immediately. They didn’t know each other well before they were picked up by the police, and the cops are trying to sweat them, now that they can’t communicate. The detectives say to each, separately, look, if you give him up, you, as a helpful state’s witness, can go free, and he’ll get 10 years for being the mastermind. If he rats you out, you’ll get the 10 years, and he’ll go free. If you implicate each other, you’ll both be equally guilty and get five years apiece. If neither of you rats the other out, then we’ll nail you both on “loitering,” but you’ll only get six months.

So, you have four possible outcomes: 10 years, five years, six months, and walk away. But two of those will be foreclosed by whatever choice your former partner makes. If he chooses the “screw” strategy, you’ll either get 10 years or five years. If he chooses the “good faith” strategy, you’ll either walk or get six months. The problem is, if you’re trusting and get screwed, you have the worst outcome, coupled with the worst “differential” (i.e., he gets the best outcome).
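For readers who like to see the arithmetic, here is a minimal sketch of those four outcomes as a lookup table. The sentence lengths come from the story above; the labels "quiet" and "rat" and the helper best_response are illustrative names, not anything from the Radiolab segment.

```python
# Sentences in years, indexed by (my_choice, partner_choice).
# "quiet" is the good-faith choice, "rat" is the "screw" choice.
# The labels and the helper name are illustrative assumptions.
YEARS = {
    ("quiet", "quiet"): 0.5,  # both nailed on "loitering"
    ("quiet", "rat"): 10,     # I trusted him and got screwed
    ("rat", "quiet"): 0,      # I walk, he is the mastermind
    ("rat", "rat"): 5,        # equally guilty
}


def best_response(partner_choice):
    """Return the choice that minimizes my sentence for a given partner choice."""
    return min(("quiet", "rat"), key=lambda mine: YEARS[(mine, partner_choice)])


for partner in ("quiet", "rat"):
    print(f"partner plays {partner}: my best reply is {best_response(partner)}")
```

Whatever the partner does, ratting gives the shorter sentence, which is exactly why two self-interested players end up with five years each instead of six months.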

The prisoner’s dilemma plays out every day in choices that we make. The Radiolab segment highlights how President Kennedy had a similar set of choices when dealing with Nikita Khrushchev with respect to the nuclear arms race. Build weapons because they may be doing so? Don’t build weapons and hope they’ll follow? What if we do? What if we don’t? In the end, the two sides chose, narrowly, to cooperate, and humanity squeaked by.

Divorcing couples sometimes scorch the earth for each other out of spite, even though both are worse off for having made the screw choice.

A version of the prisoner’s dilemma was run as a simulation at the University of Chicago when I was at the business school there in 1976. The test was administered to both undergrads and business school students. And wouldn’t you know, when undergrads were matched against each other, it was mostly Kumbaya, and when the business students played each other, it was generally Armageddon. Natch, when the business students were set upon the undergrads, the young idealists ended up doing the hard time while the cynics skated away.

The unifying theme of the Radiolab piece is altruism, and why anyone is altruistic when the consequences can be so deleterious. The segment goes all over the place, visiting Richard Dawkins, the famous atheistic biologist (or biological atheist, if you prefer), and Charles Darwin, the originator of the theory of natural selection. But the part that interests me is toward the end, where math, simulation, and computer programming come into it.

In 1960, or thereabouts, Robert Axelrod, now a professor of political science at the University of Michigan, but then a high school student with access to the only computer at Northwestern University, was doing simulations of hypothetical life forms as part of a science project. In 1962, the time of the Cuban Missile Crisis, he set up a program to run the prisoner’s dilemma, except that instead of running it once, it was set to run many times, testing one strategy against another.

So, think of negative outcomes as if they were decrements in “health points” in a modern video game. When you lose, you lose health, but you don’t die on the first run. You and your opponent play your strategies over and over again, and over time, one or the other of you prevails.

Behind this inquiry was the highly relevant question: how can we get out of the arms race? We want to cooperate, but don’t want to get screwed.

Axelrod invited professors and other luminaries who had written theoretical papers on the prisoner’s dilemma to compete in a computer tournament. They would write a computer program that embodied their strategy, and it would be run against each of the others 200 times. Points were given and taken away, and the winner would receive a plaque.

Meet the Programs

Here are a few samples of the algorithms the professors sent.

Massive Retaliatory Strike — cooperate until the first screw, and then screw for the rest of the game.

Tester — start by attacking; if retaliation is received, then back off and start cooperating for a while; then, throw in another attack; tester is designed to see how much it can get away with.

If Tester plays Massive Retaliatory Strike, both do badly.

Jesus — one line of code: always cooperate.

Lucifer — one line of code: always screw.

If Lucifer plays Jesus, evil prevails.
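To make the matchups concrete, here is a rough sketch of three of these programs and a 200-round match. The per-round points are an assumption (the piece does not give the scoring); I use the conventional 3/5/0/1 values from the game-theory literature, where higher is better. "C" means cooperate, "D" means screw, and the function names are mine.

```python
# Assumed per-round points (the article does not state them):
# (my_points, their_points) indexed by (my_move, their_move).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def jesus(mine, theirs):
    """Always cooperate."""
    return "C"


def lucifer(mine, theirs):
    """Always screw."""
    return "D"


def massive_retaliation(mine, theirs):
    """Cooperate until the first screw, then screw for the rest of the game."""
    return "D" if "D" in theirs else "C"


def play(a, b, rounds=200):
    """Run one match and return the total points for (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(lucifer, jesus))                # (1000, 0)
print(play(massive_retaliation, lucifer))  # (199, 204)
```

Under these assumed points, Lucifer takes the maximum from Jesus every round, while against Massive Retaliatory Strike he gets one free hit and then both grind along at one point a round.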

Although Axelrod thought that the winning program would likely be tens of thousands of lines of code, that turned out not to be the case. The best strategy was Tit for Tat, with only two.

Tit for Tat — First line: be nice (never nasty first); 2nd line: do whatever the other guy did on the last move; Tit for tat retaliates only once, letting bygones be bygones.

Tit for Tat can swing both ways. It elicits cooperation if you’ve got any inclination, but doesn’t take any guff. When playing against Jesus, a virtuous cycle of cooperation prevails for all 200 rounds.

Against Lucifer, Tit for Tat plays pretty good defense.
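The two lines really are about two lines. Below is a sketch of Tit for Tat with the same kind of 200-round match, again under assumed 3/5/0/1 points per round (not given in the piece) and with function names of my own choosing.

```python
# Assumed per-round points, higher is better:
# (my_points, their_points) indexed by (my_move, their_move).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(mine, theirs):
    """Be nice first, then do whatever the other guy did on the last move."""
    return theirs[-1] if theirs else "C"


def jesus(mine, theirs):
    return "C"


def lucifer(mine, theirs):
    return "D"


def play(a, b, rounds=200):
    """Run one match and return the total points for (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, jesus))    # (600, 600)
print(play(tit_for_tat, lucifer))  # (199, 204)
```

Against Lucifer, Tit for Tat loses only the first round and then matches him blow for blow, which is what "pretty good defense" looks like in numbers. Note that it never beats its opponent in a single match; it does well by racking up high totals with anyone willing to cooperate.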

And it wins, in evolutionary terms. In the evolutionary version of the tournament, Axelrod allowed winning programs to reproduce copies of themselves according to how well they did. The simulation was run for many generations.

And the incredible thing is that (insert Don LaFontaine voice here) in a world where Lucifers dominate, a few Tit for Tat players can take back the night if there are enough of them to run into each other from time to time.
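One way to picture that takeover is a toy population with just two kinds of players. This is my own simplified sketch, not Axelrod's actual procedure: it assumes 3/5/0/1 points per round, assumes each player's fitness is its expected 200-round score against a randomly drawn member of the population, and assumes each type's share of the next generation is proportional to its fitness.

```python
# Assumed per-round points for the row player, indexed by (my_move, their_move).
POINTS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def lucifer(mine, theirs):
    return "D"


def match_score(a, b, rounds=200):
    """Total points strategy a earns in one match against strategy b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        total += POINTS[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total


def evolve(tft_share, generations=100):
    """Return the Tit for Tat share after reproduction proportional to score."""
    tt = match_score(tit_for_tat, tit_for_tat)
    tl = match_score(tit_for_tat, lucifer)
    lt = match_score(lucifer, tit_for_tat)
    ll = match_score(lucifer, lucifer)
    for _ in range(generations):
        fit_tft = tft_share * tt + (1 - tft_share) * tl
        fit_luc = tft_share * lt + (1 - tft_share) * ll
        mean = tft_share * fit_tft + (1 - tft_share) * fit_luc
        tft_share = tft_share * fit_tft / mean
    return tft_share


print(evolve(0.05))   # a 5% minority grows toward the whole population
print(evolve(0.001))  # too few to meet each other, so the share shrinks
```

In this toy model the "enough of them to run into each other" condition shows up as a threshold: a 5 percent minority of Tit for Tat players takes over, while a 0.1 percent minority fades away.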

The one remaining problem with Tit for Tat is that a match that sets Tit for Tat against Lucifer (or even an Aggressive Tit for Tat with a first line that initiates an attack) will result in horrible carnage. The massive retaliation echoes and echoes until nothing is left but dust.
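The echo is easiest to see against the aggressive variant. Here is a short sketch that just records the moves; the function names are mine, and Aggressive Tit for Tat is assumed to be plain Tit for Tat with an opening attack.

```python
def tit_for_tat(mine, theirs):
    """Be nice first, then copy the other side's last move."""
    return theirs[-1] if theirs else "C"


def aggressive_tit_for_tat(mine, theirs):
    """Attack first, then copy the other side's last move (assumed variant)."""
    return theirs[-1] if theirs else "D"


def moves(a, b, rounds=8):
    """Return the move sequences of a and b as strings of C (cooperate) and D (screw)."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)


print(moves(tit_for_tat, aggressive_tit_for_tat))
# ('CDCDCDCD', 'DCDCDCDC')
```

One opening attack bounces back and forth forever: each side is always punishing the hit it took the round before, and the two never land on mutual cooperation.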

So, one little tweak to Tit for Tat optimizes the program. This is where we get Generous Tit for Tat. The second line of code (the one that says, do whatever the other guy did last time) gets modified to not always retaliate, but nearly always retaliate. Mathematically, that translates to not retaliating one in 10 times after sustaining an attack. This mod stops the echoes.
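Here is roughly what that tweak looks like, as a sketch rather than the program anyone actually submitted. The factory make_generous_tit_for_tat, its forgiveness parameter, and the injected random generator are my own hypothetical names; the only number taken from the piece is the one-in-10 chance of letting an attack slide.

```python
import random


def aggressive_tit_for_tat(mine, theirs):
    """Attack first, then copy the other side's last move (assumed variant)."""
    return theirs[-1] if theirs else "D"


def make_generous_tit_for_tat(forgiveness=0.1, rng=None):
    """Build a Tit for Tat that skips retaliation with probability `forgiveness`."""
    rng = rng or random.Random()

    def strategy(mine, theirs):
        if not theirs:
            return "C"  # be nice first
        if theirs[-1] == "D" and rng.random() < forgiveness:
            return "C"  # one time in 10, let the attack go unanswered
        return theirs[-1]  # otherwise do what the other guy did

    return strategy


def moves(a, b, rounds=200):
    """Return the move sequences of a and b as strings of C (cooperate) and D (screw)."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)


generous = make_generous_tit_for_tat(forgiveness=0.1, rng=random.Random(0))
mine, theirs = moves(generous, aggressive_tit_for_tat)
print(mine[:40])
print(theirs[:40])
```

Against the aggressive variant, the generous player gets a chance to forgive every second round. The first time it does, both sides have just cooperated, both keep copying that cooperation, and the echo is gone for the rest of the match.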

“For every nine parts Moses, you need one part Jesus,” as the Radiolab hosts put it. “This is a strategy that just seems to be woven into the fabric of the cosmos. It works for computers. It works for people. It probably works for amoebas. It just works.”

© 2011 Endpoint Technologies Associates, Inc. All rights reserved.

Twitter: RogerKay