Where are these spikes coming from?
A quick review: I am simulating the evolution of strategies for playing Prisoner's Dilemma. I have a parameterized version of the classic strategy TIT FOR TAT. At each generation, I have 100 strategies, i.e. 100 different combinations of values of the four parameters that define a strategy. Each strategy is played against every other strategy for 100 or 300 iterations. Each strategy accumulates a total score across all these iterations and opponents. After all the play of one generation, the lowest-scoring half of the strategies is eliminated. The remaining strategies, those that scored in the top half, are then combined in a randomized way to generate enough new strategies to bring the total back to 100, and a new round of play begins.
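As a rough sketch, one generation of the loop described above might look like the following. The function name `run_generation` and the two callables `play_match` and `combine` are hypothetical stand-ins: how a match is scored and how two survivors are combined are left to the caller here.

```python
import random
from itertools import combinations

def run_generation(population, play_match, combine, rng):
    """One generation: round-robin play, cull the bottom half, refill.

    play_match(a, b) -> (score_a, score_b) and combine(a, b, rng) -> child
    are hypothetical callables supplied by the caller.
    """
    totals = [0] * len(population)
    # Every strategy plays every other strategy once.
    for i, j in combinations(range(len(population)), 2):
        score_i, score_j = play_match(population[i], population[j])
        totals[i] += score_i
        totals[j] += score_j
    # Rank indices by total score, best first, and keep the top half.
    ranked = sorted(range(len(population)), key=lambda k: totals[k], reverse=True)
    survivors = [population[k] for k in ranked[: len(population) // 2]]
    # Refill to the original size from randomly chosen pairs of survivors.
    children = []
    while len(survivors) + len(children) < len(population):
        parent_a, parent_b = rng.sample(survivors, 2)
        children.append(combine(parent_a, parent_b, rng))
    return survivors + children
```

With a population of 100, this keeps 50 survivors and creates 50 children per generation. Passing in a seeded `random.Random` makes a run repeatable, which helps when trying to see whether a spike is luck or something systematic.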
The graph above is a plot of the numbers of defect-defect, defect-cooperate, and cooperate-cooperate moves for each generation. For this experiment I let each pair of strategies play each other for a run of 300 iterations, instead of 100. The frequency of spikes is thereby reduced.
The most interesting parameter seems to be this: given that the opponent has been defecting recently, with what probability should the current strategy cooperate? This probability appears to settle slowly down to around 0.05, then jump rapidly at irregular intervals to around 0.2.
I ran a strategy with this "unwarranted cooperation" parameter set to 0.1, against the classical TIT FOR TAT, for 100 runs, each of which was 100 iterations long. TIT FOR TAT never won a run of 100 iterations - at best it tied. The 20th through 80th percentile runs had the 0.1 strategy winning by 5 points. The best the 0.1 strategy did against TIT FOR TAT was to win by 15 points.
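An experiment of this shape can be sketched as below, under loud assumptions. The real strategies have four parameters; this sketch uses a simplified two-parameter stand-in (probability of cooperating after the opponent's last move was cooperation, and after it was defection), the standard 5/3/1/0 payoffs, and a cooperative opening move, none of which are stated in the post. The names `play_match` and `margins` are made up for illustration.

```python
import random

# Assumed standard payoffs (not stated in the post): (my move, their move) -> my points.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def move(strategy, opponent_last, rng):
    """Pick C or D from a simplified (p_after_cooperate, p_after_defect) strategy."""
    p_after_c, p_after_d = strategy
    p = p_after_c if opponent_last == "C" else p_after_d
    return "C" if rng.random() < p else "D"

def play_match(a, b, iterations, rng):
    """Play one run and return (score_a, score_b)."""
    # Assumption: before round 1, each side treats the other as having cooperated.
    last_a = last_b = "C"
    score_a = score_b = 0
    for _ in range(iterations):
        move_a = move(a, last_b, rng)
        move_b = move(b, last_a, rng)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        last_a, last_b = move_a, move_b
    return score_a, score_b

def margins(a, b, runs, iterations, seed=0):
    """Sorted list of (score_a - score_b) over many independent runs."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        score_a, score_b = play_match(a, b, iterations, rng)
        results.append(score_a - score_b)
    return sorted(results)

TIT_FOR_TAT = (1.0, 0.0)
HYPOTHETICAL_01 = (0.95, 0.1)  # 0.95 is a placeholder, not a value from the post
result = margins(HYPOTHETICAL_01, TIT_FOR_TAT, runs=100, iterations=100)
print("min, median, max margin:", result[0], result[len(result) // 2], result[-1])
```

Under this simplification, a strategy that never defects unprovoked would tie TIT FOR TAT in every run, so the 5-point wins reported above suggest the actual 0.1 strategy also defects first now and then. The numbers this sketch prints should not be expected to match the ones in the post.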
I then played a 0.2 strategy against the 0.1 strategy. The 0.1 strategy generally won - the best it did was to win by 110 points. The median margin was 45 points. 80 percent of the time it won by at least 25 points. But the 0.2 strategy did win at least once, by 25 points.
I think this explains the spikes, or at least starts to. The results of the playing at each generation are not well determined - the distribution of results has some significant range. There is a general drift toward defection, but now and again there is enough lucky cooperation that the cooperating strategies can survive and reproduce.
Another source of randomness is the creation of the new strategies to be added to play in the next generation. Each new strategy is constructed from two randomly chosen surviving strategies from the current generation. If by chance two relatively cooperative strategies are chosen, their child is sure also to be cooperative. The more strategies there are that are cooperative, the more they have the opportunity to increase their scores through mutual cooperation.
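The post does not say exactly how two survivors are combined, so the following is only one plausible scheme: uniform crossover, where each of the child's parameters is copied from one parent chosen at random. The function name, the position of the "unwarranted cooperation" parameter, and the example values are all hypothetical. The point it illustrates is that the child can never be less cooperative on any parameter than its less cooperative parent.

```python
import random

def uniform_crossover(parent_a, parent_b, rng):
    """Assumed scheme: take each parameter from one parent chosen at random."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))

rng = random.Random(42)
# Hypothetical four-parameter strategies; index 0 stands in for the
# "unwarranted cooperation" probability discussed in the post.
cooperative_a = (0.20, 0.9, 0.5, 0.3)
cooperative_b = (0.25, 0.8, 0.6, 0.4)
child = uniform_crossover(cooperative_a, cooperative_b, rng)
print(child)  # each entry is copied from one parent or the other
```

Averaging the two parents' values would have the same property, since every child parameter would land between the parents' values.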