Difference between revisions of "User:Flixx/Game Theory for Recruiting"
(→Test-results) |
|||
(13 intermediate revisions by the same user not shown) | |||
Line 4: | Line 4: | ||
This problem can be solved by finding a "Mixed Strategy Nash Equilibrium". | This problem can be solved by finding a "Mixed Strategy Nash Equilibrium". | ||
− | I already implemented a Proof of Concept. It gives | + | I already implemented a Proof of Concept. It gives acceptable test-results against the default AI and "Alarantalara's recruiter". |
==Test-results== | ==Test-results== | ||
Line 10: | Line 10: | ||
Map Side 1 Side 2 Games Side 1 won Side 2 won | Map Side 1 Side 2 Games Side 1 won Side 2 won | ||
− | + | ----------------------------------------------------------------------------------- | |
− | + | Weldyn Channel rusher default 1099 62.51 % 31.67 % | |
− | + | Weldyn Channel flix default 1634 63.89 % 32.31 % | |
− | + | Weldyn Channel default flix 1583 22.11 % 72.71 % | |
− | + | Weldyn Channel flix rusher 821 41.41 % 54.57 % | |
+ | Weldyn Channel rusher flix 810 38.52 % 56.42 % | ||
+ | The Freelands flix default 1004 66.73 % 30.68 % | ||
+ | The Freelands default flix 1000 27.00 % 71.60 % | ||
+ | The Freelands flix rusher 1422 47.82 % 50.28 % | ||
+ | The Freelands rusher flix 1426 41.09 % 57.36 % | ||
+ | The Freelands rusher default 1156 62.02 % 36.59 % | ||
+ | The Freelands default default 1654 47.40 % 48.97 % | ||
==Description== | ==Description== | ||
− | The Algorithm will find the ''theoretical best mix to recruit for maximizing own payoffs by only knowing the recruitment-list of the enemy''. It is meant to be a part of the overall solution and can later | + | The Algorithm will find the ''theoretical best mix to recruit for maximizing own payoffs by only knowing the recruitment-list of the enemy''. It is meant to be a part of the overall solution and can later be weighted against the other algorithms. |
Each Unit Combination will get a Effectiveness Score (e.g. unit A is 200 effective against Unit B. This implies that unit B is -200 effective against Unit A) | Each Unit Combination will get a Effectiveness Score (e.g. unit A is 200 effective against Unit B. This implies that unit B is -200 effective against Unit A) | ||
Line 32: | Line 39: | ||
B -30 90 | B -30 90 | ||
− | The best thing to do for P1 is here to recruit A and B in proportion 3:1. Because 3 * 40 - 30 = 3 * 0 + 90 = 90 | + | The best thing to do for P1 is here to recruit A and B in proportion 3:1. Because |
+ | |||
+ | 3 * 40 - 1 * 30 = 90 | ||
+ | 3 * 0 + 1 * 90 = 90 | ||
(Then Unit X and unit Y are both -90 effective in the battlefield) | (Then Unit X and unit Y are both -90 effective in the battlefield) | ||
− | Game Theory gives us nice tools to find best mixes and *the optimal strategy*. | + | Game Theory gives us nice tools to find best mixes and *the optimal strategy* aka Nash Equilibrium. |
− | Note that this whole thing is heavily influenced by the algorithm that gives us those Effectivness Scores. I use a slightly | + | Note that this whole thing is heavily influenced by the algorithm that gives us those Effectivness Scores. I use a slightly improved version of |
int recruitment_phase::average_resistance_against(const unit_type& a, const unit_type& b). | int recruitment_phase::average_resistance_against(const unit_type& a, const unit_type& b). | ||
Line 82: | Line 92: | ||
I will now try to explain the concept of a Nash Equilibrium. | I will now try to explain the concept of a Nash Equilibrium. | ||
− | (Game Theory is cool stuff, take a course | + | (Game Theory is cool stuff, take a course if you can ;) ) |
==Game Theory and Nash Equilibrium== | ==Game Theory and Nash Equilibrium== | ||
Line 159: | Line 169: | ||
The Algorithm how to find this mixed Strategy Nash Equilibrium was described by J.D.Williams and can be found in [http://www.rand.org/content/dam/rand/pubs/commercial_books/2007/RAND_CB113-1.pdf The Compleat Strategyst] from 1986 on on Site 219. (It can be read like a cookbook - I implemented it exactly like described.) | The Algorithm how to find this mixed Strategy Nash Equilibrium was described by J.D.Williams and can be found in [http://www.rand.org/content/dam/rand/pubs/commercial_books/2007/RAND_CB113-1.pdf The Compleat Strategyst] from 1986 on on Site 219. (It can be read like a cookbook - I implemented it exactly like described.) | ||
+ | |||
+ | For the Code see [https://github.com/flixx/wesnoth-old/tree/game_theory/src/ai/flix my GitHub Fork] |
Latest revision as of 13:55, 18 May 2013
(Part of a GSoC Proposal 2013)
This site is an explanation how the Recruitment Phase could make use of Game Theory. This can be especially useful in context of a Counter-Counter Strategies. This is the question "What units should Player 1 in the first turn recruit so that Player 2 cannot counter those units easily".
This problem can be solved by finding a "Mixed Strategy Nash Equilibrium". I already implemented a Proof of Concept. It gives acceptable test-results against the default AI and "Alarantalara's recruiter".
Contents
Test-results
For my Test's I modified the maps I used (I enlarged the castles to 10 free hexes and removed all villages) and set the start-gold to 200 to have a perfect "Counter-Counter situation". Side 1 always starts.
Map Side 1 Side 2 Games Side 1 won Side 2 won ----------------------------------------------------------------------------------- Weldyn Channel rusher default 1099 62.51 % 31.67 % Weldyn Channel flix default 1634 63.89 % 32.31 % Weldyn Channel default flix 1583 22.11 % 72.71 % Weldyn Channel flix rusher 821 41.41 % 54.57 % Weldyn Channel rusher flix 810 38.52 % 56.42 % The Freelands flix default 1004 66.73 % 30.68 % The Freelands default flix 1000 27.00 % 71.60 % The Freelands flix rusher 1422 47.82 % 50.28 % The Freelands rusher flix 1426 41.09 % 57.36 % The Freelands rusher default 1156 62.02 % 36.59 % The Freelands default default 1654 47.40 % 48.97 %
Description
The Algorithm will find the theoretical best mix to recruit for maximizing own payoffs by only knowing the recruitment-list of the enemy. It is meant to be a part of the overall solution and can later be weighted against the other algorithms.
Each Unit Combination will get a Effectiveness Score (e.g. unit A is 200 effective against Unit B. This implies that unit B is -200 effective against Unit A)
After P1 recruited a mix one could sum those Effectiveness Scores up and then P2 could say about own units something like "My unit X is now 1400 effective in the battlefield, unit Y is -200 effective, ...").
We want to achieve that those summed scores are equal.
Example:
P1/P2 X Y A 40 0 B -30 90
The best thing to do for P1 is here to recruit A and B in proportion 3:1. Because
3 * 40 - 1 * 30 = 90 3 * 0 + 1 * 90 = 90
(Then Unit X and unit Y are both -90 effective in the battlefield)
Game Theory gives us nice tools to find best mixes and *the optimal strategy* aka Nash Equilibrium.
Note that this whole thing is heavily influenced by the algorithm that gives us those Effectivness Scores. I use a slightly improved version of
int recruitment_phase::average_resistance_against(const unit_type& a, const unit_type& b).
I tested the different versions (old average_resistance_against against my average_resistance_against):
Map Side 1 Side 2 Games Side 1 won Side 2 won Weldyn_Channel flix (improved) flix (old) 449 43.43 % 50.11 % Weldyn_Channel flix (old) flix (improved) 465 36.34 % 53.55 %
For now I altered the way the different attacks are weighted against each other and included a weight for the cost of a unit. But I'm sure that this function can be improved further.
(I will do it in context of improving Combat Analyses. And refactor it, the variable names are kind of confusing...)
Often it is not possible and even unnecessary to make a equal mix. See the following example:
P1/P2 X Y A 100 -5 B -10 -10
So obviously P1 should only recruit Unit A, even if P2 can "easily counter" with unit Y. But recruiting a mix would be not the best thing to do here. So in my algorithm it will sometimes happen that only one unit will be recruited. This is maybe not something one would expect for a good Counter-Counter strategy. But if we assume that those effectiveness-scores are accurate then recruiting only one unit is sometimes the theoretical best thing to do.
This is how a full table will look like (example output of my algorithm).
P1/P2 |Elvish Archer|Elvish Fighter|Elvish Scout|Elvish Shaman| Mage |Merman Hunter|Wose Dwarvish Fighter | 88.75 | 31.32 | 140 | 216.2 |132.2 | 156 | 24.91 Dwarvish Guardsman | 38.38 | 0.5865 | 102.8 | 124.9 | 44.21 | 81.05 | -70.89 Dwarvish Thunderer | 24.65 | -31.76 | 122.7 | 158.4 | 52.51 | 85.95 | -83.43 Dwarvish Ulfserker | -28.75 | -84 | 41.24 | 109.6 | 1.92 | 27.15 | -78.8 Footpad | -48.32 | -118.4 | 12.94 | 84.77 |-78.36 | 2.714 |-135 Gryphon Rider | 9.75 | -46.45 | 73.88 | 133.3 | 17.18 | 69.67 | -83.28 Poacher | 6.681 | -58.5 | 121.2 | 164.2 | 36.99 | 75.7 |-156.6 Thief | -54.84 | -143 | 21.39 | 112.2 |-90.87 | 10 |-112.2
And acctually according to this table P1 have to assume that P2 will always recruit a Wose because a Wose would be the best choice against all units. And then we should decide only to recruit Dwarvish Fighter (because they are best against Woses). One could now say that P1 could do better when P2 will recruit not only Woses. This is definitely true.
The important thing is that P1's worst case is minimized. The worst case is still that P2 will only recruit Woses. If P1 would recruit other units than Dwarvish Fighters the worst case would alway be ... well ... worse.
Note that this Game Theory algorithm I implemented for finding the Nash Equilibrium cannot be improved as such. The Game represented by a table like above just have its deterministic Nash Equilibrium. It's well defined. (Zero-Sum Games always have one or more Equilibrias with the same Value).
Note also that we cannot always recruit like we should according to the mixed strategy. For example if the strategy is 13.4% Unit A and 86.6% Unit B then most likely we'll have to round. Rounding is bad when dealing with Nash equilibria. See below for more thoughts on this.
I will now try to explain the concept of a Nash Equilibrium. (Game Theory is cool stuff, take a course if you can ;) )
Game Theory and Nash Equilibrium
I will provide 2 brief general examples to help understanding the concept of Nash Equilibrium and mixed strategies.
The whole thing is about 2 Players Playing a Game. Each Player have different options available and to each option combination there is a payoff for each player. The Players will play simultanously. So they don't know what the opponent have played. (Recruitment does't work like this because P2 can see the recruits of P1, but below is a explanation why this can be a good model for wesnoth)
Games can be shown in tables like this:
P1/P2 X Y Z A -3 -1 -2 B 10 -5 3 C 5 1 2
This is a example for a zero-sum game. (Usually there are two entries for each combination. One entry for the payoff of each player.)
For example when Player one Plays B and Player 2 plays Z, Player 1 will get a Payoff of 5 and Player 2 a payoff of -5.
The question is now what should player 1 play to maximize his payoff? He could play B and hope P2 will play X to get the 10. But most likely P2 will not play X because other option seem nicer for P2. The best thing P1 could do is to play C and go for the 1 Payoff.
The Stategy P1:C / P2:Y is called Nash Equilibrium.
One could say a Nash Equilibrium is achieved when after the Game both players can say that they did the best choice they could knowing the enemy's choice now. If P2 had known that P1 will play C then Y would still be the best choice. And if P1 had known that P2 will play Y then C would still be the best choice.
Now let's take a look at this example:
P1/P2 X Y A 1 -1 B -1 1
Now what? It seems like there is no Nash Equilibrium, because one player can always say "if I had known that my opponent plays this, I had done something different". The right thing to do is actually to randomize over the options. So P1 will play 50% A and 50% B. This is called a mixed strategy. This is a Nash Equilibrium as well because P2 could say "now that I know that P1 will randomize with 50% the best thing to do is to randomizing with 50% as well".
So while recruiting we try to find a mixed strategy.
Advanced stuff and some drawbacks
One could argue, that finding a Nash Equilibrium is not the appropriate way to find the best strategy, because P2 will know what we have recruited. (When finding a Nash Equilibrium it is usually assumed that both players choose a option without knowing the opponents choice) So one could argue that this is actually something whats called a perfect information Game. And those Games are usually solved by a minmax approach.
A minmax algorithm would indeed give us the best result but it would be expensive because one would have to iterate over all possible recruiting mixes.
But a Nash Equilibrium is actually a nice way to approach the perfect mix. Let's model a Game of wesnoth in a different perspective to make this clear: Each player can recruit let's say 100 units at the beginning (P2 will see the choices of P1). Now each player can send only one random unit at a time into the battlefield. The two units will fight and the players get some payoffs.
This is actually not a to bad model for wesnoth:
Assume P1 has a unit A and B, P2 has a unit X which is good against A and bad against B and a unit Y which is good against B and bad against A. Now what? The odds of the games will decide which units will fight against each other. So it should be not to bad when we assume that the units on the battlefield will be chosen at random for a fight.
So by recruiting we are not playing the Game itself but only telling the opponent with what probabilities he will encounter which unit in the battlefield. So one could say our recruited mix as the mixed strategy we've chosen. The enemy can know our strategy without being in a perfect information game.
Now a drawback: If we only have little money available (and can only recruit 1 or 2 units) than our strategy gets inaccurate. Look at this example
P1/P2 X Y A 50 0 B -3 2
The best strategy for P1 is to recruit A and B in proportion 1:10. But if we only have one unit to recruit (and not 11) one could think that B is a good choice. But then P2 could easily counter with X. This is a extrem example. But it shows us that if the Nash Equilibrium strategy can be bad when rounded.
So one could think about actually running minmax when we can only recruit 1 or 2 units.
For those who are interested
The Algorithm how to find this mixed Strategy Nash Equilibrium was described by J.D.Williams and can be found in The Compleat Strategyst from 1986 on on Site 219. (It can be read like a cookbook - I implemented it exactly like described.)
For the Code see my GitHub Fork