Difference between revisions of "Machine Learning Recruiter"

From The Battle for Wesnoth Wiki
This page documents the new machine learning recruiter submitted as a patch for Wesnoth 1.11.0.  We describe how to run it, discuss experiments showing that the ML Recruiter achieves dramatically better performance than the RCA AI recruiter, describe known issues and suggest a development road map.
  
 
Note that the ML Recruiter is a work in progress.  We welcome feedback on it.  Please discuss it on the thread "Machine Learning Recruiter" at http://forums.wesnoth.org/viewtopic.php?f=10&t=36642.
 
  
==Why include ML Recruiter in Wesnoth?==
  
 
The ML Recruiter makes use of a small subset of the [http://waffles.sourceforge.net/ Waffles] Machine Learning toolkit, adding 13 pairs of .cpp/.h files to Wesnoth.  In addition, the neural nets used by ML Recruiter are serialized as .json files, a format that Wesnoth has not used until now.  So why is this patch worthwhile?
===Why the ML Recruiter will be great for Wesnoth===
# Performance is great.  ML Recruiter defeats RCA AI 66-73% of the time.  We suspect that this would also translate into better performance against human opponents because it also performs better against the "ron" recruiter included in [http://forums.wesnoth.org/viewtopic.php?f=10&t=34976 AI-Demos], which is a more challenging opponent.
# This superior performance is achieved with comparable "fun factor"
#* Variety of units recruited by recommended ML Recruiter is comparable to or better than RCA AI
# Easy way to adjust campaign difficulty:  Adjusting ML AI for more/less randomness makes it easier/harder to defeat
# Inclusion of ML Recruiter could lead to greater publicity and more contributors to Wesnoth
## SeattleDad plans to submit this work as a scientific paper to a conference such as [http://eldar.mathstat.uoguelph.ca/dashlock/CIG2013/ Computational Intelligence in Games]
## Others might later build on this work by, for instance, trying ML algorithms other than neural nets, adding new features, further generalizing the algorithm, etc.
## The machine learning infrastructure is not specific to recruiting and could be repurposed for, for instance, attack planning, weapon selection, and making "retreat and heal" vs. "attack" decisions
### Note that, having established the basic framework with this patch, future work on machine learning will be much easier
  
==Using ML Recruiter==
===Applying the patch===
* Get the latest version of the patch from https://gna.org/patch/?3479
* Get the Wesnoth 1.11.0 source as per http://wesnoth.org
* Apply the patch as follows:
  patch -p1  -i [path to patch file]
* Compile Wesnoth using CMake, SCons, or Xcode
  
===Playing against the ML Recruiter===
 
# From the main menu, choose "Multiplayer"
# Choose "Local Game"
# Pick a map and adjust settings as desired.  ML Recruiter has been trained with the default settings for village gold and support, but it should work fine with other settings
## Hit Okay
# For one side, choose Player/Type-->Computer Player and then either ML AI (Recommended) or ML AI (Less Variety, probably stronger)
## For the opponent, either play against it yourself (pick your name), watch it play the default AI (Computer Player-->RCA AI), or watch it play itself (pick ML AI again)
  
===Watching the ML Recruiter play a single game in nogui mode===
The following command is convenient for watching the ML Recruiter play a single game in nogui mode, which lets you quickly see the ML Recruiter's decision-making process.  In this example, the ML AI (Recommended mode) plays the Knalgan Alliance, while the default AI plays the Rebels.  Note that when run this way (with --log-info=ai/testing,ai/ml), many logging messages describing how the ML Recruiter is analyzing its options are printed to the console.

  Wesnoth --log-info=ai/testing,ai/ml --nogui --multiplayer --controller 1:ai --controller 2:ai --parm 1:gold:100 --parm 2:gold:100 --parm 1:village_gold:2 --parm 2:village_gold:2 --scenario multiplayer_Weldyn_Channel --ai-config 1:ai/ais/ml_ai.cfg --ai-config 2:ai/dev/default_ai_with_recruit_log.cfg  --side 1:"Knalgan Alliance"  --side 2:Rebels
  
===Testing the ML Recruiter in batch mode===

Testing in batch mode is easy with the new version of ai_test2.py included in the patch.  After applying the ML Recruiter patch, copy utils/ai_test/ai_test2.cfg to the directory in which you want to run the experiment.  Then edit the first line of the .cfg file, "path_to_wesnoth_binary", to point to your Wesnoth executable.  Then set faction1 and faction2 to the factions you want to experiment with and point ai_config1 at the ML configuration file you want to try out.  Finally, to make everything easier, add the following to your path:
 
  [Wesnoth_Install]/utils/ai_test/

Now you can test Wesnoth in batch as follows:

  ai_test2.py ai_test2.cfg
  
==Experimental Results==

===Win percentages for different AI Pairs===
{| border="1" cellpadding="20" cellspacing="0"
!AI1
!AI2
!Games
!Win % for AI1
!Comment
|-
|ML Recruiter 0.3 (Less Variety)
|RCA AI
|1179
|69.3%
|"Less variety/probably stronger" version.  Wins 73.4% of the time on the maps version 0.2 was trained on
|-
|ML Recruiter 0.3 (Recommended)
|RCA AI
|2363
|66.6%
|"Recommended" version.  Same version as above, but has more randomness in its choice of units.  See [http://wiki.wesnoth.org/Machine_Learning_Recruiter#The_Weighted_Random_.28Recommended.29_Recruiter documentation on weighted random recruiter]
|-
|ML Recruiter 0.3
|ML Recruiter 0.2
|1937
|58.0%
|We've made a lot of progress since version 0.2
|-
|ML Recruiter 0.3
|Ron Recruiter 0.11.4
|1186
|54.1%
|Ron Recruiter is the recruiter built into [http://forums.wesnoth.org/viewtopic.php?f=10&t=34976&start=405 AI Demos]
|-
|ML Recruiter 0.3 (Recommended)
|ML Recruiter 0.3 (Less Variety)
|1186
|49.6%
|Difference is not statistically significant, so we pick the variant with more variety as the "Recommended" version.  Note, though, that the "Less Variety" version does a bit better against the RCA AI, as you can see above.
|-
|RCA AI
|RCA AI
|
|50%
|Any AI against itself will win 50% of the time
|-
|RCA AI
|Random
|3000
|52.7%
|Interesting result.  You would expect a completely random choice to be beaten by a wider margin
|}
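
To make the "not statistically significant" remark in the table concrete, here is a minimal sketch (not part of the patch) that computes a normal-approximation 95% confidence interval for a win rate.  The function name and the choice of the normal approximation are our own assumptions; the game counts and percentages are taken from the table above.

```python
import math


def win_rate_interval(win_rate, games, z=1.96):
    """Return (low, high) bounds of a normal-approximation confidence interval.

    win_rate is a fraction in [0, 1]; games is the number of games played.
    z=1.96 corresponds to roughly 95% confidence (an assumption of this sketch).
    """
    std_err = math.sqrt(win_rate * (1.0 - win_rate) / games)
    return win_rate - z * std_err, win_rate + z * std_err


# Recommended vs. Less Variety: 49.6% over 1186 games.
low, high = win_rate_interval(0.496, 1186)
print(f"Recommended vs. Less Variety: {low:.3f} to {high:.3f}")

# Recommended vs. RCA AI: 66.6% over 2363 games.
low_rca, high_rca = win_rate_interval(0.666, 2363)
print(f"Recommended vs. RCA AI: {low_rca:.3f} to {high_rca:.3f}")
```

Under these assumptions the first interval straddles 50%, which is consistent with calling that difference insignificant, while the interval for the match-up against the RCA AI lies well above 50%.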

===Faction vs. faction win % for ML Recruiter 0.3 vs. RCA AI===

These results are for the "Recommended" version

<code>
all/data/138 $ analyze_log.py *.log

Overall Stats
AI                              Wins    Win %
default_ai_with_recruit_log     789     33.4%
ml_ai                           1574    66.6%
Totals:                         2363

                                        Wins    Loss    Win %
Drakes vs Undead                        33      39      45.8%
Drakes vs Northerners                   20      39      33.9%
Drakes vs Loyalists                     66      11      85.7%
Drakes vs Knalgan Alliance              38      40      48.7%
Drakes vs Drakes                        52      19      73.2%
Drakes vs Rebels                        46      2       95.8%
Total Drakes                            255     150     63.0%

Knalgan Alliance vs Undead              48      19      71.6%
Knalgan Alliance vs Northerners         23      48      32.4%
Knalgan Alliance vs Loyalists           40      14      74.1%
Knalgan Alliance vs Knalgan Alliance    36      23      61.0%
Knalgan Alliance vs Drakes              34      26      56.7%
Knalgan Alliance vs Rebels              46      21      68.7%
Total Knalgan Alliance                  227     151     60.1%

Loyalists vs Undead                     32      41      43.8%
Loyalists vs Northerners                17      48      26.2%
Loyalists vs Loyalists                  58      5       92.1%
Loyalists vs Knalgan Alliance           53      19      73.6%
Loyalists vs Drakes                     52      10      83.9%
Loyalists vs Rebels                     30      29      50.8%
Total Loyalists                         242     152     61.4%

Northerners vs Undead                   59      6       90.8%
Northerners vs Northerners              49      21      70.0%
Northerners vs Loyalists                51      9       85.0%
Northerners vs Knalgan Alliance         60      7       89.6%
Northerners vs Drakes                   56      14      80.0%
Northerners vs Rebels                   55      5       91.7%
Total Northerners                       330     62      84.2%

Rebels vs Undead                        42      14      75.0%
Rebels vs Rebels                        51      15      77.3%
Rebels vs Loyalists                     71      8       89.9%
Rebels vs Knalgan Alliance              50      15      76.9%
Rebels vs Drakes                        44      22      66.7%
Rebels vs Northerners                   21      40      34.4%
Total Rebels                            279     114     71.0%

Undead vs Undead                        41      37      52.6%
Undead vs Northerners                   25      55      31.2%
Undead vs Loyalists                     51      18      73.9%
Undead vs Knalgan Alliance              57      6       90.5%
Undead vs Drakes                        41      9       82.0%
Undead vs Rebels                        26      35      42.6%
Total Undead                            241     160     60.1%
</code>

===Units recruited by Recommended ML Recruiter===

Results are for Recommended AI vs. RCA AI for ML Recruiter 0.3

<code>
Grand Totals
Drakes Recruitment              Count   %
Drake Burner                    2925    28.3%
Drake Clasher                   909     8.8%
Drake Fighter                   2159    20.9%
Drake Glider                    1156    11.2%
Saurian Augur                   2197    21.3%
Saurian Skirmisher              985     9.5%
Total:                          10331

Knalgan Alliance Recruitment    Count   %
Dwarvish Fighter                1535    14.4%
Dwarvish Guardsman              543     5.1%
Dwarvish Thunderer              4171    39.0%
Dwarvish Ulfserker              767     7.2%
Footpad                         1573    14.7%
Gryphon Rider                   847     7.9%
Poacher                         677     6.3%
Thief                           576     5.4%
Total:                          10689

Loyalists Recruitment           Count   %
Bowman                          1799    15.6%
Cavalryman                      551     4.8%
Fencer                          345     3.0%
Heavy Infantryman               1109    9.6%
Horseman                        930     8.0%
Mage                            934     8.1%
Merman Fighter                  852     7.4%
Spearman                        5040    43.6%
Total:                          11560

Northerners Recruitment         Count   %
Goblin Spearman                 181     1.4%
Naga Fighter                    504     3.9%
Orcish Archer                   2480    19.1%
Orcish Assassin                 1842    14.2%
Orcish Grunt                    2524    19.5%
Troll Whelp                     4833    37.3%
Wolf Rider                      608     4.7%
Total:                          12972

Rebels Recruitment              Count   %
Elvish Archer                   1833    17.6%
Elvish Fighter                  3213    30.8%
Elvish Scout                    1086    10.4%
Elvish Shaman                   222     2.1%
Mage                            435     4.2%
Merman Hunter                   768     7.4%
Wose                            2865    27.5%
Total:                          10422

Undead Recruitment              Count   %
Dark Adept                      2460    20.7%
Ghost                           827     7.0%
Ghoul                           1474    12.4%
Skeleton                        2386    20.1%
Skeleton Archer                 3772    31.7%
Vampire Bat                     668     5.6%
Walking Corpse                  308     2.6%
Total:                          11895
</code>

===Units recruited by Recommended ML Recruiter vs Undead===
As a breakdown of the above, it's interesting to compare the unit blends that ML Recruiter 0.3 selects against the Undead with the overall totals shown above.  MLR's RCA AI opponent recruits a unit blend which consists of just the following four units:

RCA AI Recruitment for Undead:
<code>
Undead Recruitment              Count   %
Dark Adept                      3163    28.4%
Ghost                           2451    22.0%
Skeleton                        3574    32.1%
Skeleton Archer                 1941    17.4%
Total:                          11129
</code>
<strong>ML Recruiter 0.3 units recruited against the RCA AI Undead.</strong>  Notice the large increase in the number of units with impact and fire attacks, which would be effective against Skeletons, and the decrease in Orcish Assassins and Ghouls, which are ineffective against every Undead unit except Dark Adepts.

<code>
Results for enemy faction:Undead
Drakes Recruitment              Count   %
Drake Burner                    666     37.8%
Drake Clasher                   37      2.1%
Drake Fighter                   478     27.1%
Drake Glider                    339     19.2%
Saurian Augur                   91      5.2%
Saurian Skirmisher              152     8.6%
Total:                          1763

Knalgan Alliance Recruitment    Count   %
Dwarvish Fighter                352     17.2%
Dwarvish Guardsman              31      1.5%
Dwarvish Thunderer              259     12.7%
Dwarvish Ulfserker              170     8.3%
Footpad                         945     46.2%
Gryphon Rider                   153     7.5%
Poacher                         66      3.2%
Thief                           70      3.4%
Total:                          2046

Loyalists Recruitment           Count   %
Bowman                          125     6.7%
Cavalryman                      45      2.4%
Fencer                          65      3.5%
Heavy Infantryman               533     28.6%
Horseman                        20      1.1%
Mage                            538     28.8%
Merman Fighter                  103     5.5%
Spearman                        437     23.4%
Total:                          1866

Northerners Recruitment         Count   %
Goblin Spearman                 24      1.1%
Naga Fighter                    60      2.8%
Orcish Archer                   778     36.9%
Orcish Assassin                 43      2.0%
Orcish Grunt                    202     9.6%
Troll Whelp                     952     45.1%
Wolf Rider                      51      2.4%
Total:                          2110

Rebels Recruitment              Count   %
Elvish Archer                   92      6.9%
Elvish Fighter                  265     19.9%
Elvish Scout                    83      6.2%
Elvish Shaman                   50      3.8%
Mage                            176     13.2%
Merman Hunter                   41      3.1%
Wose                            625     46.9%
Total:                          1332

Undead Recruitment              Count   %
Dark Adept                      548     23.1%
Ghost                           56      2.4%
Ghoul                           153     6.4%
Skeleton                        777     32.7%
Skeleton Archer                 669     28.1%
Vampire Bat                     80      3.4%
Walking Corpse                  94      4.0%
Total:                          2377
</code>

==How the ML Recruiter works==

When deciding what to recruit, the ML Recruiter predicts a "metric", which is a measure of how well a given unit will do in the game in the current situation.  Choosing a good measure of a unit's usefulness is tricky, and we will discuss three different metrics below, but let's start with the easiest one, which is the sum of the following quantities for each unit:

# Experience points at the end of the game or when the unit is killed
# Number of villages captured by the unit

Note that this metric is blind to other ways a unit can help you (in particular, it doesn't know about poison, healing, and slowing).

This sum, which we'll call the "metric", is then divided by the unit cost to get metric/cost (think of this as goodness per unit of gold).  You can see this in the debugging output that the ML Recruiter prints to stderr when run with the flag --log-info=ai/testing,ai/ml:
  unit type               metric  cost    wt cost weighted metric
  Elvish Shaman           8.58    15      15.00   0.57
  Elvish Fighter          10.42   14      14.00   0.74
  Elvish Scout            8.47    18      18.00   0.47
  Wose                    17.07   20      20.00   0.85
  Mage                    10.37   20      20.00   0.52
  Elvish Archer           8.46    17      17.00   0.50
  Merman Hunter           8.55    15      15.00   0.57

This is from the first turn of a game between the Rebels and the Undead.  The ML Recruiter is predicting that if it recruits a Wose now, the Wose will end with 17.07 XP + village captures.  17.07/20 = 0.85, which is the highest weighted metric at this time, so it picks a Wose as its top choice.
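
The selection step described above can be sketched in a few lines of Python.  This is an illustration only, not the code in the patch: the function names are hypothetical, the predicted metrics are simply copied from the first-turn printout rather than produced by a neural net, and the sketch divides by the plain cost because the "wt cost" column equals the cost in that printout.

```python
# Predicted metric and gold cost per unit, copied from the first-turn printout.
predictions = {
    "Elvish Shaman": (8.58, 15),
    "Elvish Fighter": (10.42, 14),
    "Elvish Scout": (8.47, 18),
    "Wose": (17.07, 20),
    "Mage": (10.37, 20),
    "Elvish Archer": (8.46, 17),
    "Merman Hunter": (8.55, 15),
}


def weighted_metric(metric, cost):
    """Goodness per unit of gold: predicted metric divided by unit cost."""
    return metric / cost


def best_unit(preds):
    """Return the unit name with the highest metric/cost value."""
    return max(preds, key=lambda name: weighted_metric(*preds[name]))


choice = best_unit(predictions)
print(choice, round(weighted_metric(*predictions[choice]), 2))  # Wose 0.85
```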

How does it know to pick a Wose?  It looks at the "features" which describe the current situation.  Here's another chart from the same game:

  unit type               metric  cost    wt cost weighted metric
  Elvish Shaman           7.18    15      15.00   0.48
  Elvish Fighter          11.82   14      14.00   0.84
  Elvish Scout            7.91    18      18.00   0.44
  Wose                    15.53   20      20.00   0.78
  Mage                    9.11    20      20.00   0.46
  Elvish Archer           9.38    17      17.00   0.55
  Merman Hunter           8.36    15      15.00   0.56
  Side: 1 Gold: 21 Unit we want: Elvish Fighter
  PRERECRUIT:, enemy Dark Adept:1 , enemy Deathblade:1 , enemy Ghost:2 , enemy Skeleton:3 , enemy faction:Undead ,
  enemy gold:10 , enemy level3+:0 , enemy total-gold:139 , enemy unit-gold:129 , friendly Elvish Captain:1 ,
  friendly Elvish Fighter:1 , friendly Wose:4 , friendly faction:Rebels , friendly gold:21 , friendly level3+:0 ,
  friendly total-gold:161 , friendly unit-gold:140 , side:1 , terrain-forest:0.082 , terrain-mountain-hill:0.113 ,
  terrain-water-swamp:0.164 , total-gold-ratio:0.537 , turn:4 , village-control-margin:-2 , village-control-ratio:0.417 , village-enemy:7 ,
  village-friendly:5 , village-neutral:4 ,

The "features" that it sees are the values following "PRERECRUIT".  The ML AI sees that the enemy faction is the Undead and that they have one Dark Adept, one Deathblade, two Ghosts, and three Skeletons.  The Rebels currently have four Woses, one Elvish Fighter, and one Elvish Captain.  It also sees a number of other features, such as how much gold it and its opponent have, what percentage of the map is covered by different terrain, and how many friendly, neutral, and enemy villages there are.  In this situation, although it still sees that the Wose is likely to score higher on the XP + village capture metric (15.5 vs. 11.8), this isn't enough to overcome the price differential, so it chooses an Elvish Fighter as its best choice with a weighted metric of 0.84.

Note that these predictions of 15.5 vs. 11.8 are computed by the neural net based on a model built from what the algorithm has seen happen in similar situations during training.
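
If you want to inspect these features yourself, the PRERECRUIT log output is easy to parse.  The following is a rough sketch under our own assumptions: the function name is hypothetical, the wrapped log lines are assumed to have been joined into a single string first, and the field layout ("name:value" pairs separated by commas) is inferred from the printout above rather than from the patch source.

```python
def parse_prerecruit(line):
    """Parse 'PRERECRUIT:, name:value , name:value , ...' into a dict.

    Numeric values become floats; anything else (e.g. faction names)
    is kept as a string.
    """
    _, _, body = line.partition("PRERECRUIT:")
    features = {}
    for field in body.split(","):
        field = field.strip()
        if not field:
            continue  # skip the empty fields around leading/trailing commas
        # Split on the last colon so names such as "enemy level3+" stay whole.
        name, _, value = field.rpartition(":")
        try:
            features[name] = float(value)
        except ValueError:
            features[name] = value
    return features


# A shortened copy of the feature line shown above.
sample = (
    "PRERECRUIT:, enemy Dark Adept:1 , enemy Deathblade:1 , enemy Ghost:2 , "
    "enemy Skeleton:3 , enemy faction:Undead , enemy level3+:0 , "
    "friendly Wose:4 , friendly faction:Rebels , total-gold-ratio:0.537 , turn:4 , "
)
features = parse_prerecruit(sample)
print(features["enemy faction"], features["friendly Wose"], features["total-gold-ratio"])
```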
  
===Unit Goodness Metrics===
We have experimented with three different unit goodness metrics.  All of these metrics are designed to have the property that the higher the value of the metric, the better the unit performed in a given game.  Clearly there is a random element here.  In some games when playing against a Skeleton-heavy Undead army, an Elvish Archer, which uses mainly a pierce attack, may get lucky and do better than a Wose, which has an impact attack, but on average the metric should show that the Wose performs better.

The three metrics we've looked at are as follows:

====Experience Point plus Village Capture====
This is the metric used in ML Recruiter 0.2.  As noted above, it is the sum of the following quantities:

# Experience points at the end of the game or when the unit is killed
# Number of villages captured by the unit

This metric has the advantage that experience points lead to promotion, which is a very good thing.  Also, getting kills should be correlated with how much damage the unit is doing.  Adding village captures to experience points is a little flaky, but is intended to give credit to fast units, which are more likely to capture villages.

====Victory====
This metric gives a unit 1.0 if its side wins and 0 if its side loses.  The effect is that the neural net's prediction for each unit can be seen as "what is the probability of victory if I recruit this unit in this situation?"  This is the most natural of all metrics, but experimentally it hasn't performed as well in terms of leading to actual victories as recruiting based on unit-based metrics.  Performance has peaked at around a 59% win ratio for the victory metric vs. around 66-73% for the XP+VC metric.  We think the problem is that the impact of recruiting a single unit of Type A vs. Type B on the victory probability is very small, so the neural net isn't differentiating among the choices enough.

====Gold Yield====
As of ML Recruiter 0.3, this is the new default metric.  It is the sum of the following quantities, all of which are intended to quantify a unit's usefulness in terms of how much gold benefit it has yielded for the friendly side plus gold damage done to the enemy side.  This metric builds on a [http://forums.wesnoth.org/viewtopic.php?f=10&t=36642&sid=11062936e4f4c0bab673470b5d211987&start=45#p536132 suggestion] from Sapient.
# Basic Damage Metric: Target unit cost * (Damage inflicted/target max HP).  The concept is that you cost your opponent this much gold by destroying this fraction of the unit.  In any given attack, we calculate this for both the attacker and the defender.
# Village Capture: capturing_unit.variables.ml_gold_yield += wesnoth.game_config.village_income.  (Defaults to crediting 2 gold per village capture)
#* The idea, again, is that fast units tend to get more captures than slow units and this gives units credit for being fast.
# Poison: Treated the same as the Basic Damage Metric by crediting the poisoner for the amount of damage done in that turn.  On the turn in which the unit is cured, the poisoner is credited with Target Unit Cost * (8/target max HP) to reflect the damage that the unit would have healed if it hadn't been poisoned (lessened if it has less than 8 HP of damage)
# Slowing: When a unit is on defense and it slows the attacker, the defender gets no special credit because the attacker just unslows at the end of its turn.  When you slow a unit as the attacker, the slowing unit gets credit for the Basic Damage Metric accumulated by the slowed unit until it unslows (the slowed unit would otherwise have done twice as much damage, so you get credit for the damage it didn't do)
# Healing: healing_unit.variables.ml_gold_yield += Healed_unit_cost * (healed amount/healed unit max HP)
#* Directly analogous to the Basic Damage Metric
#* Note that healers also get credit for curing/stopping poison
# Walking Corpse Creation: Credit a unit which gets a kill which creates a unit due to its plague ability with 8 gold (the value of a Walking Corpse).  (not implemented)
# Leadership: Credit the leader for the bonus damage inflicted by the unit being led (not implemented)
# Maintenance: Charge units for their share of the maintenance costs, weighted by level.  Hence, level 0 units never pay maintenance.  Level 2 units pay for twice as much maintenance.  (not implemented)
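
The arithmetic behind the implemented parts of the Gold Yield metric can be sketched as follows.  This is an illustration of the formulas in the list above, not the patch's code: the function names are hypothetical, and the unit costs, damage, and hit points in the example are made-up round numbers rather than real Wesnoth unit stats.

```python
def damage_yield(target_cost, damage, target_max_hp):
    """Basic Damage Metric: gold value of the fraction of the target destroyed."""
    return target_cost * (damage / target_max_hp)


def heal_yield(healed_cost, healed_amount, healed_max_hp):
    """Healing credit: gold value of the fraction of the unit restored."""
    return healed_cost * (healed_amount / healed_max_hp)


def village_yield(village_income=2):
    """Village capture credit; 2 gold is the default mentioned in the list."""
    return village_income


# Hypothetical unit: deals 10 damage to a 20-gold target with 40 max HP,
# captures one village, and heals 8 HP on a 14-gold ally with 32 max HP.
total = damage_yield(20, 10, 40) + village_yield() + heal_yield(14, 8, 32)
print(total)  # 10.5
```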
  
===How MLR makes weighted random choices===
The recommended recruiter is defined in ai/ais/ml_ai.cfg.  It is called "Recommended" in the user interface.  Although we currently measure it as performing roughly the same as or slightly worse than the "Less variety/probably stronger" ML AI (ai/ais/ml_ai_less_random.cfg), we recommend it because it allows the player to see a greater variety of opposing units.
  
The weighted random printout looks like the following:

  Turn 13:
  unit type               metric  cost    metric/ weighted        %
                                          cost    m/c             of total
  Merman Hunter           1.10    15      0.07    0.0000          0.0%
  Wose                    1.72    20      0.09    0.0000          0.1%
  Elvish Shaman           1.66    15      0.11    0.0000          0.4%
  Elvish Archer           2.71    17      0.16    0.0000          4.0%
  Elvish Fighter          2.54    14      0.18    0.0000          8.5%
  Mage                    4.18    20      0.21    0.0001          20.1%
  Elvish Scout            4.60    18      0.26    0.0003          66.8%
  Random Number chosen was        376
  Side: 1 Gold: 27 Unit we want: Elvish Scout
  PRERECRUIT:, enemy Dark Adept:2 , enemy Revenant:1 , enemy Skeleton:1 , enemy faction:Undead , enemy gold:8 ,
  enemy level3+:0 , enemy total-gold:83 , enemy unit-gold:75 , friendly Elder Wose:1 , friendly Elvish Fighter:2 ,
  friendly Elvish Ranger:1 , friendly Mage:1 , friendly Wose:3 , friendly faction:Rebels , friendly gold:27 ,
  friendly level3+:0 , friendly total-gold:225 , friendly unit-gold:198 , side:1 , terrain-forest:0.082 ,
  terrain-mountain-hill:0.113 , terrain-water-swamp:0.164 , total-gold-ratio:0.731 , turn:13 ,
  village-control-margin:0 , village-control-ratio:0.5 , village-enemy:8 , village-friendly:8 , village-neutral:0 ,

This situation occurs towards the end of a game that the Rebels are winning.  Note that total-gold-ratio is 0.731, so it's heavily in the Rebels' favor.  (Total-gold is a side's gold plus the value of all its units, and the ratio is the friendly side's share of the combined total: 225/(225+83) = 0.731.)  The ML AI sees an Elvish Scout as being the best choice in this situation with a Mage in second place.  The Elvish Scout is probably favored because the game is likely to be won rapidly and only a fast unit will be able to reach the enemy or reach a village fast enough to add to its "experience points + village capture" metric.

The Weighted Random does the following:

# It takes every metric/cost value and raises it to the sixth power.  Why?  We want to magnify the differences.  In this example 0.26/0.21 = 1.24, but (0.26**6)/(0.21**6) = 3.60.
# We then randomly choose a unit with a probability proportional to this weighted value.  In this case, the unit chosen was an Elvish Scout.

The Less Variety/Probably Stronger AI does the same thing, but raises metric/cost to the 24th power instead of the 6th power.  This still allows for some randomness, but weights the selection much more strongly towards the more favored units.
 +
 
 +
==How to train your own ML Recruiter==
 +
utils/ai_test/run_model_and_make_new_model.py is an end-to-end script for running a whole bunch of training games of Wesnoth and then training a new model based on the data output by that run.  Documentation on this script can be seen by running
 +
run_model_and_make_new_model.py --help
 +
Note that this script assumes that [http://waffles.sourceforge.net/ the Waffles machine learning toolkit] is installed and that waffles/bin/ is in your path.
 +
 
 +
==Known issues==
 +
===Bugs===
 
# Haven't added new Waffles files to Visual C++, so it won't compile under VC++.  I need some help with this.
 
# Haven't added new Waffles files to Visual C++, so it won't compile under VC++.  I need some help with this.
# ML Recruiter runs fine vs. RCA AI and vs. human, but can't run against itself (ML Recruiter vs. ML Recruiter).  This appears to be an issue with Lua in Wesnoth 1.11 and not with ML Recruiter
+
# The default for multiplayer games is that units require only 70% of normal experience to promote, however when a game is run from the command line, it always requires normal 100% of experience to promote. Consequently MLR doesn't see units promote as much as they should in training, which would slightly distort its training data.  This is a [https://gna.org/bugs/?19895 limitation of Wesnoth], not MLR.
  
==Current Limitations==
+
===Current Limitations===
 
# Only tested on two-player multiplayer games.  Doesn't work when there are more than two leaders on the map.   
 
# Only tested on two-player multiplayer games.  Doesn't work when there are more than two leaders on the map.   
# Works optimally on the following two-player maps (trained on these)
+
# Works on all two-player maps except for Hornshark Island, Thousand Stings, Caves of the Basilisk, and Dark Forecast
## Weldyn Channel
+
# As noted above in [http://wiki.wesnoth.org/Machine_Learning_Recruiter#Gold_Yield Gold Yield Metric], we account for all special abilities available in the main-line multiplayer scenarios except for plague, leadership, and unit maintenance costs
## The Freelands
+
 
## Den of Onis
+
==ML Recruiter development roadmap==
## Fallenstar Lake
+
 
# Tested on all other two-player maps and runs nearly as well except it crashes on the following:
+
* ML Recruiter 0.1:  Initial drop
#* Aethermaw, Hornshark Island, Dark Forecast, Sablestone Delta, Elensefar Courtyard, Silverhead Crossing
+
* ML Recruiter 0.1.1:  Minor retraining of the model
# Currently writes log messages as "print" statements to stdoutI need some advice on thisI've added a new method called "ai_log_message" to core.cpp to allow Lua to write to the "ai/engine/lua" log domain, but would like some advice on whether this is a good idea
+
* ML Recruiter 0.2: 
 +
** Logging messages changed from print statements to using lg::log_domain. 
 +
** Now have an explicit debug mode by running with --log-debug=ai/ml.
 +
** ML Recruiter can play against itself. Previously could only have ML Recruiter on one side.
 +
** Some work on ML recruiting model (i.e. the core logic).  Experimented with different training strategies, but features unchanged from 0.1.
 +
* ML Recruiter 0.3 (10/25/2012) :
 +
** New "gold yield" metric for judging a unit's goodness
 +
** Several new ML features to aid in prediction:  alignment, race, time of day, map size, friendly and enemy leader hit point percentage remaining, and nearest enemy unit to friendly leader
 +
** Runs on all 2-player maps except for Hornshark Island, Thousand Stings, Caves of the Basilisk, and Dark Forecast
 +
** Greatly improved ai_test2.py script for running thousands of games to test AI and gather data for the neural net
 +
** New script (run_model_and_make_new_model.py) for running games and building a new neural net based on the data gathered from those games
 +
** Improved performance: Defeats ML Recruiter 0.2 58% of the time
 +
* ML Recruiter 0.4 (11/11/2012):
 +
** Run on all 2-player maps (except Dark Forecast, which has a custom recruiter)
 +
** Refactor code to separate features from predicted values
 +
** Added timeout option to ai_test2.pyAlso report time statistics in analyze_log.py
 +
** Improved recruiter for the Ron recruiter.  It still underperforms the Ron recruiter on most maps when used with the other Ron CA's, though.
 +
** Move all code into [https://github.com/mattsc/Wesnoth-AI-Demos AI-Demos project on GitHub]the ML Recruiter 0.4 patch now consists, essentially, of only the C++ code modifications.
 +
* ML Recruiter 0.5 (planned)
 +
** Run on all mainline multiplayer maps
 +
** Experiment with using as the [http://forums.wesnoth.org/viewtopic.php?f=10&t=34976 AI-Demos] recruiter
 +
** Add missing special abilities (plague, leadership, and unit upkeep)
 +
** Add 95% confidence intervals to the win ratios in analyze_log.py and add measures of entropy (randomness) to analyze_recruitment.py.  Entropy is a good measure of the variety of units that a recruiter is recruiting--for game play, more is better.
  
=ML Recruiter development roadmap=
+
[[Category:AI]]

Latest revision as of 15:54, 25 October 2019

This page documents the new machine learning recruiter submitted as a patch for Wesnoth 1.11.0. We describe how to run it, discuss experiments showing that the ML Recruiter achieves dramatically better performance than the RCA AI recruiter, describe known issues and suggest a development road map.

Note that the ML Recruiter is a work in progress. We welcome feedback on it. Please discuss it on the thread "Machine Learning Recruiter" at http://forums.wesnoth.org/viewtopic.php?f=10&t=36642.

Why include ML Recruiter in Wesnoth?

The ML Recruiter makes use of a small subset of the Waffles Machine Learning toolkit, adding 13 pairs of .cpp/.h files to Wesnoth. In addition, the neural nets used by ML Recruiter are serialized as .json files, a format Wesnoth has not previously used. So why is this patch worthwhile?

Why the ML Recruiter will be great for Wesnoth

  1. Performance is great. ML Recruiter defeats RCA AI 66-73% of the time. We suspect that this would also translate into better performance against human opponents because it also performs better against the "ron" recruiter included in AI-Demos, which is a more challenging opponent.
  2. This superior performance is achieved with comparable "fun factor"
    • Variety of units recruited by recommended ML Recruiter is comparable to or better than RCA AI
  3. Don't need to eliminate RCA AI recruiter. Campaign designers can choose to use one or the other
  4. ML Recruiter should be easier to customize than RCA AI because
    • All core logic is in Lua, which is easier to modify than existing C++
    • Performance "out of the box" on known units likely to be strong
    • When new recruitable units are introduced by campaign designers, it can be trained by running c. 600 games in two hours. The new model is included as a .json file with the campaign data
    • Plug and play architecture of machine learning "features" easily allows minor modifications to mainline recruiter or to campaign-specific recruiters
  5. Easy way to adjust campaign difficulty: Adjusting ML AI for more/less randomness makes it easier/harder to defeat
  6. Inclusion of ML Recruiter could lead to greater publicity and more contributors to Wesnoth
    1. SeattleDad plans to submit this work as a scientific paper to a conference such as Computational Intelligence in Games
    2. Others might later build on this work by, for instance, trying ML algorithms other than neural nets, adding new features, further generalizing the algorithm, etc.
    3. The machine learning infrastructure is not specific to recruiting and could be repurposed for, for instance, attack planning, weapon selection, and making "retreat and heal" vs. "attack" decisions
    4. All of the above is potentially publishable research, so Wesnoth could attract contributions from computer science graduate students
      1. Note that, having established the basic framework with this patch, future work on machine learning will be much easier

Using ML Recruiter

Applying the patch

patch -p1  -i [path to patch file]
  • Compile Wesnoth using CMake, SCons, or XCode

Playing against the ML Recruiter

  1. From the main menu, choose "Multiplayer"
  2. Choose "Local Game"
  3. Pick a map and adjust settings as desired. ML Recruiter has been trained with the default setting for village gold and support, but it should work fine on other settings
    1. Hit Okay
  4. For one side, Choose Player/Type-->Computer Player and then either ML AI (Recommended) or ML AI (Less Variety, probably stronger)
    1. For the opponent, either play against it yourself (pick your name), watch it play the default AI (Computer Player-->RCA AI), or watch it play itself (pick ML AI again)

Watching the ML Recruiter play a single game in nogui mode

The following command is convenient for watching the ML Recruiter play a single game in nogui mode, which allows you to quickly and easily see the ML Recruiter's decision-making process. In this example, we would be running the ML AI (Recommended mode) for the Knalgan Alliance, while the default AI would be playing the Rebels. Note that when run this way (with --log-info=ai/testing,ai/ml), a lot of logging messages will be printed to the console which will describe how the ML Recruiter is analyzing its options.

  Wesnoth --log-info=ai/testing,ai/ml --nogui --multiplayer --controller 1:ai --controller 2:ai --parm 1:gold:100 --parm 2:gold:100 --parm 1:village_gold:2 --parm 2:village_gold:2 --scenario multiplayer_Weldyn_Channel --parm 1:gold:100 --parm 2:gold:100 --ai-config 1:ai/ais/ml_ai.cfg --ai-config 2:ai/dev/default_ai_with_recruit_log.cfg  --side 1:"Knalgan Alliance"  --side 2:Rebels

Testing the ML Recruiter in batch mode

Testing in batch mode is easy with the new version of ai_test2.py included in the patch. After applying the ML Recruiter patch, copy utils/ai_test/ai_test2.cfg to the directory in which you want to run the experiment. Then edit the first line of the .cfg file, "path_to_wesnoth_binary" to point to your Wesnoth executable. Then adjust faction1 and faction2 to point to the factions you want to experiment with and point ai_config1 at the ML configuration file you want to try out. Finally, to make everything easier, add the following to your path:

[Wesnoth_Install]/utils/ai_test/

Now you can test Wesnoth in batch as follows:

ai_test2.py ai_test2.cfg

Experimental Results

Win percentages for different AI Pairs

AI1 | AI2 | Games | Win % for AI1 | Comment
ML Recruiter 0.3 (Less Variety) | RCA AI | 1179 | 69.3% | "Less variety/probably stronger" version. Wins 73.4% of the time on the maps version 0.2 was trained on
ML Recruiter 0.3 (Recommended) | RCA AI | 2363 | 66.6% | "Recommended" version. Same version as above, but has more randomness in its choice of units. See documentation on weighted random recruiter
ML Recruiter 0.3 | ML Recruiter 0.2 | 1937 | 58.0% | We've made a lot of progress since version 0.2
ML Recruiter 0.3 | Ron Recruiter 0.11.4 | 1186 | 54.1% | Ron Recruiter is the recruiter built into AI-Demos
ML Recruiter 0.3 (Recommended) | ML Recruiter 0.3 (Less Variety) | 1186 | 49.6% | Difference is not statistically significant, so we pick the variant with more variety as the "Recommended" version. Note, though, that the "Less variety" version does a bit better against the RCA AI, as you can see above.
RCA AI | RCA AI | - | 50% | Any AI against itself will win 50% of the time
RCA AI | Random | 3,000 | 52.7% | Interesting result. You would expect a completely random choice to be beaten by a wider margin
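
The "not statistically significant" remark above can be checked with a quick calculation. The following is a minimal sketch, assuming a simple normal-approximation (Wald) 95% interval; the function name win_rate_interval is made up for this example and is not part of analyze_log.py.

```python
import math


def win_rate_interval(win_rate, games, z=1.96):
    """Normal-approximation (Wald) confidence interval for a win rate.

    z=1.96 corresponds to a 95% interval. This is an illustrative
    helper, not a function from the ML Recruiter scripts.
    """
    stderr = math.sqrt(win_rate * (1.0 - win_rate) / games)
    return win_rate - z * stderr, win_rate + z * stderr


# Win rates and game counts copied from the table above.
results = {
    "Recommended vs. Less Variety": (0.496, 1186),
    "Recommended vs. RCA AI": (0.666, 2363),
}

for name, (rate, games) in results.items():
    low, high = win_rate_interval(rate, games)
    verdict = "includes" if low <= 0.5 <= high else "excludes"
    print(f"{name}: {low:.3f} to {high:.3f} ({verdict} 50%)")
```

Under this approximation, an interval that includes 50% means the two AIs cannot be told apart at this sample size, while an interval that excludes 50% indicates a real difference in strength.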

Faction vs. faction win % for ML Recruiter 0.3 vs. RCA AI

These results are for the "Recommended" version

all/data/138 $ analyze_log.py *.log

Overall Stats
AI                            	Wins	Win %
default_ai_with_recruit_log   	789	33.4%
ml_ai                         	1574	66.6%
Totals:                       	2363

                                        Wins    Loss    Win %
Drakes vs Undead                 	33	39	45.8%
Drakes vs Northerners            	20	39	33.9%
Drakes vs Loyalists              	66	11	85.7%
Drakes vs Knalgan Alliance       	38	40	48.7%
Drakes vs Drakes                 	52	19	73.2%
Drakes vs Rebels                 	46	2	95.8%
Total Drakes                     	255	150	63.0%

Knalgan Alliance vs Undead       	48	19	71.6%
Knalgan Alliance vs Northerners  	23	48	32.4%
Knalgan Alliance vs Loyalists    	40	14	74.1%
Knalgan Alliance vs Knalgan Alliance	36	23	61.0%
Knalgan Alliance vs Drakes       	34	26	56.7%
Knalgan Alliance vs Rebels       	46	21	68.7%
Total Knalgan Alliance           	227	151	60.1%

Loyalists vs Undead              	32	41	43.8%
Loyalists vs Northerners         	17	48	26.2%
Loyalists vs Loyalists           	58	5	92.1%
Loyalists vs Knalgan Alliance    	53	19	73.6%
Loyalists vs Drakes              	52	10	83.9%
Loyalists vs Rebels              	30	29	50.8%
Total Loyalists                  	242	152	61.4%

Northerners vs Undead            	59	6	90.8%
Northerners vs Northerners       	49	21	70.0%
Northerners vs Loyalists         	51	9	85.0%
Northerners vs Knalgan Alliance  	60	7	89.6%
Northerners vs Drakes            	56	14	80.0%
Northerners vs Rebels            	55	5	91.7%
Total Northerners                	330	62	84.2%

Rebels vs Undead                 	42	14	75.0%
Rebels vs Rebels                 	51	15	77.3%
Rebels vs Loyalists              	71	8	89.9%
Rebels vs Knalgan Alliance       	50	15	76.9%
Rebels vs Drakes                 	44	22	66.7%
Rebels vs Northerners            	21	40	34.4%
Total Rebels                     	279	114	71.0%

Undead vs Undead                 	41	37	52.6%
Undead vs Northerners            	25	55	31.2%
Undead vs Loyalists              	51	18	73.9%
Undead vs Knalgan Alliance       	57	6	90.5%
Undead vs Drakes                 	41	9	82.0%
Undead vs Rebels                 	26	35	42.6%
Total Undead                     	241	160	60.1%

Units recruited by Recommended ML Recruiter

Results are for Recommended AI vs. RCA AI for ML Recruiter 0.3

Grand Totals
Drakes Recruitment            	Count	%
Drake Burner                  	2925	28.3%
Drake Clasher                 	909	8.8%
Drake Fighter                 	2159	20.9%
Drake Glider                  	1156	11.2%
Saurian Augur                 	2197	21.3%
Saurian Skirmisher            	985	9.5%
Total:                        	10331

Knalgan Alliance Recruitment  	Count	%
Dwarvish Fighter              	1535	14.4%
Dwarvish Guardsman            	543	5.1%
Dwarvish Thunderer            	4171	39.0%
Dwarvish Ulfserker            	767	7.2%
Footpad                       	1573	14.7%
Gryphon Rider                 	847	7.9%
Poacher                       	677	6.3%
Thief                         	576	5.4%
Total:                        	10689

Loyalists Recruitment         	Count	%
Bowman                        	1799	15.6%
Cavalryman                    	551	4.8%
Fencer                        	345	3.0%
Heavy Infantryman             	1109	9.6%
Horseman                      	930	8.0%
Mage                          	934	8.1%
Merman Fighter                	852	7.4%
Spearman                      	5040	43.6%
Total:                        	11560

Northerners Recruitment       	Count	%
Goblin Spearman               	181	1.4%
Naga Fighter                  	504	3.9%
Orcish Archer                 	2480	19.1%
Orcish Assassin               	1842	14.2%
Orcish Grunt                  	2524	19.5%
Troll Whelp                   	4833	37.3%
Wolf Rider                    	608	4.7%
Total:                        	12972

Rebels Recruitment            	Count	%
Elvish Archer                 	1833	17.6%
Elvish Fighter                	3213	30.8%
Elvish Scout                  	1086	10.4%
Elvish Shaman                 	222	2.1%
Mage                          	435	4.2%
Merman Hunter                 	768	7.4%
Wose                          	2865	27.5%
Total:                        	10422

Undead Recruitment            	Count	%
Dark Adept                    	2460	20.7%
Ghost                         	827	7.0%
Ghoul                         	1474	12.4%
Skeleton                      	2386	20.1%
Skeleton Archer               	3772	31.7%
Vampire Bat                   	668	5.6%
Walking Corpse                	308	2.6%
Total:                        	11895

Units recruited by Recommended ML Recruiter vs Undead

As a breakdown of the above, it's interesting to look at the different unit blends that ML Recruiter 0.3 selects vs. the Undead as opposed to the overall totals shown above. MLR's RCA AI opponent recruits a unit blend which consists of just the following four units:

RCA AI Recruitment for Undead:

Undead Recruitment            	Count	%
Dark Adept                    	3163	28.4%
Ghost                         	2451	22.0%
Skeleton                      	3574	32.1%
Skeleton Archer               	1941	17.4%
Total:                        	11129
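
One way to put a number on "variety" is the Shannon entropy of the recruitment counts. The sketch below is only an illustration under that assumption; recruit_entropy is a hypothetical helper, not part of analyze_recruitment.py. It compares the ML Recruiter's Undead recruitment from the Grand Totals above with the RCA AI's Undead recruitment.

```python
import math


def recruit_entropy(counts):
    """Shannon entropy, in bits, of a {unit type: recruit count} table.

    Higher values mean recruits are spread more evenly over more unit types.
    """
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values() if n > 0
    )


# Counts copied from the Undead tables on this page.
ml_undead = {
    "Dark Adept": 2460,
    "Ghost": 827,
    "Ghoul": 1474,
    "Skeleton": 2386,
    "Skeleton Archer": 3772,
    "Vampire Bat": 668,
    "Walking Corpse": 308,
}
rca_undead = {
    "Dark Adept": 3163,
    "Ghost": 2451,
    "Skeleton": 3574,
    "Skeleton Archer": 1941,
}

print(f"ML Recruiter: {recruit_entropy(ml_undead):.2f} bits")
print(f"RCA AI:       {recruit_entropy(rca_undead):.2f} bits")
```

Because the RCA AI recruits only four unit types, its entropy cannot exceed 2 bits, whereas a recruiter that spreads over seven unit types can score higher.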

ML Recruiter 0.3 units recruited against the RCA AI Undead. Notice the large increase in the number of units with impact and fire attacks, which would be effective against Skeletons, and the decrease in Orcish Assassins and Ghouls, which are ineffective against every Undead unit except Dark Adepts.

Results for enemy faction:Undead
	Drakes Recruitment            	Count	%
	Drake Burner                  	666	37.8%
	Drake Clasher                 	37	2.1%
	Drake Fighter                 	478	27.1%
	Drake Glider                  	339	19.2%
	Saurian Augur                 	91	5.2%
	Saurian Skirmisher            	152	8.6%
	Total:                        	1763

	Knalgan Alliance Recruitment  	Count	%
	Dwarvish Fighter              	352	17.2%
	Dwarvish Guardsman            	31	1.5%
	Dwarvish Thunderer            	259	12.7%
	Dwarvish Ulfserker            	170	8.3%
	Footpad                       	945	46.2%
	Gryphon Rider                 	153	7.5%
	Poacher                       	66	3.2%
	Thief                         	70	3.4%
	Total:                        	2046

	Loyalists Recruitment         	Count	%
	Bowman                        	125	6.7%
	Cavalryman                    	45	2.4%
	Fencer                        	65	3.5%
	Heavy Infantryman             	533	28.6%
	Horseman                      	20	1.1%
	Mage                          	538	28.8%
	Merman Fighter                	103	5.5%
	Spearman                      	437	23.4%
	Total:                        	1866

	Northerners Recruitment       	Count	%
	Goblin Spearman               	24	1.1%
	Naga Fighter                  	60	2.8%
	Orcish Archer                 	778	36.9%
	Orcish Assassin               	43	2.0%
	Orcish Grunt                  	202	9.6%
	Troll Whelp                   	952	45.1%
	Wolf Rider                    	51	2.4%
	Total:                        	2110

	Rebels Recruitment            	Count	%
	Elvish Archer                 	92	6.9%
	Elvish Fighter                	265	19.9%
	Elvish Scout                  	83	6.2%
	Elvish Shaman                 	50	3.8%
	Mage                          	176	13.2%
	Merman Hunter                 	41	3.1%
	Wose                          	625	46.9%
	Total:                        	1332

	Undead Recruitment            	Count	%
	Dark Adept                    	548	23.1%
	Ghost                         	56	2.4%
	Ghoul                         	153	6.4%
	Skeleton                      	777	32.7%
	Skeleton Archer               	669	28.1%
	Vampire Bat                   	80	3.4%
	Walking Corpse                	94	4.0%
	Total:                        	2377

How the ML Recruiter works

When deciding what to recruit, the ML Recruiter predicts a "metric", which is a measure of how well a given unit will do in the game in the current situation. Choosing a good measure of a unit's usefulness is tricky, and we discuss three different metrics below, but let's start with the easiest one, which is the sum of the following quantities for each unit:

  1. Experience points at the end of the game or when the unit is killed
  2. Number of villages captured by the unit

Note that this metric is blind to other ways a unit can help you (in particular, it doesn't know about poison, healing, and slowing).

This sum, which we'll call the "metric", is then divided by the unit cost to get metric/cost (think of this as goodness per unit of gold). You can see this in the debugging output that the ML Recruiter prints to stderr when run with the flag --log-info=ai/testing,ai/ml:

unit type               metric  cost    wt cost weighted metric
Elvish Shaman           8.58    15      15.00   0.57
Elvish Fighter          10.42   14      14.00   0.74
Elvish Scout            8.47    18      18.00   0.47
Wose                    17.07   20      20.00   0.85
Mage                    10.37   20      20.00   0.52
Elvish Archer           8.46    17      17.00   0.50 
Merman Hunter           8.55    15      15.00   0.57

This is from the first turn of a game between the Rebels and the Undead. The ML Recruiter is predicting that if it recruits a Wose now, it will end with 17.07 XP + Village Captures. 17.07/20 = 0.85, which is the highest weighted metric at this time, so it picks a Wose as its top choice.
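
The metric/cost ranking can be reproduced from the numbers in the log. This is a minimal sketch in Python for illustration only (the actual recruiter logic is in Lua); rank_by_metric_per_gold is a made-up name.

```python
# (predicted metric, unit cost) pairs copied from the first-turn log above.
predictions = {
    "Elvish Shaman": (8.58, 15),
    "Elvish Fighter": (10.42, 14),
    "Elvish Scout": (8.47, 18),
    "Wose": (17.07, 20),
    "Mage": (10.37, 20),
    "Elvish Archer": (8.46, 17),
    "Merman Hunter": (8.55, 15),
}


def rank_by_metric_per_gold(predictions):
    """Return (unit, metric/cost) pairs sorted from best to worst."""
    scored = {unit: metric / cost for unit, (metric, cost) in predictions.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_metric_per_gold(predictions)
best_unit, best_score = ranking[0]
print(best_unit, round(best_score, 2))
```

The Wose has the highest predicted metric per unit of gold, which matches the choice described above.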

How does it know to pick a Wose? It looks at the "features" which describe the current situation. Here's another chart from the same game:

unit type               metric  cost    wt cost weighted metric
Elvish Shaman           7.18    15      15.00   0.48
Elvish Fighter          11.82   14      14.00   0.84
Elvish Scout            7.91    18      18.00   0.44
Wose                    15.53   20      20.00   0.78
Mage                    9.11    20      20.00   0.46
Elvish Archer           9.38    17      17.00   0.55
Merman Hunter           8.36    15      15.00   0.56
Side: 1 Gold: 21 Unit we want: Elvish Fighter
PRERECRUIT:, enemy Dark Adept:1 , enemy Deathblade:1 , enemy Ghost:2 , enemy Skeleton:3 , enemy faction:Undead , 
enemy gold:10 , enemy level3+:0 , enemy total-gold:139 , enemy unit-gold:129 , friendly Elvish Captain:1 , 
friendly Elvish Fighter:1 , friendly Wose:4 , friendly faction:Rebels , friendly gold:21 , friendly level3+:0 , 
friendly total-gold:161 , friendly unit-gold:140 , side:1 , terrain-forest:0.082 , terrain-mountain-hill:0.113 , 
terrain-water-swamp:0.164 , total-gold-ratio:0.537 , turn:4 , village-control-margin:-2 , village-control-ratio:0.417 , village-enemy:7 , 
village-friendly:5 , village-neutral:4 ,

The "features" that it sees are the values following "PRERECRUIT". The ML AI sees that the enemy faction is the Undead and that they have one Dark Adept, one Deathblade, two Ghosts, and three Skeletons. The Rebels currently have 4 Wose, 1 Elvish Fighter, and 1 Elvish Captain. It also sees a number of other features, like how much gold it and its opponent have, what percentage of the map is covered by different terrain, and how many friendly, neutral, and enemy villages there are. In this situation, although it still sees that the Wose is likely to score higher on the XP + village capture metric (15.5 vs. 11.8), this isn't enough to overcome the price differential, so it chooses an Elvish Fighter as its best choice with a weighted metric of 0.84.

Note that these predictions of 15.5 vs. 11.8 are computed by the neural net, based on a model built from what the algorithm has seen happen in similar situations during training.
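
For anyone who wants to inspect the features outside the game, the PRERECRUIT record is easy to parse. The sketch below assumes the "name:value ," layout shown in the log above; parse_prerecruit is a hypothetical helper, not one of the scripts in the patch. It also checks an observation that is consistent with the logged values but is not confirmed by the source code here: total-gold-ratio appears to equal friendly total-gold / (friendly total-gold + enemy total-gold).

```python
def parse_prerecruit(log_text):
    """Turn a PRERECRUIT log record into a {feature name: value} dict.

    Numeric values become floats; anything else (for example faction
    names) is kept as a string.
    """
    body = log_text.split("PRERECRUIT:", 1)[1]
    features = {}
    for field in body.split(","):
        field = field.strip()
        if not field:
            continue
        name, _, raw = field.rpartition(":")
        try:
            features[name] = float(raw)
        except ValueError:
            features[name] = raw
    return features


# A shortened copy of the turn 4 record shown above.
sample = (
    "PRERECRUIT:, enemy faction:Undead , enemy total-gold:139 , "
    "friendly faction:Rebels , friendly total-gold:161 , total-gold-ratio:0.537 , "
    "village-enemy:7 , village-friendly:5 , village-control-margin:-2 , "
    "village-control-ratio:0.417 ,"
)

features = parse_prerecruit(sample)
friendly = features["friendly total-gold"]
enemy = features["enemy total-gold"]
print(round(friendly / (friendly + enemy), 3), features["total-gold-ratio"])
```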

Unit Goodness Metrics

We have experimented with three different unit goodness metrics. All of these metrics are designed to have the property that the higher the value of the metric, the better the unit performed in a given game. Clearly there is a random element here. In some games when playing against a Skeleton-heavy Undead army, an Elvish Archer, which uses mainly a pierce attack, may get lucky and do better than a Wose, which has an impact attack, but on average the metric should show that the Wose performs better.

The three metrics we've looked at are as follows:

Experience Point plus Village Capture

This is the metric used in ML Recruiter 0.2. As noted above, it is the sum of the following quantities:

  1. Experience points at the end of the game or when the unit is killed
  2. Number of villages captured by the unit

This metric has the advantage that experience points lead to promotion, which is a very good thing. Also, getting kills should be correlated with how much damage the unit is doing. Adding village captures to experience points is a little flaky, but is intended to give credit to fast units, which are more likely to capture villages.

Victory

This metric gives a unit 1.0 if its side wins and 0 if its side loses. The effect is that the neural net's prediction for each unit can be seen as "what is the probability of victory if I recruit this unit in this situation". This is the most natural of all metrics, but experimentally it hasn't performed as well in terms of leading to actual victories as recruiting based on unit-based metrics. Performance has peaked at around a 59% win ratio for a victory metric vs. around 66 - 73% for the XP+VC metric. We think the problem is that the impact of recruiting a single unit of Type A vs. Type B on the victory probability is very small, so the neural net isn't differentiating among the choices enough.

Gold Yield

As of ML Recruiter 0.3, this is the new default metric. It is the sum of the following quantities, all of which are intended to quantify a unit's usefulness in terms of how much gold benefit it has yielded for the friendly side plus gold damage done to the enemy side. This metric builds off of a suggestion from Sapient.

  1. Basic Damage Metric: Target unit cost * (Damage inflicted/target max HP). The concept is that you cost your opponent this much gold by destroying this fraction of the unit. Obviously in any given attack, we would calculate this for both the attacker and the defender.
  2. Village Capture: capturing_unit.variables.ml_gold_yield += wesnoth.game_config.village_income. (Defaults to crediting 2 gold per village capture)
    • The idea, again, is that fast units tend to get more captures than slow units and this gives units credit for being fast.
  3. Poison: Treated the same as Basic Damage Metric by crediting for the amount of damage done in that turn. On the turn in which the unit is cured, the poisoner is credited with Target Unit Cost * (8/target max HP) to reflect the damage that it would have healed if it hadn't been poisoned (obviously, lessened if it has less than 8 HP of damage)
  4. Slowing: When a unit is on defense and it slows the attacker, the defender gets no special credit because the attacker just unslows at the end of its turn. When you slow a unit as the attacker, the slowing unit gets credit for the Basic Damage Metric accumulated by the slowed unit until it unslows (the slowed unit would otherwise have done twice as much damage, so you get credit for the damage it didn't do)
  5. Healing: healing_unit.variables.ml_gold_yield += Healed_unit_cost * (healed amount/healed unit max HP)
    • Directly analogous to the Basic Damage Metric
    • Note that healers also get credit for curing/stopping poison
  6. Walking Corpse Creation: Credit a unit which gets a kill which creates a unit due to its plague ability with 8 gold (the value of a Walking Corpse). (not implemented)
  7. Leadership: Credit the leader for the bonus damage inflicted by the unit being led (not implemented)
  8. Maintenance: Charge units for their share of the maintenance costs, weighted by level. Hence, level 0 units never pay maintenance. Level 2 units pay for twice as much maintenance. (not implemented)
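
The implemented parts of the list above reduce to a few one-line formulas. The following is a rough sketch in Python for illustration only (the real bookkeeping is done in Lua on unit variables such as ml_gold_yield); all function names and the example unit stats are invented for this example.

```python
VILLAGE_INCOME = 2  # default credit per village capture, as stated above


def damage_yield(target_cost, damage, target_max_hp):
    """Basic Damage Metric: gold value of the fraction of the target destroyed."""
    return target_cost * damage / target_max_hp


def heal_yield(healed_cost, healed_amount, healed_max_hp):
    """Healing credit, directly analogous to the Basic Damage Metric."""
    return healed_cost * healed_amount / healed_max_hp


def poison_cure_yield(target_cost, missing_hp, target_max_hp, rest_heal=8):
    """Credit to the poisoner on the turn the target is cured.

    The target would otherwise have healed up to rest_heal HP, or less
    if it was missing fewer hit points than that.
    """
    return target_cost * min(rest_heal, missing_hp) / target_max_hp


# Hypothetical example: a unit deals 10 damage to a 20-gold target with
# 40 max HP and then captures one village.
total = damage_yield(20, 10, 40) + VILLAGE_INCOME
print(total)
```

Slowing credit would reuse damage_yield as well, applied to the damage dealt by the slowed unit until it unslows.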

How MLR makes weighted random choices

The recommended recruiter is defined in ai/ais/ml_ai.cfg. It is called "Recommended" in the user interface. Although we are currently measuring it as performing roughly the same or slightly worse than the "Less variety/probably stronger" ML AI (ai/ais/ml_ai_less_random.cfg), we recommend it because it allows the player to see a greater variety of opposing units.

The weighted random printout looks like the following:

Turn 13:
unit type               metric  cost    metric/ weighted        %
                                        cost    m/c             of total
Merman Hunter           1.10    15      0.07    0.0000          0.0%
Wose                    1.72    20      0.09    0.0000          0.1%
Elvish Shaman           1.66    15      0.11    0.0000          0.4%
Elvish Archer           2.71    17      0.16    0.0000          4.0%
Elvish Fighter          2.54    14      0.18    0.0000          8.5%
Mage                    4.18    20      0.21    0.0001          20.1%
Elvish Scout            4.60    18      0.26    0.0003          66.8%
Random Number chosen was        376
Side: 1 Gold: 27 Unit we want: Elvish Scout
PRERECRUIT:, enemy Dark Adept:2 , enemy Revenant:1 , enemy Skeleton:1 , enemy faction:Undead , enemy gold:8 , 
enemy level3+:0 , enemy total-gold:83 , enemy unit-gold:75 , friendly Elder Wose:1 , friendly Elvish Fighter:2 , 
friendly Elvish Ranger:1 , friendly Mage:1 , friendly Wose:3 , friendly faction:Rebels , friendly gold:27 , 
friendly level3+:0 , friendly total-gold:225 , friendly unit-gold:198 , side:1 , terrain-forest:0.082 , 
terrain-mountain-hill:0.113 , terrain-water-swamp:0.164 , total-gold-ratio:0.731 , turn:13 , 
village-control-margin:0 , village-control-ratio:0.5 , village-enemy:8 , village-friendly:8 , village-neutral:0 ,

This situation occurs towards the end of a game that the Rebels are winning. Note that total-gold-ratio (the ratio between the sum of gold + the value of all units on each side) is 0.731, so it's heavily in the Rebels' favor. The ML AI sees an Elvish Scout as being the best choice in this situation with a Mage in second place. The Elvish Scout is probably favored because the game is likely to be won rapidly and only a fast unit will be able to reach the enemy or reach a village fast enough to add to its "experience points + village capture" metric.

The Weighted Random does the following:

  1. It takes every metric/cost value and raises it to the sixth power. Why? We want to magnify the differences. In this example 0.26/0.21 = 1.24, but (0.26**6)/(0.21**6) = 3.60.
  2. We then randomly choose a unit with a probability proportional to this weighted value which, in this case, was an Elvish Scout.

The Less Variety/Probably Stronger AI does the same thing, but raises metric/cost to the 24th power instead of the 6th power. This still allows for some randomness, but weights the selection much more strongly towards the more favored units.
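
The two steps above can be sketched as follows. This is an illustrative Python version, not the Lua code from the patch; selection_shares and choose_unit are made-up names. Since it starts from the rounded metric values printed in the log, the shares it computes are only approximately equal to the "% of total" column.

```python
import random

# (predicted metric, unit cost) pairs copied from the Turn 13 printout above.
turn_13 = {
    "Merman Hunter": (1.10, 15),
    "Wose": (1.72, 20),
    "Elvish Shaman": (1.66, 15),
    "Elvish Archer": (2.71, 17),
    "Elvish Fighter": (2.54, 14),
    "Mage": (4.18, 20),
    "Elvish Scout": (4.60, 18),
}


def selection_shares(predictions, exponent):
    """Probability of picking each unit: (metric/cost)**exponent, normalized."""
    weights = {
        unit: (metric / cost) ** exponent
        for unit, (metric, cost) in predictions.items()
    }
    total = sum(weights.values())
    return {unit: weight / total for unit, weight in weights.items()}


def choose_unit(predictions, exponent, rng=random):
    """Pick one unit at random, with probability proportional to its weight."""
    shares = selection_shares(predictions, exponent)
    units = list(shares)
    return rng.choices(units, weights=[shares[u] for u in units], k=1)[0]


recommended = selection_shares(turn_13, 6)    # "Recommended" exponent
less_variety = selection_shares(turn_13, 24)  # "Less Variety" exponent

for unit in turn_13:
    print(f"{unit:16s} {recommended[unit]:6.1%} {less_variety[unit]:6.1%}")
print("Pick:", choose_unit(turn_13, 6))
```

Comparing the two columns shows how the larger exponent concentrates almost all of the probability on the top-ranked unit while still leaving a small chance for the others.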

How to train your own ML Recruiter

utils/ai_test/run_model_and_make_new_model.py is an end-to-end script for running a whole bunch of training games of Wesnoth and then training a new model based on the data output by that run. Documentation on this script can be seen by running

run_model_and_make_new_model.py --help

Note that this script assumes that the Waffles machine learning toolkit is installed and that waffles/bin/ is in your path.

Known issues

Bugs

  1. Haven't added new Waffles files to Visual C++, so it won't compile under VC++. I need some help with this.
  2. By default, units in multiplayer games require only 70% of normal experience to promote. When a game is run from the command line, however, units always require the normal 100% of experience to promote. Consequently, MLR doesn't see units promote as often as they should in training, which would slightly distort its training data. This is a limitation of Wesnoth, not MLR.

Current Limitations

  1. Only tested on two-player multiplayer games. Doesn't work when there are more than two leaders on the map.
  2. Works on all two-player maps except for Hornshark Island, Thousand Stings, Caves of the Basilisk, and Dark Forecast.
  3. As noted above in Gold Yield Metric, we account for all special abilities available in the mainline multiplayer scenarios except for plague, leadership, and unit maintenance costs.

ML Recruiter development roadmap

  • ML Recruiter 0.1: Initial drop
  • ML Recruiter 0.1.1: Minor retraining of the model
  • ML Recruiter 0.2:
    • Logging messages changed from print statements to using lg::log_domain.
    • Now have an explicit debug mode by running with --log-debug=ai/ml.
    • ML Recruiter can play against itself. Previously could only have ML Recruiter on one side.
    • Some work on ML recruiting model (i.e. the core logic). Experimented with different training strategies, but features unchanged from 0.1.
  • ML Recruiter 0.3 (10/25/2012) :
    • New "gold yield" metric for judging a unit's goodness
    • Several new ML features to aid in prediction: alignment, race, time of day, map size, friendly and enemy leader hit point percentage remaining, and nearest enemy unit to friendly leader
    • Runs on all 2-player maps except for Hornshark Island, Thousand Stings, Caves of the Basilisk, and Dark Forecast
    • Greatly improved ai_test2.py script for running thousands of games to test AI and gather data for the neural net
    • New script (run_model_and_make_new_model.py) for running games and building a new neural net based on the data gathered from those games
    • Improved performance: Defeats ML Recruiter 0.2 58% of the time
  • ML Recruiter 0.4 (11/11/2012):
    • Run on all 2-player maps (except Dark Forecast, which has a custom recruiter)
    • Refactor code to separate features from predicted values
    • Added timeout option to ai_test2.py. Also report time statistics in analyze_log.py
    • Improved recruiter for the Ron recruiter. It still underperforms the Ron recruiter on most maps when used with the other Ron CA's, though.
    • Move all code into AI-Demos project on GitHub. The ML Recruiter 0.4 patch now consists, essentially, of only the C++ code modifications.
  • ML Recruiter 0.5 (planned)
    • Run on all mainline multiplayer maps
    • Experiment with using as the AI-Demos recruiter
    • Add missing special abilities (plague, leadership, and unit upkeep)
    • Add 95% confidence intervals to the win ratios in analyze_log.py and add measures of entropy (randomness) to analyze_recruitment.py. Entropy is a good measure of the variety of units that a recruiter is recruiting; for game play, more is better.
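
As a rough sketch of the two statistics planned for 0.5, the following Python shows one common way to compute them. It does not describe how analyze_log.py or analyze_recruitment.py work or will work: the function names are hypothetical, the interval uses the simple normal approximation (other interval methods exist), and the entropy is plain Shannon entropy over the recruited unit types.

```python
import math
from collections import Counter


def win_ratio_interval(wins, games, z=1.96):
    """Normal-approximation confidence interval for a win ratio.

    z=1.96 corresponds to a 95% interval. Assumes games are independent.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def recruit_entropy(recruits):
    """Shannon entropy, in bits, of a list of recruited unit names.

    0 means the same unit was always recruited; higher means more variety.
    """
    counts = Counter(recruits)
    total = len(recruits)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A recruiter that always picks the same unit scores an entropy of 0, while one that splits its recruits evenly between two unit types scores 1 bit, so a higher value means a more varied army.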
This page was last edited on 25 October 2019, at 15:54.