Machine Learning Recruiter

From The Battle for Wesnoth Wiki
Revision as of 05:45, 26 October 2012 by SeattleDad (talk | contribs) (Relative winning percentages)

This page documents the new machine learning recruiter submitted as a patch for Wesnoth 1.11. We describe how to run it, discuss experiments showing that the ML Recruiter performs dramatically better than the RCA AI recruiter, describe known issues, and suggest a development road map.

Note that the ML Recruiter is a work in progress. We welcome feedback on it. Please discuss it on the thread "Machine Learning Recruiter" at http://forums.wesnoth.org/viewtopic.php?f=10&t=36642.

Why include ML Recruiter in Wesnoth?

The ML Recruiter uses a small subset of the Waffles Machine Learning toolkit, which adds 13 pairs of .cpp/.h files to Wesnoth. In addition, the neural nets used by ML Recruiter are serialized as .json files, a format Wesnoth has not previously included. So why is this patch worthwhile?

Why the ML Recruiter will be great for Wesnoth

  1. Performance is great. ML Recruiter defeats RCA AI 69-73% of the time. We suspect that this would also translate into better performance against human opponents because it also performs better against the "ron" recruiter included in AI-Demos, which is a more challenging opponent.
  2. This superior performance is achieved with comparable "fun factor"
    • Variety of units recruited by recommended ML Recruiter is comparable to or better than RCA AI
  3. Don't need to eliminate RCA AI recruiter. Campaign designers can choose to use one or the other
  4. ML Recruiter should be easier to customize than RCA AI because
    • All core logic is in Lua, which is easier to modify than existing C++
    • Performance "out of the box" on known units likely to be strong
    • When new recruitable units are introduced by campaign designers, it can be trained by running c. 600 games in two hours. The new model is included as a .json file with the campaign data
    • Plug and play architecture of machine learning "features" easily allows minor modifications to mainline recruiter or to campaign-specific recruiters
  5. Easy way to adjust campaign difficulty: Adjusting ML AI for more/less randomness makes it easier/harder to defeat
  6. Inclusion of ML Recruiter could lead to greater publicity and more contributors to Wesnoth
    1. SeattleDad plans to submit this work as a scientific paper to a conference such as Computational Intelligence in Games
    2. Others might later build on this work by, for instance, trying ML algorithms other than neural nets, adding new features, further generalizing the algorithm, etc.
    3. The machine learning infrastructure is not specific to recruiting and could be repurposed for, for instance, attack planning, weapon selection, and making "retreat and heal" vs. "attack" decisions
    4. All of the above is potentially publishable research, so Wesnoth could attract contributions from computer science graduate students
      1. Note that, having established the basic framework with this patch, future work on machine learning will be much easier

Using ML Recruiter

Applying the patch

patch -p1  -i [path to patch file]
  • Compile Wesnoth using CMake, SCons, or XCode

Playing against the ML Recruiter

  1. From the main menu, choose "Multiplayer"
  2. Choose "Local Game"
  3. Pick a map and adjust settings as desired. ML Recruiter has been trained with the default setting for village gold and support, but it should work fine on other settings
    1. Hit Okay
  4. For one side, Choose Player/Type-->Computer Player and then either ML AI (Recommended) or ML AI (Less Variety, probably stronger)
    1. For the opponent, either play against it yourself (pick your name), watch it play the default AI (Computer Player-->RCA AI), or watch it play itself (pick ML AI again)

Watching the ML Recruiter play a single game in nogui mode

The following command is convenient for watching the ML Recruiter play a single game in nogui mode, which lets you quickly and easily see the ML Recruiter's decision-making process. In this example, the ML AI (Recommended mode) plays the Knalgan Alliance, while the default AI plays the Rebels. Note that when run this way (with --log-info=ai/testing,ai/ml), many logging messages are printed to the console describing how the ML Recruiter is analyzing its options.

  Wesnoth --log-info=ai/testing,ai/ml --nogui --multiplayer --controller 1:ai --controller 2:ai --parm 1:gold:100 --parm 2:gold:100 --parm 1:village_gold:2 --parm 2:village_gold:2 --scenario multiplayer_Weldyn_Channel --ai-config 1:ai/ais/ml_ai.cfg --ai-config 2:ai/dev/default_ai_with_recruit_log.cfg --side 1:"Knalgan Alliance" --side 2:Rebels

Testing the ML Recruiter in batch mode

Testing in batch mode is easy with the new version of ai_test2.py included in the patch. After applying the ML Recruiter patch, copy utils/ai_test/ai_test2.cfg to the directory in which you want to run the experiment. Then edit the first line of the .cfg file, "path_to_wesnoth_binary", so that it points to your Wesnoth executable. Next, set faction1 and faction2 to the factions you want to experiment with, and point ai_config1 at the ML configuration file you want to try out. Finally, to make everything easier, add the following to your path:

[Wesnoth_Install]/utils/ai_test/

Now you can test Wesnoth in batch as follows:

ai_test2.py ai_test2.cfg

Experiments: ML Recruiter vs. RCA AI Recruiter

Relative winning percentages

  • RCA AI: The Default AI. Wins 50% of the time against itself (of course)
  • Random: Recruiting units are chosen completely at random. Wins 45.5% of the time overall
  • Recommended ML Recruiter: Units are chosen at random weighted by their relative value. Wins 70% overall.
  • Pure ML Recruiter: ML AI always chooses the unit it thinks is best. Wins 71.5% of the time overall, but it might be seen as boring since it can produce armies that consist overwhelmingly of one or two unit types.
AI1                             AI2                              Games   Win % for AI1
ML Recruiter 0.3                RCA AI                           1179    69.3%
ML Recruiter 0.3                ML Recruiter 0.2                 1937    58.0%
ML Recruiter 0.3                Ron Recruit                      1186    54.1%
ML Recruiter 0.3 (Recommended)  ML Recruiter 0.3 (Less variety)  1186    49.6%
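
As a rough check on how much these percentages could move by chance, the sketch below computes a normal-approximation (Wald) 95% confidence interval for each row of the table. It assumes every game is an independent trial with no draws, which the page does not state; the function names are illustrative and are not part of the patch.

```python
import math

# (AI1, AI2, games, AI1 win fraction) copied from the table above.
RESULTS = [
    ("ML Recruiter 0.3", "RCA AI", 1179, 0.693),
    ("ML Recruiter 0.3", "ML Recruiter 0.2", 1937, 0.580),
    ("ML Recruiter 0.3", "Ron Recruit", 1186, 0.541),
    ("ML Recruiter 0.3 (Recommended)", "ML Recruiter 0.3 (Less variety)", 1186, 0.496),
]


def wald_interval(p, n, z=1.96):
    """Approximate 95% interval for a win fraction p observed over n games.

    Assumption: games are independent and each one is a win or a loss.
    """
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


def differs_from_even(p, n):
    """True when the interval excludes a 50% win rate."""
    low, high = wald_interval(p, n)
    return not (low <= 0.5 <= high)


if __name__ == "__main__":
    for ai1, ai2, games, win in RESULTS:
        low, high = wald_interval(win, games)
        print(f"{ai1} vs {ai2}: {win:.1%} [{low:.1%}, {high:.1%}]")
```

Under these assumptions, the first three rows stay clear of 50%, while the Recommended vs. Less variety row does not, which is consistent with describing the two modes as roughly equal in strength.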

Winning percentages for Recommended ML Recruiter

$ analyze_log.py *.log 
Overall Stats 
Win %	Wins	AI
30.0%	1076	"default_ai_with_recruit_log
70.0%	2506	"ml_ai_faction
Total:	3582
                                        Win     Lose    Win %
Drakes vs Undead                 	37	43	46.2%
Drakes vs Northerners            	44	75	37.0%
Drakes vs Loyalists              	69	17	80.2%
Drakes vs Knalgan Alliance       	52	54	49.1%
Drakes vs Drakes                 	66	40	62.3%
Drakes vs Rebels                 	77	17	81.9%
Total Drakes                     	345	246	58.4%
Knalgan Alliance vs Undead       	74	19	79.6%
Knalgan Alliance vs Northerners  	29	73	28.4%
Knalgan Alliance vs Loyalists    	93	16	85.3%
Knalgan Alliance vs Knalgan Alliance	67	37	64.4%
Knalgan Alliance vs Drakes       	83	39	68.0%
Knalgan Alliance vs Rebels       	75	18	80.6%
Total Knalgan Alliance           	421	202	67.6%
Loyalists vs Undead              	25	73	25.5%
Loyalists vs Northerners         	24	67	26.4%
Loyalists vs Loyalists           	73	20	78.5%
Loyalists vs Knalgan Alliance    	57	50	53.3%
Loyalists vs Drakes              	61	27	69.3%
Loyalists vs Rebels              	55	48	53.4%
Total Loyalists                  	295	285	50.9%
Northerners vs Undead            	91	5	94.8%
Northerners vs Northerners       	72	26	73.5%
Northerners vs Loyalists         	91	2	97.8%
Northerners vs Knalgan Alliance  	83	2	97.6%
Northerners vs Drakes            	67	12	84.8%
Northerners vs Rebels            	107	6	94.7%
Total Northerners                	511	53	90.6%
Rebels vs Undead                 	81	19	81.0%
Rebels vs Northerners            	34	64	34.7%
Rebels vs Loyalists              	97	6	94.2%
Rebels vs Knalgan Alliance       	92	23	80.0%
Rebels vs Drakes                 	68	32	68.0%
Rebels vs Rebels                 	84	14	85.7%
Total Rebels                     	456	158	74.3%
Undead vs Undead                 	65	29	69.1%
Undead vs Northerners            	39	42	48.1%
Undead vs Loyalists              	108	9	92.3%
Undead vs Knalgan Alliance       	94	9	91.3%
Undead vs Drakes                 	90	19	82.6%
Undead vs Rebels                 	82	24	77.4%
Total Undead                     	478	132	78.4%
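
The per-faction totals can be cross-checked against the overall figures at the top of the printout. The sketch below sums the "Total" rows; the Win column adds up to the 2506 wins reported for ml_ai_faction, which suggests that the first faction in each row is the one played by the ML Recruiter. That reading is an inference from the numbers, not something analyze_log.py is documented to guarantee.

```python
# (wins, losses) copied from the "Total <faction>" rows above.
FACTION_TOTALS = {
    "Drakes": (345, 246),
    "Knalgan Alliance": (421, 202),
    "Loyalists": (295, 285),
    "Northerners": (511, 53),
    "Rebels": (456, 158),
    "Undead": (478, 132),
}


def win_rate(wins, losses):
    """Fraction of games won."""
    return wins / (wins + losses)


total_wins = sum(wins for wins, _ in FACTION_TOTALS.values())
total_losses = sum(losses for _, losses in FACTION_TOTALS.values())

# Factions ordered from weakest to strongest win rate.
ranking = sorted(FACTION_TOTALS, key=lambda name: win_rate(*FACTION_TOTALS[name]))

if __name__ == "__main__":
    print(f"overall: {total_wins} wins, {total_losses} losses, "
          f"{win_rate(total_wins, total_losses):.1%}")
    for name in ranking:
        print(f"{name}: {win_rate(*FACTION_TOTALS[name]):.1%}")
```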


Units recruited by Recommended ML Recruiter

Unit recruitment statistics for Drakes
%	Number	Unit		
18.5%	2256	Drake Burner	
6.9%	841	Drake Clasher	
15.2%	1853	Drake Fighter	
19.8%	2416	Drake Glider	
16.4%	1994	Saurian Augur	
23.2%	2821	Saurian Skirmisher	
Total:	12181
Unit recruitment statistics for Knalgan Alliance
%	Number	Unit		        
16.1%	2484	Dwarvish Fighter	
4.7%	725	Dwarvish Guardsman	
19.2%	2963	Dwarvish Thunderer	
5.1%	782	Dwarvish Ulfserker	
26.1%	4020	Footpad	                
2.5%	379	Gryphon Rider	        
12.1%	1862	Poacher	                
14.1%	2178	Thief	                
Total:	15393
Unit recruitment statistics for Loyalists
%	Number	Unit		
25.7%	3814	Bowman	
4.3%	646	Cavalryman	
13.0%	1929	Fencer	
6.8%	1011	Heavy Infantryman	
7.4%	1093	Horseman	
2.1%	308	Mage	
10.4%	1546	Merman Fighter	
30.4%	4522	Spearman	
Total:	14869
Unit recruitment statistics for Northerners
%	Number	Unit		
9.3%	1394	Goblin Spearman	
1.6%	242	Naga Fighter	
23.7%	3544	Orcish Archer	
1.1%	168	Orcish Assassin	
20.2%	3026	Orcish Grunt	
39.4%	5901	Troll Whelp	
4.6%	692	Wolf Rider	
Total:	14967
Unit recruitment statistics for Rebels
%	Number	Unit		
15.4%	2151	Elvish Archer	
40.9%	5730	Elvish Fighter	
11.3%	1576	Elvish Scout	
2.1%	294	Elvish Shaman	
3.5%	496	Mage	
3.7%	523	Merman Hunter	
23.1%	3236	Wose	
Total:	14006
Unit recruitment statistics for Undead
%	Number	Unit		
16.7%	2696	Dark Adept	
9.2%	1475	Ghost
1.1%	180	Ghoul	
43.2%	6966	Skeleton Archer	
13.7%	2206	Skeleton	
9.2%	1475	Vampire Bat	
6.9%	1113	Walking Corpse	
Total:	16111

How the ML Recruiter works

When deciding what to recruit, the ML Recruiter predicts a "metric", a measure of how well a given unit will do in the game in the current situation. Choosing a good measure of a unit's usefulness is tricky, and we discuss three different metrics below, but let's start with the easiest one, which is the sum of the following quantities for each unit:

  1. Experience points at the end of the game or when the unit is killed
  2. Number of villages captured by the unit

Note that the ML Recruiter is blind to other ways a unit can help you (in particular, it doesn't know about poison, healing, and slowing).

This sum, which we'll call the "metric", is then divided by the unit cost to get metric/cost (think of this as goodness per unit of gold). You can see this in the debugging output that the ML Recruiter currently prints to stdout:

unit type               metric  cost    wt cost weighted metric
Elvish Shaman           8.58    15      15.00   0.57
Elvish Fighter          10.42   14      14.00   0.74
Elvish Scout            8.47    18      18.00   0.47
Wose                    17.07   20      20.00   0.85
Mage                    10.37   20      20.00   0.52
Elvish Archer           8.46    17      17.00   0.50 
Merman Hunter           8.55    15      15.00   0.57

This is from the first turn of a game between the Rebels and the Undead. The ML Recruiter is predicting that if it recruits a Wose now, the Wose will finish with a combined XP + village capture score of 17.07. 17.07/20 = 0.85, which is the highest weighted metric at this time, so it picks a Wose as its top choice.
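
The selection step described above can be sketched in a few lines. The numbers are copied from the first-turn table; the names PREDICTIONS, weighted_metric, and best_unit are illustrative and do not come from the patch. In this printout the "wt cost" column equals the plain cost, so the sketch divides by cost directly.

```python
# (predicted metric, cost) per unit, copied from the first-turn table above.
PREDICTIONS = {
    "Elvish Shaman": (8.58, 15),
    "Elvish Fighter": (10.42, 14),
    "Elvish Scout": (8.47, 18),
    "Wose": (17.07, 20),
    "Mage": (10.37, 20),
    "Elvish Archer": (8.46, 17),
    "Merman Hunter": (8.55, 15),
}


def weighted_metric(metric, cost):
    """Predicted goodness per unit of gold."""
    return metric / cost


def best_unit(predictions):
    """Unit with the highest metric/cost, i.e. the 'pure' top choice."""
    return max(predictions, key=lambda unit: weighted_metric(*predictions[unit]))


if __name__ == "__main__":
    for unit, (metric, cost) in PREDICTIONS.items():
        print(f"{unit:<16}{metric:>7.2f}{cost:>5}{weighted_metric(metric, cost):>7.2f}")
    print("Unit we want:", best_unit(PREDICTIONS))
```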

How does it know to pick a Wose? It looks at the "features" which describe the current situation. Here's another chart from the same game:

unit type               metric  cost    wt cost weighted metric
Elvish Shaman           7.18    15      15.00   0.48
Elvish Fighter          11.82   14      14.00   0.84
Elvish Scout            7.91    18      18.00   0.44
Wose                    15.53   20      20.00   0.78
Mage                    9.11    20      20.00   0.46
Elvish Archer           9.38    17      17.00   0.55
Merman Hunter           8.36    15      15.00   0.56
Side: 1 Gold: 21 Unit we want: Elvish Fighter
PRERECRUIT:, enemy Dark Adept:1 , enemy Deathblade:1 , enemy Ghost:2 , enemy Skeleton:3 , enemy faction:Undead , 
enemy gold:10 , enemy level3+:0 , enemy total-gold:139 , enemy unit-gold:129 , friendly Elvish Captain:1 , 
friendly Elvish Fighter:1 , friendly Wose:4 , friendly faction:Rebels , friendly gold:21 , friendly level3+:0 , 
friendly total-gold:161 , friendly unit-gold:140 , side:1 , terrain-forest:0.082 , terrain-mountain-hill:0.113 , 
terrain-water-swamp:0.164 , total-gold-ratio:0.537 , turn:4 , village-control-margin:-2 , village-control-ratio:0.417 , village-enemy:7 , 
village-friendly:5 , village-neutral:4 ,

The "features" that it sees are the values following "PRERECRUIT". The ML AI sees that the enemy faction is the Undead and that they have one Dark Adept, one Deathblade, two Ghosts, and three Skeletons. The Rebels currently have four Wose, one Elvish Fighter, and one Elvish Captain. It also sees a number of other features, such as how much gold it and its opponent have, what fraction of the map is covered by each type of terrain, and how many friendly, neutral, and enemy villages there are. In this situation, although it still sees that the Wose is likely to score higher on the XP + village capture metric (15.5 vs. 11.8), this isn't enough to overcome the price differential, so it chooses an Elvish Fighter as its best choice with a weighted metric of 0.84.

Note that these predictions of 15.5 vs. 11.8 are computed by the neural net, based on a model built from what the algorithm has seen happen in similar situations during training.
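
Several of the PRERECRUIT features appear to be derived from the raw ones. The formulas in the sketch below are inferred from the numbers in the printout above (they reproduce total-gold 161 and 139, total-gold-ratio 0.537, village-control-margin -2, and village-control-ratio 0.417); they are not taken from the patch's Lua code, and the function name is illustrative.

```python
def derived_features(friendly_gold, friendly_unit_gold,
                     enemy_gold, enemy_unit_gold,
                     village_friendly, village_enemy):
    """Recompute the derived PRERECRUIT features from the raw ones.

    Assumption: these formulas are inferred from the printed values.
    The sketch expects at least one owned village and a non-zero gold total.
    """
    friendly_total = friendly_gold + friendly_unit_gold
    enemy_total = enemy_gold + enemy_unit_gold
    owned_villages = village_friendly + village_enemy
    return {
        "friendly total-gold": friendly_total,
        "enemy total-gold": enemy_total,
        "total-gold-ratio": friendly_total / (friendly_total + enemy_total),
        "village-control-margin": village_friendly - village_enemy,
        "village-control-ratio": village_friendly / owned_villages,
    }


if __name__ == "__main__":
    # Raw values from the turn-4 printout above.
    features = derived_features(21, 140, 10, 129, 5, 7)
    for name, value in features.items():
        print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```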

Unit Goodness Metrics

We have experimented with three different unit goodness metrics. All of these metrics are designed so that the higher the value of the metric, the better the unit performed in a given game. Clearly there is a random element here. In some games against a Skeleton-heavy Undead army, an Elvish Archer, which relies mainly on a pierce attack, may get lucky and do better than a Wose, which has an impact attack, but on average the metric should show that the Wose performs better.

The three metrics we've looked at are as follows:

Experience Point plus Village Capture

This is the metric used in ML Recruiter 0.2. As noted above, it is the sum of the following quantities:

  1. Experience points at the end of the game or when the unit is killed
  2. Number of villages captured by the unit

This metric has the advantage that experience points lead to promotion, which is a very good thing. Also, getting kills should be correlated with how much damage the unit is doing. Adding village captures to experience points is a little flaky, but is intended to give credit to fast units, which are more likely to capture villages.

Victory

This metric gives a unit 1.0 if its side wins and 0 if its side loses. The effect is that the neural net's prediction for each unit can be seen as "what is the probability of victory if I recruit this unit in this situation". This is the most natural of all metrics, but experimentally it hasn't performed as well in terms of leading to actual victories as recruiting based on unit-based metrics. Performance has peaked at around a 59% win ratio for a victory metric vs. around 68 - 71% for the XP+VC metric. We think the problem is that the impact of recruiting a single unit of Type A vs. Type B on the victory probability is very small, so the neural net isn't differentiating among the choices enough.

Gold Yield

This metric is currently under development and is slated for ML Recruiter 0.3. It is the sum of the following quantities, all of which are intended to quantify a unit's usefulness in terms of how much gold benefit it has yielded for the friendly side plus gold damage done to the enemy side. This metric builds off of a suggestion from Sapient.

  1. Basic Damage Metric: Target unit cost * (Damage inflicted/target max HP). The concept is that you cost your opponent this much gold by destroying this fraction of the unit. Obviously in any given attack, we would calculate this for both the attacker and the defender.
  2. Village Capture: capturing_unit.variables.ml_gold_yield += wesnoth.game_config.village_income.
    • The idea, again, is that fast units tend to get more captures than slow units at all stages of the game and this may be a way to give units credit for being fast.
  3. Poison: Treat the same as Basic Damage Metric by crediting for the amount of damage done in that turn. On the turn in which the unit is cured, credit the poisoner with Target Unit Cost * (8/target max HP) to reflect the damage that it would have healed if it hadn't been poisoned (obviously, lessened if it has less than 8 HP of damage)
  4. Slowing: When a unit is on defense and it slows the attacker, give no extra credit to the defender because the attacker just unslows at the end of its turn. When you slow a unit as the attacker, the slowing unit gets credit for the Basic Damage Metric accumulated by the slowed unit until it unslows (the slowed unit would otherwise have done twice as much damage, so you get credit for the damage it didn't do)
  5. Healing: healing_unit.variables.ml_gold_yield += Healed_unit_cost * (healed amount/healed unit max HP)
    • Directly analogous to the Basic Damage Metric
  6. Walking Corpse Creation: Credit a unit which gets a kill which creates a unit due to its plague ability with 8 gold (the value of a Walking Corpse)
  7. Leadership: Credit the leader for the bonus damage inflicted by the unit being led
  8. Maintenance: Level 0 units should get some sort of credit for the fact that they don't require maintenance. Not sure exactly how to handle this.
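
The simpler credits in the list above can be sketched as plain functions. This is only an illustration of the arithmetic: the metric is described as under development, the function names and the sample numbers are hypothetical, and slowing, leadership, and maintenance are left out because the text does not pin them down.

```python
WALKING_CORPSE_COST = 8  # gold value of a Walking Corpse, from item 6
POISON_HEAL_LOST = 8     # HP of healing assumed lost on the cure turn, from item 3


def damage_credit(target_cost, damage, target_max_hp):
    """Basic Damage Metric: gold value of the fraction of the target destroyed."""
    return target_cost * (damage / target_max_hp)


def healing_credit(healed_unit_cost, healed_amount, healed_unit_max_hp):
    """Healing: directly analogous to the Basic Damage Metric."""
    return healed_unit_cost * (healed_amount / healed_unit_max_hp)


def poison_cure_credit(target_cost, target_max_hp, missing_hp):
    """Credit to the poisoner on the turn its target is cured.

    Capped by the damage the target actually had left to heal.
    """
    lost_healing = min(POISON_HEAL_LOST, missing_hp)
    return target_cost * (lost_healing / target_max_hp)


def village_capture_credit(village_income):
    """Village Capture: one turn of village income (value supplied by the caller)."""
    return village_income


def plague_credit():
    """Walking Corpse Creation: flat credit for a kill that raises a corpse."""
    return WALKING_CORPSE_COST


if __name__ == "__main__":
    # Hypothetical unit: cost 20, max HP 52, takes 13 damage in one attack.
    ml_gold_yield = 0.0
    ml_gold_yield += damage_credit(target_cost=20, damage=13, target_max_hp=52)
    ml_gold_yield += village_capture_credit(village_income=2)  # example value only
    print(ml_gold_yield)
```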

The Weighted Random (Recommended) Recruiter

The recommended recruiter is defined in ai/ais/ml_ai_faction_specific_weighted_random.cfg. It is called "Recommended" in the user interface. Although we currently measure it as performing slightly worse than the "Pure" ML AI defined in ai/ais/ml_ai_faction_specific.cfg (70% win percentage vs. 71.5% for the Pure version), we recommend it because it allows the player to see a greater variety of opposing units.

The weighted random printout looks like the following:

Turn 13:
unit type               metric  cost    metric/ weighted        %
                                        cost    m/c             of total
Merman Hunter           1.10    15      0.07    0.0000          0.0%
Wose                    1.72    20      0.09    0.0000          0.1%
Elvish Shaman           1.66    15      0.11    0.0000          0.4%
Elvish Archer           2.71    17      0.16    0.0000          4.0%
Elvish Fighter          2.54    14      0.18    0.0000          8.5%
Mage                    4.18    20      0.21    0.0001          20.1%
Elvish Scout            4.60    18      0.26    0.0003          66.8%
Random Number chosen was        376
Side: 1 Gold: 27 Unit we want: Elvish Scout
PRERECRUIT:, enemy Dark Adept:2 , enemy Revenant:1 , enemy Skeleton:1 , enemy faction:Undead , enemy gold:8 , 
enemy level3+:0 , enemy total-gold:83 , enemy unit-gold:75 , friendly Elder Wose:1 , friendly Elvish Fighter:2 , 
friendly Elvish Ranger:1 , friendly Mage:1 , friendly Wose:3 , friendly faction:Rebels , friendly gold:27 , 
friendly level3+:0 , friendly total-gold:225 , friendly unit-gold:198 , side:1 , terrain-forest:0.082 , 
terrain-mountain-hill:0.113 , terrain-water-swamp:0.164 , total-gold-ratio:0.731 , turn:13 , 
village-control-margin:0 , village-control-ratio:0.5 , village-enemy:8 , village-friendly:8 , village-neutral:0 ,

This situation occurs towards the end of a game that the Rebels are winning. Note that total-gold-ratio (the friendly side's gold plus the value of its units, as a fraction of the same total for both sides combined) is 0.731, so it's heavily in the Rebels' favor. The ML AI sees an Elvish Scout as the best choice in this situation, with a Mage in second place. The Elvish Scout is probably favored because the game is likely to be won rapidly, and only a fast unit will be able to reach the enemy or a village soon enough to add to its "experience points + village capture" metric.

The difference from the Pure AI is that Weighted Random then does the following:

  1. It takes every metric/cost value and raises it to the sixth power. Why? We want to magnify the differences. In this example 0.26/0.21 = 1.23, but (0.26**6)/(0.21**6) = 3.60.
  2. It then randomly chooses a unit with probability proportional to this weighted value. In this case, the choice was an Elvish Scout.
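
The two steps above can be sketched as follows, using the metric and cost columns from the turn-13 printout. The names are illustrative and not from the patch, and the sketch draws with Python's random.choices rather than reproducing however the recruiter turns its "Random Number chosen" into a pick. Because the printed metrics are rounded, the computed percentages only approximately match the "% of total" column.

```python
import random

# (predicted metric, cost) per unit, copied from the turn-13 printout above.
TURN_13 = {
    "Merman Hunter": (1.10, 15),
    "Wose": (1.72, 20),
    "Elvish Shaman": (1.66, 15),
    "Elvish Archer": (2.71, 17),
    "Elvish Fighter": (2.54, 14),
    "Mage": (4.18, 20),
    "Elvish Scout": (4.60, 18),
}
EXPONENT = 6  # "raises it to the sixth power"


def recruit_probabilities(predictions, exponent=EXPONENT):
    """Step 1: raise metric/cost to a power, then normalise to probabilities."""
    weights = {unit: (metric / cost) ** exponent
               for unit, (metric, cost) in predictions.items()}
    total = sum(weights.values())
    return {unit: weight / total for unit, weight in weights.items()}


def choose_unit(predictions, rng=random, exponent=EXPONENT):
    """Step 2: draw one unit with probability proportional to its weight."""
    probabilities = recruit_probabilities(predictions, exponent)
    units = list(probabilities)
    return rng.choices(units, weights=[probabilities[u] for u in units], k=1)[0]


if __name__ == "__main__":
    for unit, share in recruit_probabilities(TURN_13).items():
        print(f"{unit:<16}{share:>7.1%}")
    print("Unit we want:", choose_unit(TURN_13))
```

In this sketch, a smaller exponent flattens the distribution (an exponent of 0 makes every unit equally likely), while a larger one approaches always picking the top-ranked unit.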

How to train your own ML Recruiter

Known issues

Bugs

  1. Haven't added new Waffles files to Visual C++, so it won't compile under VC++. I need some help with this.

Current Limitations

  1. Only tested on two-player multiplayer games. Doesn't work when there are more than two leaders on the map.
  2. Works optimally on the following two-player maps (trained on these)
    1. Weldyn Channel
    2. The Freelands
    3. Den of Onis
    4. Fallenstar Lake
  3. Tested on all other two-player maps and runs nearly as well except it crashes on the following:
    • Aethermaw, Hornshark Island, Dark Forecast, Sablestone Delta, Elensefar Courtyard, Silverhead Crossing
  4. The ML Recruiter is currently blind to certain special abilities. It doesn't see the benefits of poisoning or slowing an enemy or healing a friendly unit, so it undervalues units that poison, slow, or heal. It is also blind to the benefits of leadership.

ML Recruiter development roadmap

  • ML Recruiter 0.1: Initial drop
  • ML Recruiter 0.1.1: Minor retraining of the model
  • ML Recruiter 0.2:
    • Logging messages changed from print statements to using lg::log_domain.
    • Now have an explicit debug mode by running with --log-debug=ai/ml.
    • Use local ai, which allows ML Recruiter to play against itself. Previously could only have ML Recruiter on one side.
    • Some work on ML recruiting model (i.e. the core logic). Experimented with different training strategies, but features unchanged from 0.1.