Difference between revisions of "FormulaAIandDynamicScripting"

From The Battle for Wesnoth Wiki
(Overview)
m (Some major changes to coincide with official GSoC Application)
Line 3: Line 3:
 
I am currently a full-time graduate student living in Colorado, US.  I've been coding since I was about 12, initially writing text based adventures for myself and my friends to play, moving on to RPGs using some primitive sprite based graphics and completed a mini RPG in high school, complete with graphics and music I wrote on my keyboard.
 
I am currently a full-time graduate student living in Colorado, US.  I've been coding since I was about 12, initially writing text based adventures for myself and my friends to play, moving on to RPGs using some primitive sprite based graphics and completed a mini RPG in high school, complete with graphics and music I wrote on my keyboard.
  
Since then, I have graduated from Pennsylvania State University with a bachelors degree in computer science (where I achieved a perfect 4.0 gpa) and entered a PhD graduate program for computer science focusing on AI research.  My research to this point has been focused on machine learning and evolutionary computing techniques for parameter optimization, but many of the techniques have a wide variety of applications.  I use C++ primarily for my research and have been a teaching assistant for our undergraduate C++ course.  
+
Since then, I have graduated from Pennsylvania State University with a bachelors degree in computer science and entered a PhD graduate program focusing on AI research.  I use C++ primarily for research and have been a teaching assistant for our undergraduate C++ course.  I have approximately 4/5 years of C++ experience and some Python experience.
  
 
===Contact Information===
 
===Contact Information===
Line 29: Line 29:
 
The rulebases will be implemented in FormulaAI while the reinforcement learning portion will be handled by C++. This accomplishes two goals: 1, the AI can easily be customized and extended without touching the C++ code and with no knowledge of the reinforcement learning process and 2, the learning process can easily be 'switched off', perhaps as a difficulty setting or to force strict adherence to a script.
 
The rulebases will be implemented in FormulaAI while the reinforcement learning portion will be handled by C++. This accomplishes two goals: 1, the AI can easily be customized and extended without touching the C++ code and with no knowledge of the reinforcement learning process and 2, the learning process can easily be 'switched off', perhaps as a difficulty setting or to force strict adherence to a script.
  
==Formula AI==
+
==Formula AI - Rulebases==
  
Formula AI will be used for designing the AI rulebases described below.  A rulebase can be thought of as a set of candidate moves written in Formula AI which accomplish some particular behavior or strategy.  Much of the work will be defining what behaviors/strategies are required and implementing the appropriate moves in Formula AI.  I believe there will be some back and forth between implementing rulebases in FormulaAI and extending Formula AI when additional functionality needs to be added. These two things will probably be a majority of the project.
+
Rulebases contain the FormulaAI rules the AI can use to form scripts. The rulebases developed for this project will form a repository of strategies and behaviors that designers can simply plug into a WML file to create a highly effective AI without ever touching FormulaAI or the underlying learning processes. Of course, if a designer desires a new strategy they can write custom FormulaAI rules.
  
==Rulebases==
+
I plan to involve the community in this portion of the project to identify the common strategies used by players and desired by scenario and MP bot designers. The how to play series will also serve as a useful guide. I foresee three categories of rulebases the designer can choose from to customize the AI for a particular scenario or deathmatch.
  
Think of a rulebase as a collection of moves which coordinate some type of behavior (i.e. assassinate opposing leader, defend castle with ZoC, etc.).  The moves in each rulebase are not necessarily the moves which are taken at any given point in a game but rather they create a pool of candidate moves the AI can choose from.  Exactly what move is chosen from this pool of candidate moves depends on an evaluation function which assigns probabilities to each move.  These probabilities can be altered by the AI itself (using dynamic scripting) as it plays through a game, thus providing the AI the ability to adapt to an opponents strategy.
+
* RECRUITMENT RULEBASES
  
There will be two types of rulebases: team rulebases and unit rulebases.  A team rulebase will contain formulas which more or less correspond to scenario objectives (or deathmatch, MP, etc.) such as 'Escort unit X to Hex Y'. These moves will be used to govern the overall behavior of the AI to win the round. Unit rulebases, cover unit specific behavior such as 'Hatred towards faction X', 'Hide and Ambush' or 'Healing unit'.  These rulebases can be applied to specific units or classes of units to allow them to use unit specific strategies and techniques.  This allows for an easily customizable AI for scenario designers, MP bots for co-op campaign play, death matches, etc.
+
These rulebases will cover strategies to recruit units. A default recruitment strategy will be provided that selects the best units for the desired scenario goal.
 +
 
 +
* TEAM RULEBASES
 +
 
 +
Team rulebases will govern team strategies and will supply the majority of rules the AI can use when creating a script. Developed rulebases cover the majority of scenario and MP objectives, such as 'Escort unit x to hex y', 'Assassinate enemy leader', etc.
 +
 
 +
* UNIT RULEBASES
 +
 
 +
Unit rulebases provide rules for unit specific behavior. Some of these rulebases will be associated to units by default, for instance a 'healing' rule for healing units, 'backstab' to thieves, etc.
 +
 
 +
A variety of rulebases to customize unit behavior to a scenario storyline will also be available, such as 'Hatred towards faction x' and can be applied to single units or groups.
  
 
For example, a scenario designer might want to create a scenario in which a group of orcs, goblins and ogres must escort an orcish leader across a map to hex 5,10.  The storyline might dictate that the goblins and ogres are only helping the orcs for a chance to kill elves, which the player has the ability to recruit.  The scenario designer could implement this quite easily in their cfg file for that scenario with something like  
 
For example, a scenario designer might want to create a scenario in which a group of orcs, goblins and ogres must escort an orcish leader across a map to hex 5,10.  The storyline might dictate that the goblins and ogres are only helping the orcs for a chance to kill elves, which the player has the ability to recruit.  The scenario designer could implement this quite easily in their cfg file for that scenario with something like  
Line 66: Line 76:
 
The AI designer can go deeper or shallower if necessary.  An adequate default AI with appropriate unit rulebases and a 'kill all' team rulebase will be the default if none are specified.  If the provided rulebases do not cover some specific behavior, the designer of course may implement his own rulebase by creating a custom formula script or by altering the evaluation functions of the existing rulebases.
 
The AI designer can go deeper or shallower if necessary.  An adequate default AI with appropriate unit rulebases and a 'kill all' team rulebase will be the default if none are specified.  If the provided rulebases do not cover some specific behavior, the designer of course may implement his own rulebase by creating a custom formula script or by altering the evaluation functions of the existing rulebases.
  
==Evaluation Functions==
+
== Adaptation and Learning - C++ ==
 
 
Along with the rulebases defined above also comes evaluation functions appropriate to each rulebase.  These will be included with the formula scripts for each rulebase and will be transparent to the designer unless he or she wishes to modify them to fine tune the behavior.  These evaluation functions will be used to determine the probability of a particular move.
 
 
 
I expect to have two evaluation functions, one for evaluating a move based on the overall impact on the team goal (this one will come from the team formula) and another on the unit level (from the unit formulas).  Both evaluation functions will be evaluated on the candidate moves and a combination of the results will be used to determine the final probability of a move being chosen (This is the method used in Dynamic Scripting).
 
 
 
==Dynamic Scripting==
 
  
Finally, I would like to apply some machine learning concepts to the AI, specifically online learning.  Online learning allows the AI to adapt it's strategy to fix holes being exploited by the human player and over all to perform more intelligently, giving the player a feeling he or she is playing against a thinking, clever opponent.
+
The C++ portion of the project allows the AI to learn and adapt. Once the candidate moves are determined, the AI will rank the moves based on evaluation functions. The actual evaluation functions will be written in Formula AI, the C++ code should never have to be touch for customization or extension purposes.
  
This can be accomplished by Dynamic Scripting, a proven (and relatively cutting edge) technique.  Actually, everything described thus far is the general dynamic scripting as I would apply it to Wesnoth. The adaptability comes into play with the following addition.
+
There will be two evaluation functions: A team evaluation function, in which the impact of each move on the overall team strategy is evaluated and a unit evaluation function, in which the impact of a move on unit involved is evaluated. The results these functions determine the final evaluation of a move. Once all moves are evaluated, the script is formed based on these evaluations (i.e. best moves first).
  
As described above, each unit will have some probability associated with it assigned by the evaluation functions. The evaluation functions themselves are defined in the formula scripts and as such cannot be modified directly by the AI (at least for now). However we can assign a weight (initially set to 1) to each probability and allow the AI to modify this weight as it determines whether moves are successful or not. In this way, we give the AI ability to adapt it's techniques and an easy way to turn the learning portion off (set the weights to 1, don't allow change) for experimentation and perhaps adjust the difficulty setting.
+
At the end of turn, the success of each move is used to adjust a weight associated with that move. The weights are incorporated into the evaluation of moves during the next turn (e.g. weight * (unit_eval(formula) + team_eval(formula))). In this way, the AI can learn from it's mistakes and exploit holes in an opponent's strategy found by successful moves.
  
 
==Related Papers==
 
==Related Papers==

Revision as of 02:19, 26 March 2008

About Me

I am currently a full-time graduate student living in Colorado, US. I've been coding since I was about 12, initially writing text based adventures for myself and my friends to play, moving on to RPGs using some primitive sprite based graphics and completed a mini RPG in high school, complete with graphics and music I wrote on my keyboard.

Since then, I have graduated from Pennsylvania State University with a bachelors degree in computer science and entered a PhD graduate program focusing on AI research. I use C++ primarily for research and have been a teaching assistant for our undergraduate C++ course. I have approximately 4/5 years of C++ experience and some Python experience.

Contact Information

  • IRC: barbarianhero
  • Forum id: rende
  • Gna! username: dhains
  • preferred email: dhains__A!T__ gmail.com

I'll be adding more to this page as I have the time. Please feel free to contact me if you have any questions.

Overview

Designing an AI for playing games is a challenge. Techniques such as minimax and alpha-beta pruning are useful in games such as chess or backgammon, but the complexity of Wesnoth's gamestate space precludes many of these approaches.

A common method to overcome this problem is to use scripting. Manually designed rules are created to determine the course of action, for instance if the AI should attack or defend. While these methods are effective it is generally not enough to provide challenging play, especially to experienced players. Players can 'outsmart' a scripted AI simply by exploiting the predictability inherent to scripting.

An ideal AI is one which exhibits human-like behavoir. One which can adapt its strategy to cover holes exploited by players. One which has the ability of surprise, to make the player feel as if he is playing a thinking, cunning opponent instead of just 'trying to beat the computer'. Such an AI is not a pipe dream and even more to the point, is quite feasible from a technical standpoint.

This is a proposal to implement such an AI. I propose to use dynamic scripting to combine manually designed rulebases and online learning to create a customizable, extensible and adaptive AI. his type of dynamic planning has already proven successful in creating adaptive, formidable A.I. in games such as F.E.A.R. and Neverwinter Nights [1,2].

The rulebases will be implemented in FormulaAI while the reinforcement learning portion will be handled by C++. This accomplishes two goals: 1, the AI can easily be customized and extended without touching the C++ code and with no knowledge of the reinforcement learning process and 2, the learning process can easily be 'switched off', perhaps as a difficulty setting or to force strict adherence to a script.

Formula AI - Rulebases

Rulebases contain the FormulaAI rules the AI can use to form scripts. The rulebases developed for this project will form a repository of strategies and behaviors that designers can simply plug into a WML file to create a highly effective AI without ever touching FormulaAI or the underlying learning processes. Of course, if a designer desires a new strategy they can write custom FormulaAI rules.

I plan to involve the community in this portion of the project to identify the common strategies used by players and desired by scenario and MP bot designers. The how to play series will also serve as a useful guide. I foresee three categories of rulebases the designer can choose from to customize the AI for a particular scenario or deathmatch.

  • RECRUITMENT RULEBASES

These rulebases will cover strategies to recruit units. A default recruitment strategy will be provided that selects the best units for the desired scenario goal.

  • TEAM RULEBASES

Team rulebases will govern team strategies and will supply the majority of rules the AI can use when creating a script. Developed rulebases cover the majority of scenario and MP objectives, such as 'Escort unit x to hex y', 'Assassinate enemy leader', etc.

  • UNIT RULEBASES

Unit rulebases provide rules for unit specific behavior. Some of these rulebases will be associated to units by default, for instance a 'healing' rule for healing units, 'backstab' to thieves, etc.

A variety of rulebases to customize unit behavior to a scenario storyline will also be available, such as 'Hatred towards faction x' and can be applied to single units or groups.

For example, a scenario designer might want to create a scenario in which a group of orcs, goblins and ogres must escort an orcish leader across a map to hex 5,10. The storyline might dictate that the goblins and ogres are only helping the orcs for a chance to kill elves, which the player has the ability to recruit. The scenario designer could implement this quite easily in their cfg file for that scenario with something like

  [ai]
     [team_formula]
       rulebase = "escort"
       parameters = "Orcish Leader", 5, 10
     [\team_formula]
     [unit_formula]
       apply_to_units = "goblins", "ogres"
       rulebase = "faction_hatred"
       parameters = "Elves"
     [\unit_faction]
  [\ai]

Of course the designer could make things a bit more complicated, by creating multiple team strategies associated to different units, e.g. suppose in the above example the AI also had a renegade faction of elvish rangers along for the ride, hellbent on destroying the human leader and don't really care about escorting the orcish leader. The designer might create an entirely new side, but if he or she wanted all the units on a single side, he might add the following to the above ai section.

     [unit_formula]
       apply_to_units = "Elvish Ranger", "Elvish Avenger"
       rulebase = "hide_and_ambush"  # Make elves stay hidden if possible until they attack
       [team_formula]  # This will override the "escort" team formula
         rulebase = "assassinate"
         parameters = "Human Leader"
        [\team_formula]
     [\unit_formula]

The AI designer can go deeper or shallower if necessary. An adequate default AI with appropriate unit rulebases and a 'kill all' team rulebase will be the default if none are specified. If the provided rulebases do not cover some specific behavior, the designer of course may implement his own rulebase by creating a custom formula script or by altering the evaluation functions of the existing rulebases.

Adaptation and Learning - C++

The C++ portion of the project allows the AI to learn and adapt. Once the candidate moves are determined, the AI will rank the moves based on evaluation functions. The actual evaluation functions will be written in Formula AI, the C++ code should never have to be touch for customization or extension purposes.

There will be two evaluation functions: A team evaluation function, in which the impact of each move on the overall team strategy is evaluated and a unit evaluation function, in which the impact of a move on unit involved is evaluated. The results these functions determine the final evaluation of a move. Once all moves are evaluated, the script is formed based on these evaluations (i.e. best moves first).

At the end of turn, the success of each move is used to adjust a weight associated with that move. The weights are incorporated into the evaluation of moves during the next turn (e.g. weight * (unit_eval(formula) + team_eval(formula))). In this way, the AI can learn from it's mistakes and exploit holes in an opponent's strategy found by successful moves.

Related Papers

Online Adaptation of Game Opponent AI in Simulation and in Practice, Spronk et al. [1]