GSoC AI Project Progress

Basic Idea

The basic idea is to develop libraries of FormulaAI moves based on eval functions. The C++ part will then be able to adjust weights on these moves to 'tweak' them in each scenario.

As an example, consider recruitment:

Eval functions in FormulaAI will be provided which associate a 'score' with each unit the AI can recruit, and the AI will recruit units based on these scores. Each score also carries an associated weight, initially set to 1. Now suppose that, due to fog of war or some other circumstance, the AI has no knowledge of certain players' tactics or units, and a unit the FormulaAI ranked very highly begins to be massacred. The weight associated with that unit can then be adjusted to some value < 1. This decreases the chance of the AI choosing that unit, even though the FormulaAI still scores it highly.
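
To make the arithmetic concrete, here is a hypothetical illustration (the unit names and all numbers below are invented for this example, not measured values):

  effective score = raw score * weight
  Orcish Archer:  0.9 * 0.5 = 0.45   (weight lowered after repeated losses)
  Orcish Grunt:   0.8 * 1.0 = 0.80   (weight untouched)

Even though the Archer's raw score is higher, the Grunt is now preferred.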

The point of this learning system is that deterministic rules may be good for many situations, but they cannot predict everything. In fact, experienced players may infer some of the deterministic rules and use them to their advantage. The learning system gives the AI the ability to tweak its strategy on the fly to cope with player strategies that the deterministic rules cannot handle. Recruitment is less complex than a full-fledged AI, and the learning system may not be as necessary there as it will be in a full-fledged AI, but recruitment serves as a useful illustration.

Questions

Some of the main questions that arise from this idea:

  1.) How does the FormulaAI provide the moves and evals to the C++ learning system?
  2.) How will the C++ learning system evaluate the game state and know if it is in a situation similar to one it has been in before (i.e. how does it know which weights to associate with which moves)?
  3.) How do we evaluate the success/failure of a move?
  4.) How much should we adjust the weights based on this evaluation?

1.) For this, I think the FormulaAI can provide a list of <move, score> pairs. This gives the AI a simple way to choose a move: simply pick the move with the highest associated score. It also makes it easy to apply weights: before choosing a move, the AI multiplies each move's score by its associated weight (a sketch of this selection step follows below). This leads to the second question: how will the AI know whether a certain move is the same as one it has seen previously?
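
A minimal C++ sketch of this selection step, assuming the pairs arrive as a simple vector (the type and function names here are placeholders, not the actual engine types):

  #include <cstddef>
  #include <map>
  #include <string>
  #include <vector>

  struct scored_move {
      std::string move;   // opaque move description returned by FormulaAI
      double score;       // deterministic score from the eval functions
  };

  // Pick the move whose weighted score is highest. 'weights' maps a move's
  // category to its learned weight; here the move string itself serves as
  // the category key, which matches the recruitment case where a move is
  // identified by the unit it recruits.
  const scored_move* choose_move(const std::vector<scored_move>& moves,
                                 const std::map<std::string, double>& weights)
  {
      const scored_move* best = NULL;
      double best_score = 0.0;
      for (std::vector<scored_move>::const_iterator i = moves.begin();
           i != moves.end(); ++i) {
          std::map<std::string, double>::const_iterator w = weights.find(i->move);
          const double weight = w == weights.end() ? 1.0 : w->second;  // unseen category -> 1
          const double weighted = i->score * weight;
          if (best == NULL || weighted > best_score) {
              best_score = weighted;
              best = &*i;
          }
      }
      return best;
  }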

2.) The search space is too complex to keep any kind of state table. I think the best way to associate weights with moves is to 'categorize' the moves. For instance, in the recruitment example the categorization is easy: we simply categorize each recruitment move by the unit it recruits. The C++ learning system can then keep a list of the categories and their associated weights.
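
For recruitment, the category table can then be as simple as a map from unit type name to weight; a small sketch (the class name is invented for illustration):

  #include <map>
  #include <string>

  // One weight per recruitment category, i.e. per recruitable unit type.
  // Every weight starts at 1.0 and is later nudged up or down by the learner.
  class category_weights {
  public:
      double get(const std::string& category) const {
          std::map<std::string, double>::const_iterator i = weights_.find(category);
          return i == weights_.end() ? 1.0 : i->second;
      }
      void set(const std::string& category, double w) {
          weights_[category] = w;
      }
  private:
      std::map<std::string, double> weights_;
  };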

For the more complex case of non-recruitment moves, I think we can extend this idea of categorization. A very simple high-level move categorization is: attack, move, defend. We might further categorize attack moves by tuples (attacking unit, defending unit, terrain); move moves might be categorized by the tuple (moving unit, terrain), etc.

I think this idea will allow us to associate weights with moves via these categories. There will always be a finite number of these tuples, so we need not track the entire state space just to associate weights with moves.
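
One way to represent these tuple categories in C++ is a small struct key in an ordered map, assuming unit types and terrain are identified by strings (names invented for illustration):

  #include <map>
  #include <string>

  // Category key for an attack move: who attacks whom, and on what terrain.
  struct attack_category {
      std::string attacker;
      std::string defender;
      std::string terrain;

      // Strict weak ordering so the struct can serve as a std::map key.
      bool operator<(const attack_category& o) const {
          if (attacker != o.attacker) return attacker < o.attacker;
          if (defender != o.defender) return defender < o.defender;
          return terrain < o.terrain;
      }
  };

  // Finitely many categories, so a plain weight table suffices.
  typedef std::map<attack_category, double> attack_weight_table;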

The deterministic eval functions which assign the initial scores to moves are what evaluate the state space in terms of the overall team goals, etc. Even if a weight < 1 is associated with a move, if the eval functions determine it to be the best move by a large margin, it still has a high chance of being selected. If this move is chosen and executed and found to be successful, the associated weight will be increased; if the move is deemed a failure, the weight will be further decreased. In this way, we won't throw out very strong moves based on one failure. For a very strong move to stop being chosen, it must have been found to be a failure more than once, or perhaps an epic failure one time (i.e. loss of a key team member, something fairly catastrophic). This makes the learning system somewhat robust against flukes in the RNG, etc.

3., 4.) This brings up further questions about how to adjust the weights. We don't want to adjust the weights too fast, or a simple RNG fluke could make it so that a strong, viable move is never chosen. We don't want to adjust them too slowly either, or the learning system will not be effective over a single scenario. We can call this weight step size alpha; finding a good alpha value, as well as functions to evaluate the success and failure of moves, are additional questions to be answered. In my AI experience, finding a good alpha value is more or less a trial-and-error process: we can make a close initial guess, but play testing and fine tuning will most likely be necessary to find the optimal alpha.
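
A sketch of such an update rule, with alpha and the success/failure evaluation left as open parameters (the penalty multiplier and clamping bounds below are guesses for illustration, to be tuned in play testing):

  #include <algorithm>

  // Nudge a weight up on success (outcome near +1) or down on failure
  // (outcome near -1) by step size alpha. An 'epic failure' takes a larger
  // step so one catastrophic outcome can outweigh a single lucky success.
  double update_weight(double weight, double alpha, double outcome,
                       bool epic_failure)
  {
      const double step = epic_failure ? 3.0 * alpha : alpha;  // 3.0 is a guess
      weight += step * outcome;
      // Clamp so no category is ever ruled out entirely or dominates forever.
      return std::max(0.1, std::min(weight, 2.0));
  }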

AI API: C++ <-> FormulaAI Interface

Here we define the API for the AI (i.e. the C++ <-> FormulaAI interface).
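
The interface itself is still to be written down; as a starting point, a hypothetical C++-side boundary consistent with the <move, score> scheme above might look like the following (all names are placeholders, not the actual engine API):

  #include <string>
  #include <vector>

  struct scored_move {
      std::string move;   // opaque move description from FormulaAI
      double score;       // deterministic score from the eval functions
  };

  // Hypothetical boundary between the C++ learning system and FormulaAI.
  class formula_ai_interface {
  public:
      virtual ~formula_ai_interface() {}

      // FormulaAI side: evaluate the current game state and return the
      // candidate moves together with their deterministic scores.
      virtual std::vector<scored_move> candidate_moves() = 0;

      // C++ side: report how a chosen move worked out, so the learner
      // can adjust the weight of the move's category.
      virtual void report_outcome(const std::string& category,
                                  double outcome) = 0;
  };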