SummerOfCodeProposal gabba

From The Battle for Wesnoth Wiki
Revision as of 03:32, 21 March 2013 by Esr (talk | contribs) (Hide historical references to Subversion from the wiki search function)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Introduction

IRC: gabm

Wesnoth forums and GNA!: gabba

Languages spoken: French, English, Spanish

I've been itching for a long time to contribute something to one of my favorite open-source games, and this program seems like the perfect opportunity. Furthermore, I intend to validate this summer's work as an internship with my university.

Experience

I've dabbled with code since the age of 15 (more or less), where I copied my first Basic program line by line from Science&Vie magazine. It contained a bug, so that was also my first bug fix! I self-learned Java when I was 18 by going through most of the Java Tutorial.

However my first serious programming projects were at my university, where we have a large (5-person, minimum 10 hours a week) team project every year. These projects were:

  • 1st year: assemble a robot complete with motor, wheels, microcontroller and sensors, and program it to follow lines traced on the ground and detect and avoid obstacles. This was developed in C/C++ under linux, and cross-compiled for the microcontroller. We used Eclipse as our IDE.
  • 2nd year: starting from the given specifications, program a 3D Pinball game and an editor to create different pinball tables. This time they threw everything they could at us, so we could learn to use different technologies and interoperate them efficiently. Therefore, we programmed both in C++ (for the backend) and Java (for the editor's interface), with JNI and JAWT to glue them together. We used XML to save the pinball tables (using the Xerces library), the Box2D engine for the physics simulation, OpenGL for graphics, FMOD for sound, and a couple smaller libraries for things like calculating the convex shell of a 3D model. We modeled our objects in 3d Studio MAX.

I was team leader on this project, as well as the third year one.

I could provide the source of that project to my mentor(s) if they ask for it, so they can see how I work. I would write a lot of that code differently (and hopefully better) today, but it still shows I can code something that works.

  • 3rd year: the idea this time (We're in the last stages of this project as I write) was to take the specifications from last year and adapt them to answer an imaginary client's (represented by a teacher) needs. The main change was adding multiplayer gameplay. We chose to implement a client-server architecture where the physics would be calculated on the server; all players share a dome-shaped table and compete for points. This time again we had to use a mix of technologies: the server runs under linux, and we have a Windows XP client and a Windows Mobile 5 (Pocket PC) client. We coded in C++ and C# this time, and we used the RakNet (network), Bullet (physics) and Irrlicht (graphics) libraries.

As you can see, I did most of my software development in teams of two to five people. Which doesn't mean I can't work alone, I'm very autonomous.

Open Source Involvement

I got involved mostly in two open source projects: TA Spring and Battle for Wesnoth.

TA Spring (http://spring.clan-sy.com): I became an active participant on the forums since the early days of the project. I contributed a lot of ideas, many of which were implemented; for instance the terrain map that determines regions of the map that hinder or accelerate unit movements. I also did one or two hacks to the source code and posted them on the forum, but I was not much involved with the coding.

Battle for Wesnoth: I was active for a while on the forums, and again I mainly contributed ideas, especially in the domain of AMLA (units improving beyond their maximum XP level).

PATCH: I recently commited a small patch to add a new WML attribute (a tasked picked from the EasyCoding page). It's smallish and I'm not sure it shows anything about my abilities, but at least I managed to add a small feature in a clean way :). Please see https://gna.org/patch/index.php?1162

Irrlicht, Bullet: This is very light involvement, but I try to contribute back whatever I may need to fix in those open-source engines.

Gaming experience

I'm very passionate about games, and I like collecting and playing all the classic games in every genre, be they recent or old. I also love playing and following open-source games. I love strategy games and roguelikes. As far as opponents go, I prefer playing against an AI, however I regularly have LAN parties with my friends (and yes, we play Wesnoth!). I also play online from time to time.

In games, I like a good mix of story and gameplay, but I'm certainly better at creating the gameplay part than the story part of a game.

Wesnoth playing experience

I've played the main campaign (Heir to the Throne) several times, as well as a few others. As said above I regularly have a game of multiplayer Wesnoth with some close friends.

Communication skills

Language

French is my mother tongue, but I'm perfectly fluent in English, both written and spoken (even though there's room for improvement). And I can hold a conversation in Spanish.

Social skills

I'm familiar with forums and the flame wars that can erupt; I generally keep my cool and try to be diplomatic, until people calm down and start reasoning instead of shouting. I give and take constructive advice, but I don't follow every random suggestion that's thrown at me either.

Project

I selected the "Reorganizing the savegame format" idea. It has a pretty clear task definition, and I intend to do the work as described.

I love the idea of multiplayer campaigns, and working on the backend to make it happen seems like a perfect way to contribute. Besides, it covers an area of programmming I'm comfortable with: I like designing an architecture, working with files, defining protocols, and so on.

Timeline

revised april 14th, 2009

I'll be working on this project from May 1st (or even possibly April 26th) to August 17st, for a total of 15 weeks. I may take one week off for vacations, but the exact date is to be determined. I'll spend 35 hours a week on the project, most of the time working during "business hours". To validate my internship, my university requires me to work at least 35 hours a week, so I won't be cheating on this.

Here's the breakdown by week of how I plan the 15 weeks of this project:

Week 1-3: Analysis and documentation of current save code. Getting in touch with all involved.

  • Deliverable: documented list of all save attributes

Weeks 4-6: Design and prototypes; mid-term evaluation

  • Deliverable: detailed savegame format, and prototype of C++ refactoring

Weeks 7-10: Implement Single-player using the new save code

  • Deliverable: saving/loading/replaying campaign scenarios works using the new save code

Weeks 11-14: Implement Multi-player using the new save code, and address side issues.

  • Deliverable: multiplayer campaigns and scenarios work using the new save code

Week 15: Wrap up, write any missing documentation (but I'll document as I go).

  • Deliverable: final code and documentation

Technical Details

« Do not be hasty » - Treebeard


I must begin with a disclaimer: I really dislike hasty design decisions, and have seen cause them many problems. Therefore although I may be proposing some changes below, I'll spend the first few weeks of the summer of code re-evaluating them, and talking to whoever might be affected by the changes.

To start with, here's my answers to the questions asked on the SoC_Ideas_Savegame page:

Question 1 – Adapter vs Using raw persistence data

The approach taken in config_adapter.cpp is similar to the one of the Adapter design pattern: shield other objects from a class (config) that doesn't present the wished-for interface, by providing a « middleman » that talks to both. This has maintenance benefits, as the config class or the savegame format can change without directly affecting other classes, and other classes can change without putting pressure on config or the save format; only config_adapter.cpp needs to change. However the pattern Adapter is mainly used when one doesn't have control over the interface of the target class, which is not the case here, as the config class is part of wesnoth.


From the encapsulation point of view, this design obviously forces participating objects (gamemap, unit_map, gamestatus) to provide getter/setter methods so they can receive or give information. Good encapsulation principles say that you should rather tell the object to do the work, providing information as needed - preferably in the shape of an object. This is the approach taken by game_state, which populates itself using information from a config object.


We must notice that in both cases the config class is not much encapsulated; however this is very hard to avoid as config is essentially a "data bag" with the contents of a WML tag.


I'd personally favor the approach taken by game_state, especially it already seems to be used in most places. However... "do not be hasty". YogiHH states that the code is currently very hard to maintain, and some more insight into the problem would be needed. In particular, I'd like to analyse further which code is subject to changing often and might warrant the use of the Adapter or another pattern that helps reducing the impact of change.


Question 2 – Removing config_adapter.cpp

So, let's say we want to remove config_adapter.cpp/hpp and move it's code elsewhere.


config_adapter.cpp [which is not object-oriented at all] offers 4 methods, 3 being rather minor utility functions to extract some info from a config object, and the major one being get_player_info, which has... a lot of parameters. The major IN parameters are a config object and the gamestate object (which is not const for some reason), and there are various OUT parameters, like a vector of team objects.


Overall, config_adapter.cpp doesn't seem to play a major role:

  • play_controller::init uses get_player_info(), get_first_human_team() and get_unique_saveid()
  • some classes use get_theme()

get_theme is small and would be easy to move (probably into game_state). For the three other methods I see two alternatives:


  • Move them into play_controller and let that class handle the raw config info. It's the same approach as with game_state, however this will create more maintenance problems, by adding one more class that must change if the savegame format or even the config class changes. Overall, exposing more of the code to the persistence framework seems a bad idea.
  • Have either game_state or gamestatus become the definite boundary with the save code: we could set the general principle that any class interested to get info from a savegame config object must do so through game_state or gamestatus. Since a question below deals with a merger, these may actually be the same class. With this approach, get_player_info(), get_first_human_team() and get_unique_saveid() would find a new home (hopefully in a nicely refactored way) in this new boundary class.

Question 3 - The player_info struct and the team class

The following fields are common to player_info and team_info (the struct that holds all the data of the team class):

  • name (Stores the current_player name)
  • gold (Amount of gold the player has saved)
  • can_recruit (Units the player has the ability to recruit)

Only those are specific to player_info:

  • gold_add (“yes” or “no”, controls carryover gold)
  • available_units (Units the player may recall)

I can see no apparent reason of duplicating the first three variables, therefore it would make sense to merge player_info into team_info, and control access to all its fields through the team class. gold_add and available_units would be added to team_info, and would simply go unused if we're dealing with an AI team (at least for now, who knows what multiplayer campaign designers have in mind).

There are few references to player_info compared to those to the team class, and so the change would be of moderate difficulty.

An additional remark: since AI players and Human players have a lot of things in common, but also some variables and methods that are specific to them (examples: recruitment_pattern is an AI thing, available_units is Human-only as far as I know), we could consider creating subclasses of team called team_ai and team_human. That would allow us to keep in team only fields that are actually used, and to push more specific ones into one of the subclasses. Since the team class is used all over the source code, the change might just be too much, but I wanted to throw the idea out there.

Question 4 - the game_state class and the gamestatus class

Both these classes are heavily used all over the source code; however they have a kind of one-way relationship, since gamestatus contains a game_state, whereas game_state doesn't reference gamestatus. Both classes have read/write access to a config object, and so they can directly access the save file.

game_state holds a map of player_info, and contains most information from the savegame:

  • general game information: game type (multiplayer, campaign, etc), scenario, and so on
  • three config objects:
    • replay_data: contains info to get the game back to the middle of a turn
    • starting_pos: starting position, useful for multiplayer where it might change relative to the scenario
    • snapshot: current state of the game, contains the most information

gamestatus contains a game_state, a vector of team objects, as well as info about the game flow and methods that affect it: random selection of the starting time of day, current turn, etc.

As a side note, gamestatus.cpp also contains quite a few savegame-related global methods; I was about to propose to move them into some class, but I see that YogiHH is currently moving them to savegame.cpp. If he keeps going like that, there won't be anything left to do by the time the SoC begins officially!

A merger between gamestatus and game_state sounds like a good idea, since their role is quite similar; however, unlike player_info and team it's not an easy merge, since it involves changing a lot of references. Furthermore there is no obvious direction for the merge: does game_state disappear into gamestatus, or the other way around, since both classes are used more or less at the same frequency?

Question 5 – WML candidates for redundancies

First of all, I'd like to say here that I don't like very much the idea of duplicating an attribute (together with its whole tag, in the case of units) just because it might change. What would be much better would be storing tags or attributes only if they have changed. If time allows and the primary objectives of this SoC are doing well (that is, reorganizing the savegame format so it's the same for any type of game – campaign, multiplayer, multiplayer campaign), I'd like to work on a fallback mechanism for config info, which would look roughly like that:

base unit definitions ← scenario/campaign files ← replay_start ← snapshot

Starting from the right, if a tag or attribute is not present in the current element, it would be seeked in the next element at the left. I see mainly two benefits to such a system: dramatically reduced savegame size, and improved readability; and the possibility of units updating all their attributes which haven't changed to match the new version of a scenario/campaign file.

In any case, doing the above supposes that we have a good, uniform structure in save files. So let's identify candidates for redundancy. Currently, there are redundant tags that may or may not map to the same object, as my colleague Euschn observed. However, with the merges discussed above, unnecessary duplicate info between C++ objects should mostly disappear; to keep this section simple, I'll stick to the WML side of things.

Mordante/SkeletonCrew proposed a general outline for the save format at http://www.wesnoth.org/wiki/User:SkeletonCrew#Savegame_format, and I'll be referring to his proposal from time to time.

root vs [snapshot]

The following attributes seem to be subject to change, and should still be duplicated:

  • label (I didn't witness it, but I heard it can be different in root and [snapshot])
  • completion
  • end_text_duration
  • next_underlying_unit_id

But it seems that these can't change, and could be written only once:

  • abbrev
  • campaign (single player only)
  • campaign_type
  • campaign_define
  • random_seed (also in common with replay_start)
  • difficulty
  • random_calls
  • underlying_unit_id
  • version – this one is also in common with [replay_start]. If all three variables are updated whenever the save file is written, it's a good candidate for merging.

We could also move the content of the root to a new [header] tag, as proposed by mordante. It might make handling that information on the C++ side more efficient by containing it in a small config object.

[replay_start] vs [snapshot]

[replay_start] contains few attributes in single player, but a lot of them in mp; The only one that seems redundant in single player is:

  • name

In multiplayer, those would have to be examined and discussed with the multiplayer community, to see if they could be moved to a common place instead of being duplicated:

  • experience_modifier
  • mp_countdown
  • mp_countdown_action_bonus
  • mp_countdown_init_time
  • mp_countdown_reservoir_time
  • mp_countdown_turn_bonus
  • mp_use_map_settings
  • mp_village_gold
  • random_seed
  • random_start_time (seems logical that it won't change)
  • version – see comment above, in root vs [snapshot]

[player] vs [snapshot.player]

My colleague mentions this distinction, but this has apparently been fixed by grantwu; in saves produced by the latest s­­v­­n version I can't find a [player] tag in root, only [snapshot.player] or [replay_start.player]

[snapshot.player] vs [replay_start.player]

Interestingly enough, in beginning-of-scenario saves (that are automatically done just after you win a campaign scenario), the [player] tag is in [replay_start], but if you save later in the scenario, the [player] tag is now in [snapshot]. Some uniformisation obviously wouldn't hurt here; but merging [player] into [side] sounds like an even better idea, see below.

[snapshot.player] vs [snapshot.side]

[snapshot.player] only appears in single player, while [snapshot.side] appears all the time and seems to store the same information in all game modes (but it has extra info in the case of an AI player). [snapshot.player] maps to player_info and [snapshot.side] maps to team_info in the team class.

Common attributes are:

  • save_id
  • gold

However the gold value is not the same between the two gold attributes. And those are unique to [snapshot.player]:

  • can_recruit
  • gold_add

[player] also contains [unit] tags, which would certainly benefit from a merge with those under [side]. I conclude that integrating [player] into [side] makes a lot of sense. However it might be a good idea to take the occasion and rationalize the way gold information is stored – all over the save file and not only in [side].

[replay_start.side] vs [snapshot.side]

The code obviously needs cleanup, but as far as I understand, when loading a savegame only the contents of the [snapshot] section are used, unless the player choses to review actions up to this point. In this last case, [replay_start] is used to set the initial situation, and the moves are then replayed using info in the [replay] section.

[replay_start.side] and [snapshot.side] contain the following duplicated variables:

always:

  • controller
  • fog (might change ?)
  • shroud (might change ?)
  • gold (has a different meaning in each case)
  • income (changes)
  • id (I heard it might change)
  • recruit (WML might change it)
  • team_name
  • user_team_name (changes)

multiplayer only:

  • allow_player
  • colour
  • current_player (changes – name of player whose turn it is)
  • share_maps
  • share_view
  • side (I suppose it tells whose side's turn it is, therefore it changes)
  • village_gold and other village tags

All tags that I didn't mark with “change” would probably we worth writing only once. However, as replays don't access the snapshot tag and vice-versa, the information would need to be moved to a new tag that is accessed from both – maybe a [teams] tag in the root, containing a number of [side] tags?

[event]

Silene (cf. IRC logs) says that events are consumed as they happen, and therefore all these tags are mutable. They are duplicated between [snapshot] and [replay_start], but there's not much we can do about this unless we take a broader strike at the problem and start thinking about a “fallback” mechanism, as I'm proposing at the beginning of this section

Fixing gold_*

The gold values are now all over the place, which is probably confusing for everybody. The values of gold to track at a given moment are:

  • minimum starting gold defined by a scenario
    • gold in [snapshot.side] and [replay_start]
  • carryover gold from a previous scenario (campaigns), which is then compared to the minimum starting gold to determine the real starting gold in [snapshot.side]
    • [player].gold is the carryover gold, and there's what looks like the real starting gold already computed in [snapshot.side].start_gold
  • current gold
    • in [snapshot.side]

We could organize this into the following variables (for instance):

  • gold_current – the current gold at this point in time, duplicated as needed
  • gold_carryover – gold accumulated from last scenario, 0 if there was no last scenario
  • gold_min_start – the minimum starting gold determined by the scenario
  • gold_carryover_percent – the percentage of carryover gold a team is allowed to keep

The real starting gold would not be stored anywhere, but would always be recalculated from gold_carryover, gold_min_start and gold_carryover_percent. Please see the outline below for the proposed distribution of these variables.

Proposals

C++ changes

In summary, the C++ changes I envision for the moment are those suggested by the questions above, namely:

  • Getting rid of config_adapter.cpp
  • Merging player_info into team.team_info
  • Merging config_adapter.cpp, game_state and gamestatus into one class
  • Possibly moving AI and Human-specific info into subclasses of the team class

WML save format

I plan on reviewing and expanding http://www.wesnoth.org/wiki/Summer_of_Code:SavegameWMLMapping by making an exhaustive list of all tags with their role, the situation they appear in, and relevant duplicate tags. Once submitted to proper peer review, this should allow me to make a definite proposal for the new savegame format.

In the meanwhile however, here's what I have in mind for the general outline, taking as a base mordante's proposal. Note that his proposal includes using the savegame as the way of transmitting scenarios to other players in mp; I've removed some things related to this purpose, since I haven't had the time to study that aspect of the problem.


Proposed WML Savegame Structure


  • [header] all the basic info which is at the moment at the root of the file, and won't find a home in one of the following sections. Among others:
    • version attribute
  • [summary] this section isn't really needed for a savegame but it contains the info the addon server needs to know about (not sure what's needed.)
  • [base] contains info that will never change in the course of the scenario
    • [multiplayer] contains multi player settings that are not subject to change – not saved in SP.
    • [era] this section contains the era data for MP, section doesn't exist in SP.
    • [scenario]
      • in SP, contains, among other things:
        • scenario_id
        • gold_min_start & gold_carryover_percent
      • in MP, additionally contains the scenario data.
    • [campaign] contains campaign settings that won't change, such as the attributes: campaign, campaign_type, campaign_define)
    • [teams]
      • several [team] tags (replaces [side]), each matching a team C++ object
        • gold_carryover attribute
        • [unit] tags, that contains info about carryover units that won't change, like portraits, sounds...
        • [ai] tag, one or several, containing any AI data for this team
  • [start] section with all info regarding the start of scenario (replaces [replay_start]), that might change in the course of the scenario
    • map attribute
    • attributes such as the current turn, time of day, number of turns, objectives, and so on.
    • [teams] current state of the players, contains N [team] tags, each matching a team C++ object
      • [team]
        • gold_current attribute
        • recall_list attribute
        • recruit_list attribute
        • [unit] tags, that contain unit info that changes over the course of a scenario, like XP, and stuff WML can change, like attacks (cf. the Storm Trident in Bay of Pearls)
  • [snapshot] section with the info about the current state of the game – contains the exact same info as [start], but in its current state.
  • [replay] same as current
  • [statistics] same as current


Notes:

  1. I don't know if a scenario or era can change in the course of a MP scenario; if so these tags would need to be in [start]/[snapshot] instead of [base]
  2. [start]/[snapshot] would map well to the new merged game_state object, but I'm not sure yet what [base] would map to.
  3. This whole structure is designed assuming the basic philosophy of duplicating everything that might change; it would be much more elegant and simple if we decided to store only information that changed compared to the base value. Early into the Summer of Code, I'll design an alternate outline using that method, so we can look at both alternatives.


To conclude, with the experience of last year's and this year's project, I'm confident in my ability to learn to swim by being thrown in the water. As I'll be working on this full-time, I'll catch up quickly with what I need to know to start working. I have allocated time for this in my timeline.

I'll be using the GoF design patterns (with UML diagrams as needed), Boost smart pointers, and good design practices to make sure I submit structured, clean and efficient code. Also, I suppose that my experience with serializing data to write to XML or transmit across the network will come in handy when dealing with WML.

My expectations

I want to acquire more software development experience, and the satisfaction to have contributed to an open-source project, which happens to be a game that I love. Furthermore it will be the perfect jump-start to keep contributing to Wesnoth: once I've dived deep into the codebase, I'll find it much easier to implement other features or keep improving my GSOC work.

I've been lurking and posting on the Wesnoth forums for a while, so I already consider myself a part of the Wesnoth community :).

Practical considerations

I use sub­­version extensively for all my projects. C++ is my main programming language, which I used in all three of my major projects. I know Java rather well too. However I never used Python, but I delved in the source of some programs written in Python, such as Wrye Bash, a modding tool for Oblivion.

As far as IDEs go, my favorite C++ development tool is probably Netbeans, but I can be just as efficient under Visual Studio or Eclipse.

Then there are the peripheral tools that I use most of the time: Doxygen for documentation, makefiles or batch files/bash scripts, property sheets under Visual Studio, Dia for UML diagrams, and so on and so forth. I usually try to find the best open-source tool for the job.

This page was last edited on 21 March 2013, at 03:32.