Difference between revisions of "SummerOfCodeProposal corn"

From The Battle for Wesnoth Wiki
(Goals)
m (Hide historical instance of "Subversion" from mediawiki search to avoid false positives.)
 
(7 intermediate revisions by one other user not shown)
Line 43: Line 43:
 
* vi(m)
 
* vi(m)
 
* linux
 
* linux
* subversion
+
* sub­­version
 
* gcc
 
* gcc
 
* grep, sed, awk, other standard shell hacker tools :)
 
* grep, sed, awk, other standard shell hacker tools :)
Line 106: Line 106:
  
 
==== Timeline ====
 
==== Timeline ====
* '''April 4th - April 10th'''
 
** Replicate the existing functionality of stats.pl with a homemade solution written in python + a graphing library. The library I have in mind is RRDTool but the google graphing API and matplotlib are both possible choices. This will form the basis for future work.
 
** Talk to my (potential) mentor about the improvements possible to the project. I plan on creating something akin to a report generator, where you can specify fields to compare and graph and it will generate a page for you.
 
* Approval waiting period
 
** I will continue to make improvements to my prototype.
 
* April 20th - April 30th
 
** Create the fully functional stats website, where users will select campaigns and have filters (operating system, difficulty level) to generate graphs. Performance of the website won't be considered yet, so all the graphs will be generated on demand from the huge statistics data set.
 
** Implement multiplayer statistics reporting. Change the upload source code (currently upload_logs.cpp) so that statistics on multiplayer games such as play time, average game size, and most played race are reported.
 
* May 1st - May 15th
 
** Add support for the newly reported multiplayer stats to the website.
 
** Optimize the stats webpage. This may mean generating graphs in a cronjob nightly and then displaying them statically during the day, or it may mean that the most accessed stats are put into a smaller SQL table for faster parsing.
 
* May 15th - June 5th
 
** Make the website spiffy. The website should be themed in the same style as the rest of the Wesnoth website. Add CSS to the site and make it look like it wasn't made in 1997.
 
** Allow developers to log in to the site and 'subscribe' to various kinds of reports. This will let campaign developers follow the stats of their campaign and do regression testing as they roll out new changes and see how they affect gameplay. They will get nightly or weekly emails that will contain links to the reports they subscribed to.
 
* June 5th - June 30th
 
** Bug squashing and feature requests. The project will hopefully be accessible to content creators at this point, replacing the original stats.wesnoth.org. I will focus on bug squashing and feature requests.
 
** A cool idea would be to log the geographic location of players and then display it using the Google Maps API. This will help those doing localization to figure out what languages should get the most priority.
 
* July 1st - July 20th
 
** Add a graph to show how many users are on the wesnoth server at a particular hour. This should be easy to do.
 
** Collect data for a map about the deaths in the map. This will help content creators figure out where the most activity is during gameplay - particularly useful for MP gameplay. It will be something along these lines: http://www.steampowered.com/status/tf2/tf2_stats.php - scroll to the bottom. I may start working on this during June if I am ahead of schedule.
 
* July 20th - End
 
** Bug squashing, documentation. I want to get this shipped/stable by the end of GSoC.
 
  
 
==== Technical Details ====
 
==== Technical Details ====
* The project will be split up into two separate chunks: data acquisition and data presentation. Acquisition will be the parsing/representation of the upload logs into a useful format for my program. Presentation is the combination of report generation and the HTML interface (implemented via the Cheetah Python template language).
 
  
* Both the frontend which will generate the website and the backend which will manage the data will be coded in Python, which I have substantial experience with and coded part of my previous instrumentation website in. The graphing library will be of the following or a combination: RRDTool, gnu plot, matplotlib, Google Graphing API. RRDTool and matplotlib are convenient because they both have python binds. RRDTool also has extremely nice looking graphics by default.
+
Frontend will be written in python with the turbogears application platform (for the nice templating language support). Databases for stats data and any user/configuration settings for the frontend will be in MySQL. Charting will be done using Google Charting API.  
  
* Improvements will be done to the clientside Wesnoth code, particularly in upload_logs.cpp . I want to implement multiplayer statistics gathering, which will be sent exactly the same way as the current stats are - logged into a file and sent via HTTP at client exit.
+
Changes will be made to the wesnoth client so that it keeps an array or a map where a list of dead units is kept for each tile. This data will be sent along with stats so that killgraphs can be generated. A program will also be written to take wesnoth maps, generate PNG files of them (if there isn't already such functionality) and color in each tile according to how many units were killed on it. This will be generated nightly and used for the killgraphs.
  
* Creating a kill graph to show where most units die on a map involves making or using a tool that generates thumbnails of maps and collecting data on kill locations. This data will be uploaded by the host of a multiplayer match as a special case of the regular logs. It can be uploaded for singleplayer campaigns as part of the existing logs.
+
The stats interface will scale up to millions of rows by operating on a randomly taken subset of the data grabbed nightly. If the subset turns out not to have data for a particular set of filters, the interface will revert back to the bigger database.
 
 
* Mailing reports will be handled by a cronjob. I plan on having this project be run on a Linux, Apache, MySQL, Python stack.
 
  
 
== Goals ==
 
== Goals ==
Line 145: Line 120:
 
! SUBPROJECT
 
! SUBPROJECT
 
! RESULT
 
! RESULT
 +
! PROGRESS
 
|-
 
|-
 
| <p style="color:red">MUST</p>
 
| <p style="color:red">MUST</p>
 
| Basic stats.wesnoth.org website
 
| Basic stats.wesnoth.org website
| Create a better formatted version of my prototype at http://cornmander.com:909
+
| Create a better formatted version of my prototype at http://cornmander.com:9090
 +
| Completely rewrote the code for generating graph pages. Everything is driven by a database and new graphs take less than a minute to create. I still need to create a page for adding new graphs. <- (edit 8/17/09): still haven't thought of new pie/line/bar graphs, but I have two ideas for hexmaps (aka killmaps) that would be useful: most captured villages, most traveled tiles.
 
|-
 
|-
 
| <p style="color:red">MUST</p>
 
| <p style="color:red">MUST</p>
 
| Add new stat windows to the website
 
| Add new stat windows to the website
| Created a recruited units window, a dead units window, and a unit by level breakdown
+
| Created a recruited units window, a dead units window, and a unit by level breakdown in addition to improving the things currently in the prototype. Add date ranges to all the charts on the site.
 +
| Incomplete. Added a killmap statistics page but I haven't added any more pie charts or line graphs. Bar graphs are still unsupported.
 
|-
 
|-
 
| <p style="color:red">MUST</p>
 
| <p style="color:red">MUST</p>
 
| Improve filter behavior
 
| Improve filter behavior
 
| Make the filters automatically remove choices with 0 entries based on currently selected filters
 
| Make the filters automatically remove choices with 0 entries based on currently selected filters
 +
| Not being done. This may be too performance-intensive to accomplish.
 +
|-
 +
| <p style="color:red">MUST</p>
 +
| Remove outlier data
 +
| Give statistics for the median 80%, leaving out the outlying 10% below and 10% above.
 +
| Still not done. A discussion that came up on IRC was to allow developers to set a wml variable designating their campaign as beta - not to be tracked. This should get rid of some outrageous values (-9000 gold at start turn, for example). A statistical approach to getting rid of outliers still needs to be created, however.
 +
|-
 +
| <p style="color:red">MUST</p>
 +
| Add scalability
 +
| Make the statistics presentation scalable to the millions of rows currently on stats.wesnoth.org. I plan to do this by creating a 10k random-row table subset of the data in a nightly cronjob and using this by default for all queries. If a query gives under CONSTANT_NUMBER results, the site will use the main database instead.
 +
| Done. I created a script that generates subset tables of 10k, 100k, and 1000k rows. I wrote a wrapper around my SQL queries that tries these tables first and checks the size of the result (using a library of evaluators). If the result is too small, it queries the next largest size table. Queries take less than a second on average.
 +
|-
 +
| <p style="color:red">MUST</p>
 +
| Kill maps
 +
| Create a section of stats that gives a breakdown by tile on a scenario of the units killed on that tile. The presentation will be a jscript-enhanced map where a user can hover over a tile and see a percentage breakdown of the units killed on that tile. In addition to website coding, significant changes must be made to stat collection.
 +
| Done. Added the code to the wesnoth client to send gzipped maps and kill event data. Also added code for a '--screenshot' parameter that allows a server without X11 to generate map screenshots using the wesnoth client. I implemented an extremely tiny WML parser to parse upload logs on my server, wrote a script to generate google maps tilesets from map screenshots (generated on server), and wrote a frontend that has filterable killmaps.
 +
|-
 +
| <p style="color:red">MUST</p>
 +
| CSS and Design
 +
| Integrate the look of the stats page with the rest of wesnoth.org . Make the website user-friendly
 +
| Done. It was pretty simple - I just named my div tags according to the wesnoth.org site and imported the stylesheet.
 +
|-
 +
| <p style="color:red">MUST</p>
 +
| Email subscription to charts
 +
| Create a login system so that users can subscribe to particular charts, and receive daily emails for comparison. This will be useful for basic AI regression testing, ex: run 10 different campaigns with an AI during a particular day and compare with the charts of the previous day
 +
| Not done.
 +
|-
 +
| <p style="color:green">GOOD</p>
 +
| Geographic data
 +
| Not done.
 +
|-
 +
| <p style="color:green">GOOD</p>
 +
| Get MP data
 +
| Partially complete. MP logging is enabled for AI vs AI matches but the log format doesn't have particularly useful stats. I still need to add more log data so that useful graphs can be made (faction win percentage, average end turn).
 +
|-
 +
| <p style="color:green">GOOD</p>
 +
| GUI or scripting interface for hooking in new stat pages
 +
| Create an interface for looking at new stats. User will have a choice of several graph types and what data sets to use for x and y axes.
 +
| Done. This is the database driven graph stuff.
 
|-
 
|-
 
|}
 
|}
 +
 +
Future work that I will do on this project will be focused on improving statistical/graphical analysis of AI vs. AI games.
  
 
==== What I expect to get from this Project ====
 
==== What I expect to get from this Project ====
Line 169: Line 188:
  
 
==== Familiarity with Tools ====
 
==== Familiarity with Tools ====
* Subversion - I know how to use svn, and I run it on my own server as well.
+
* Sub&shy;&shy;version - I know how to use s&shy;&shy;v&shy;&shy;n, and I run it on my own server as well.
 
* C++ - I know C and Java very well, but I don't know C++. I think I can learn it over the course of a week or two, however.
 
* C++ - I know C and Java very well, but I don't know C++. I think I can learn it over the course of a week or two, however.
 
* Python - I know python very well.
 
* Python - I know python very well.
Line 175: Line 194:
  
 
==== Development Tools ====
 
==== Development Tools ====
I use vi, xterm, svn, and GDB. I dislike working in IDEs because of the overhead and effort of getting your project set up properly in them.
+
I use vi, xterm, s&shy;&shy;v&shy;&shy;n, and GDB. I dislike working in IDEs because of the overhead and effort of getting your project set up properly in them.
  
 
==== Fluent Programming Languages ====
 
==== Fluent Programming Languages ====

Latest revision as of 00:17, 21 March 2013

Introduction

Hi, I am Greg Shikhman and I'd like to work on improving stats.wesnoth.org.

Preferred Email

cornmander@cornmander.com


Nicknames

  • IRC - corn
  • Wesnoth Forums - cornmander

Why I want to participate

Working on Wesnoth for GSoC would be my first experience working on a large and public open source project. GSoC is a structured way (similar to an internship at a company) for me to get into the development of this project.

Studies

I am a high school senior who will be attending an accredited university this fall. This section will be more definite when I decide where I want to go :)

Patches

https://gna.org/patch/?1153 https://gna.org/patch/?1149

1 or 2 more patches to come related to bug #13094.

Experience

Last summer I had a full-time summer internship at Morgan Stanley, the financial company. My primary accomplishment was the creation and maintenance of a statistics interface for their networking hardware. I created a system of scripts that handled statistics acquisition, and I wrote a Java servlet that showed these statistics, with nice looking graphs via RRDTool. The tool was put into production shortly after my internship ended.

A smaller task that I also handled was the creation of a real-time packet rewrite/editing script on the network hardware, to support and upgrade legacy software.

Reference is available on request.

As part of a class, I have also created a real-time rasterizer, with model loading, animation, and texture mapping. A scripting language for inputting animation sequences was also developed as part of the project. Source code, along with instructions for compiling, is available at http://cornmander.com/dragonballx.tar.gz .

I also run my own server at http://cornmander.com . It is a colocated Gentoo Linux box. I have working sysadmin experience.

Programs / Software

  • python
  • LaTeX
  • GNU make
  • gdb
  • vi(m)
  • linux
  • subversion
  • gcc
  • grep, sed, awk, other standard shell hacker tools :)
  • bison, flex
  • libraries: opengl, sdl, opencv
  • languages: c, python, php, java, latex, html, tcl

Team Environment Experience

My rasterizer was created as part of a team project. I managed and handed out tasks to my teammates, and tried to keep them updating their TODOs.

My internship at Morgan Stanley was done under the supervision of my manager. Although I was working on my project alone, I took suggestions and feature requests from the end users.


Open Source

Involvement

I've reported bugs to the Gentoo Linux project, and I am credited for reporting a minor security bug: http://www.derkeiler.com/Mailing-Lists/Full-Disclosure/2007-08/msg00364.html .

I follow the development of ioquake and mplayer, but I haven't had a chance to contribute to either project. I may try to enter the ffmpeg (an integral part of mplayer) GSoC process next year.

Gaming Experience

I have been a PC gamer since late elementary school. I am a hardcore gamer primarily interested in FPS games - this is what sparked my interest in working on cutting edge graphics tech like rasterization. My current addiction is TF2. I also appreciate RPG games. I played Wesnoth for a few weeks with my friends after the Wesnoth 1.5 release, and I will start playing again now.

What type of Gamer

I used to play video games for a few hours per week, but school has cut it down to 2-4 per week. Nevertheless, I'd categorise myself as a hardcore gamer.

Preferred Opponents

I like playing versus friends, or at the very least versus people. AI isn't good enough to adapt and get better as you play, and there is always the joy of rubbing your victory in the face of a human opponent.

Story or Gameplay?

Gameplay is king. I am used to playing multiplayer games, and the game mechanics are much more important than plot or backstory. Nevertheless, I do enjoy reading a good story or being a part of it in a single player campaign. Co-op support is an interesting idea.

Have I played Wesnoth

I played 1.5 multiplayer for a few weeks, and played through the Heir to the Throne campaign.

Communications

English

I am a native English speaker, and I am extremely fluent in Russian as well, although I can't read or write it.

Player Interaction

I ignore trolls, and I haven't been involved in player communities for a while. I lurk on forums, but I talk on IRC.

Constructive Advice

I take criticism well, but sometimes I am impatient when I don't see the point to a suggestion/requirement.

Receive Advice?

My priority is getting the job done. I don't take criticism personally.

Sorting out Criticism

The people I work with are typically more experienced and smarter than I am. Although I have the urge to question a suggestion, I'll never dismiss a suggestion without giving it some consideration.

Project

Which Project

I want to improve the stats server.

Why

I have existing experience with displaying stats, and the freedom of tools and programming languages should make it easy for me to hit the ground running.

Timeline

Technical Details

Frontend will be written in python with the turbogears application platform (for the nice templating language support). Databases for stats data and any user/configuration settings for the frontend will be in MySQL. Charting will be done using Google Charting API.

Changes will be made to the wesnoth client so that it keeps an array or a map where a list of dead units is kept for each tile. This data will be sent along with stats so that killgraphs can be generated. A program will also be written to take wesnoth maps, generate PNG files of them (if there isn't already such functionality) and color in each tile according to how many units were killed on it. This will be generated nightly and used for the killgraphs.
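The per-tile bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the real client is written in C++, and the function names, unit names, and the red-intensity color scheme here are hypothetical rather than taken from the Wesnoth code.

```python
from collections import defaultdict


def record_kill(kills, x, y, unit_type):
    """Append a dead unit's type to the list kept for tile (x, y)."""
    kills[(x, y)].append(unit_type)


def tile_colors(kills):
    """Map each tile to an (r, g, b) color; more kills give a stronger red.

    The linear red scale is an assumed coloring, not the project's own.
    """
    if not kills:
        return {}
    most = max(len(units) for units in kills.values())
    colors = {}
    for tile, units in kills.items():
        intensity = 255 * len(units) // most  # integer scale, 0..255
        colors[tile] = (intensity, 0, 0)
    return colors


# Hypothetical kill events: two deaths on tile (3, 4), one on tile (7, 1).
kills = defaultdict(list)
record_kill(kills, 3, 4, "Spearman")
record_kill(kills, 3, 4, "Orcish Grunt")
record_kill(kills, 7, 1, "Mage")
colors = tile_colors(kills)
```

Keeping the full list of dead units per tile, rather than only a count, leaves room for a per-unit breakdown later; the coloring step only needs the list lengths.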

The stats interface will scale up to millions of rows by operating on a randomly taken subset of the data grabbed nightly. If the subset turns out not to have data for a particular set of filters, the interface will revert back to the bigger database.
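A minimal sketch of the subset-then-fallback idea follows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the table names, column names, and the row threshold are all hypothetical.

```python
import sqlite3

# Hypothetical table names, ordered from the smallest sample to the full data.
TABLES = ["games_10k", "games_full"]
MIN_ROWS = 2  # hypothetical threshold below which a result is "too small"


def query_with_fallback(conn, where_sql, params, tables=TABLES, min_rows=MIN_ROWS):
    """Try each table in order and stop at the first large-enough result.

    where_sql must come from trusted code, since it is placed into the
    statement directly; user-supplied values go through params.
    """
    used, rows = None, []
    for table in tables:
        used = table
        sql = "SELECT campaign, gold FROM {} WHERE {}".format(table, where_sql)
        rows = conn.execute(sql, params).fetchall()
        if len(rows) >= min_rows:
            break
    return used, rows


# Demo data: the sample table holds fewer TSG rows than the full table.
conn = sqlite3.connect(":memory:")
for table in TABLES:
    conn.execute("CREATE TABLE {} (campaign TEXT, gold INTEGER)".format(table))
conn.executemany(
    "INSERT INTO games_full VALUES (?, ?)",
    [("HttT", 100), ("HttT", 120), ("TSG", 80), ("TSG", 90)],
)
conn.executemany(
    "INSERT INTO games_10k VALUES (?, ?)",
    [("HttT", 100), ("HttT", 120), ("TSG", 80)],
)

table_a, rows_a = query_with_fallback(conn, "campaign = ?", ("HttT",))
table_b, rows_b = query_with_fallback(conn, "campaign = ?", ("TSG",))
```

In this demo the HttT query is answered from the small table, while the TSG query finds too few rows there and falls through to the full table. Adding more sample sizes only means extending the ordered list.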

Goals

Priority: MUST
Subproject: Basic stats.wesnoth.org website
Result: Create a better formatted version of my prototype at http://cornmander.com:9090
Progress: Completely rewrote the code for generating graph pages. Everything is driven by a database and new graphs take less than a minute to create. I still need to create a page for adding new graphs. <- (edit 8/17/09): still haven't thought of new pie/line/bar graphs, but I have two ideas for hexmaps (aka killmaps) that would be useful: most captured villages, most traveled tiles.

Priority: MUST
Subproject: Add new stat windows to the website
Result: Created a recruited units window, a dead units window, and a unit by level breakdown in addition to improving the things currently in the prototype. Add date ranges to all the charts on the site.
Progress: Incomplete. Added a killmap statistics page but I haven't added any more pie charts or line graphs. Bar graphs are still unsupported.

Priority: MUST
Subproject: Improve filter behavior
Result: Make the filters automatically remove choices with 0 entries based on currently selected filters
Progress: Not being done. This may be too performance-intensive to accomplish.

Priority: MUST
Subproject: Remove outlier data
Result: Give statistics for the median 80%, leaving out the outlying 10% below and 10% above.
Progress: Still not done. A discussion that came up on IRC was to allow developers to set a wml variable designating their campaign as beta - not to be tracked. This should get rid of some outrageous values (-9000 gold at start turn, for example). A statistical approach to getting rid of outliers still needs to be created, however.

Priority: MUST
Subproject: Add scalability
Result: Make the statistics presentation scalable to the millions of rows currently on stats.wesnoth.org. I plan to do this by creating a 10k random-row table subset of the data in a nightly cronjob and using this by default for all queries. If a query gives under CONSTANT_NUMBER results, the site will use the main database instead.
Progress: Done. I created a script that generates subset tables of 10k, 100k, and 1000k rows. I wrote a wrapper around my SQL queries that tries these tables first and checks the size of the result (using a library of evaluators). If the result is too small, it queries the next largest size table. Queries take less than a second on average.

Priority: MUST
Subproject: Kill maps
Result: Create a section of stats that gives a breakdown by tile on a scenario of the units killed on that tile. The presentation will be a jscript-enhanced map where a user can hover over a tile and see a percentage breakdown of the units killed on that tile. In addition to website coding, significant changes must be made to stat collection.
Progress: Done. Added the code to the wesnoth client to send gzipped maps and kill event data. Also added code for a '--screenshot' parameter that allows a server without X11 to generate map screenshots using the wesnoth client. I implemented an extremely tiny WML parser to parse upload logs on my server, wrote a script to generate google maps tilesets from map screenshots (generated on server), and wrote a frontend that has filterable killmaps.

Priority: MUST
Subproject: CSS and Design
Result: Integrate the look of the stats page with the rest of wesnoth.org . Make the website user-friendly
Progress: Done. It was pretty simple - I just named my div tags according to the wesnoth.org site and imported the stylesheet.

Priority: MUST
Subproject: Email subscription to charts
Result: Create a login system so that users can subscribe to particular charts, and receive daily emails for comparison. This will be useful for basic AI regression testing, ex: run 10 different campaigns with an AI during a particular day and compare with the charts of the previous day
Progress: Not done.

Priority: GOOD
Subproject: Geographic data
Progress: Not done.

Priority: GOOD
Subproject: Get MP data
Progress: Partially complete. MP logging is enabled for AI vs AI matches but the log format doesn't have particularly useful stats. I still need to add more log data so that useful graphs can be made (faction win percentage, average end turn).

Priority: GOOD
Subproject: GUI or scripting interface for hooking in new stat pages
Result: Create an interface for looking at new stats. User will have a choice of several graph types and what data sets to use for x and y axes.
Progress: Done. This is the database driven graph stuff.

Future work that I will do on this project will be focused on improving statistical/graphical analysis of AI vs. AI games.
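The "Remove outlier data" goal above asks for statistics over the middle 80% of the values. One simple way to do that, shown here only as a sketch and not as the approach the project settled on, is to sort the values and drop the lowest and highest tenth. The function names and the sample gold values are hypothetical.

```python
def middle_80_percent(values):
    """Return the values with the lowest 10% and highest 10% removed.

    With fewer than 10 values nothing is removed, because the cut
    (a tenth of the length, rounded down) is zero.
    """
    ordered = sorted(values)
    cut = len(ordered) // 10
    return ordered[cut:len(ordered) - cut]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical starting-gold samples, including two outrageous values.
starting_gold = [-9000, 90, 100, 100, 110, 120, 130, 140, 150, 5000]
trimmed = middle_80_percent(starting_gold)
trimmed_mean = mean(trimmed)
```

Here the -9000 and 5000 entries are discarded before the mean is taken, so a single broken campaign log no longer dominates the chart.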

What I expect to get from this Project

I want to make a substantial open source contribution. This project will be my first coordinated team effort and hopefully help me get started on other OSS projects. It will also expose me to the process of maintaining a large, polished product.

Would I stay with Wesnoth after GSoC?

I would maintain my code for Wesnoth. I think that I will also continue to contribute code in unrelated areas of the project after I am done with this project.

Practical considerations

Familiarity with Tools

  • Subversion - I know how to use svn, and I run it on my own server as well.
  • C++ - I know C and Java very well, but I don't know C++. I think I can learn it over the course of a week or two, however.
  • Python - I know python very well.
  • Build Environments - My development environment is Linux with GNU make. I have used MSVS before, but I am not familiar with its solution files. I have a pretty good idea of how to use SCons now.

Development Tools

I use vi, xterm, svn, and GDB. I dislike working in IDEs because of the overhead and effort of getting your project set up properly in them.

Fluent Programming Languages

  • Java
  • C
  • Python
  • PHP

I also know LaTeX and HTML, markup languages.

Fluent Spoken Languages

I am a native English speaker. I am also fluent in Russian, but can't read or write it.

Hours of Availability

Until the end of June, I am available from 6 PM EST (22:00 UTC) to 3 AM EST (7:00 UTC). I will either be available during the entire day of 9 PM EST (1:00 UTC) to 10 PM EST (21:00 UTC). I expect to put in full work days of 8 hours on this project.

Phone / Internet Phone Conversations

I would prefer written electronic communication like IRC or email, but I am OK with talking over the phone. It's a bit frustrating to explain your code over the phone, however.

This page was last edited on 21 March 2013, at 00:17.