Git for Wesnoth Crash Course

From The Battle for Wesnoth Wiki
Revision as of 04:09, 27 February 2014 by Iceiceice (talk | contribs) (Setting up our workflow for wesnoth in git)

Foreword by iceiceice: This page is intended as a crash course for new wesnoth devs and contributors who have never used git before. I'm hoping that it will be written *mainly* by newbies like myself who have just figured out enough to use it successfully, and therefore not contain a bunch of information superfluous for that purpose. At the time that I am writing this, I have managed to write about 8 github pull requests and get them merged into wesnoth, and I hope you will be able to do the same after reading this.

What is git?

The most effective *absolute first timer* intro to git that I found when I began is here: http://gitref.org/index.html

I suggest that you read through this cover to cover, flipping through the pages using the red arrows at the bottom of the page. You can skip the part about "stashing", but you should read carefully and understand the stuff about:

  • cloning a repo
  • checking out a branch, creating a new branch, merging branches
  • adding and committing changes
  • taking diffs to make sure you understand exactly what you are committing
  • resetting when you messed up and need to go back a few steps, and using git log and git reflog to assist with this
  • moving content between local and remote locations, with push and pull

Setting up our workflow for wesnoth in git

It is common in reading about git to see the terms "upstream / downstream". This is often confusing at the beginning -- after all, information usually flows in both directions: sometimes you are getting code from the wesnoth repo, and sometimes you are sending code back. Which way is up, and which is down?

The answer is that like a river, "downstream" is the direction most information flows. Any user that simply wants to build wesnoth from source, and not make any changes, will generally do so by cloning / pulling github.com/wesnoth-old. So wesnoth-old is the upstream repo, and the users are downstream. Since you are a developer, you will be forking the upstream repo and making changes, and hopefully with work eventually get these changes pulled into the upstream repo. That is the development cycle, and sending info back upstream is what makes you a developer.

In this guide we are going to assume that you will use *github* besides just the git command line. The reason for this is that github has a lot of nice tools to offer and makes some things really easy, especially with comparing your branch to master and seeing diffs in a nice graphical interface, and with making pull requests. Technically, this means you will actually control *two* repos -- the local one on your machine, and the fork on github. The first step is to set these two things up. Basically, we will follow the instructions here:

https://help.github.com/articles/fork-a-repo

So the steps are:

  1. Fork the main wesnoth repo, wesnoth-old, and tie the fork to your user account.
  2. Clone the fork onto your local machine, using git clone.
  3. Configure remotes.
    On that github page, it is described how to configure the main wesnoth repo under the name "upstream". Because we cloned from your fork, the fork is automatically configured under the name "origin". You can see your configured remotes with "git remote -v".

You might find this naming scheme confusing, as the name "origin" might also aptly describe the main wesnoth repo. So it might be a good idea to reconfigure your fork under the name "fork" instead. But if you do, you should keep in mind that many of the github help documents will assume the term "origin".
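If you want to try the three setup steps without touching any real repo, here is a minimal sketch. It assumes that local bare repositories in a scratch directory can stand in for the github repos; the names upstream.git, fork.git, seed and local are made up for illustration, and on github the "fork" step is a button rather than a command.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-in for the main wesnoth repo: a bare repo holding one commit on master.
git -c init.defaultBranch=master init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"

# Step 1, "fork": here it is simply another bare copy of the main repo.
git clone -q --bare "$sandbox/upstream.git" "$sandbox/fork.git"

# Step 2: clone the fork. It is registered automatically as "origin".
git clone -q "$sandbox/fork.git" "$sandbox/local"
cd "$sandbox/local"

# Step 3: register the main repo as "upstream", then list the remotes.
git remote add upstream "$sandbox/upstream.git"
git remote -v

# Optional: use the name "fork" instead of "origin".
# git remote rename origin fork
```

With the real repos, the two paths are replaced by the URL of your fork and the URL of the main wesnoth repo, as shown on the github help page.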

Additionally, we ask that you configure your actual full name into git, following instructions here:

https://help.github.com/articles/setting-your-username-in-git

so that we can associate an actual person to every commit that makes it into the game. Your *github account* should show your real name as well.
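As a rough sketch of what that configuration looks like, the following sets a name and email in a throwaway repo and checks that a commit picks them up. The name, email and scratch repo are placeholders; on your real machine you would most likely add --global so the setting applies to every repo.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# Repo-local settings; add --global to apply them to every repo on the machine.
git config user.name "Your Full Name"
git config user.email "you@example.invalid"

# Make a commit and look at the author that git recorded for it.
git commit -q --allow-empty -m "check author identity"
git log -1 --format='%an <%ae>'
```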

The basic workflow

Here's the picture we now have, with 3 repos:


   upstream


                    origin


    local


There are three basic steps in the development cycle.

  1. Syncing with upstream and making a topic branch for your patch.
  2. Committing your changes and pushing them to your fork.
  3. Making a pull request, getting it pulled, and cleaning up at the end.

Syncing and Branching

In this step, you will make sure your local master is up to date with the most recent development changes before beginning work.

First make sure you are on master:

 git checkout master
 git status

Now, pull the upstream master to your machine:


   upstream
      |
      |
      |             origin
      |          
      v      
    local


 git pull upstream master

At this point your local master is up to date. While you are at it, you might as well sync up your *origin* as well:


   upstream
        
       
                    origin
              /-->  
           --/    
    local     


 git push origin master

So at this point, master should be the same everywhere. In this guide, we will never make any changes directly to master, and will *only* make changes to the topic branch, so no matter which repo you look at, master should be the same, and git will be comparing your work against the main wesnoth master.

Now we are ready to branch. We are on master, as a call to "git status" will confirm. So now we will make our new branch:

 git checkout -b great_new_feature

As you learned in the gitref guide, you could also have done this by

 git branch great_new_feature
 git checkout great_new_feature

Since we were on master just before we did this, the current most up-to-date master will be the point of departure for our branch, which is the best thing to avoid possible conflicts when it is eventually merged in.
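The whole sync-and-branch step can be rehearsed in a scratch directory. This is only a sketch: local bare repos (upstream.git and fork.git, both made-up names) play the roles of the main repo and your github fork, and an empty commit plays the role of another dev's work.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-ins: upstream.git (main repo), fork.git (your fork), local (your clone).
git -c init.defaultBranch=master init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"
git clone -q --bare "$sandbox/upstream.git" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/local"
git -C "$sandbox/local" remote add upstream "$sandbox/upstream.git"

# Someone else's work lands in the main repo.
git -C "$sandbox/seed" commit -q --allow-empty -m "work merged by another dev"
git -C "$sandbox/seed" push -q "$sandbox/upstream.git" master

cd "$sandbox/local"
git checkout -q master
git pull -q upstream master               # upstream -> local
git push -q origin master                 # local -> origin
git checkout -q -b great_new_feature      # branch off the freshly synced master
git status --short --branch
```

Afterwards master points at the same commit in all three repos, and the new branch starts from it.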

Committing your changes and pushing to your fork

Now, you will make your changes to wesnoth and test them, using git add and git commit, as you learned in the guide.

Rules of thumb:

  1. Write a good commit message.

    Basically, a commit message explains what happened in the commit and why. A good commit message is structured like a short email. The first line has special importance -- it should ideally be at most 80 characters, and play the role of the "subject" of your email. It should be written in the imperative mood, e.g. "add field to object and create accessor methods", "fix broken constructor", "remove unnecessary #include", etc.

  2. Make appropriately sized commits.

    Each commit should correspond to a single logical step in your programming task. The commits are what make up the development history -- people will look at your commits to try to figure out how we got to where we are today, and if necessary, may need to revert some commits / revert to an earlier state to fix something that might be broken in the future. So one rule of thumb is, if you can't write a succinct commit message explaining what happened, then the commit is too large and too complicated, so break it up into smaller pieces. Another guideline is, you should at least think about committing almost every time you compile (depending on your habits).

    On the other hand, you shouldn't make a commit for every line of code typed. Generally if it would never make sense to revert to the current point in code, you probably shouldn't commit. So for example, if you define a field of an object, but it has no accessors and no way to be used in code, it would never make sense to revert to that time in history, so those changes should be rolled together in one commit. Also, if you have a commit that just consists of "fixed some whitespace", that should probably be squashed in with another commit.

    It is better to commit more often than less often, as before you are done you will have the opportunity to clean things up a bit, and it is much easier to squash commits together than to break up a commit that is too big.

  3. When you are done, make a changelog entry, and commit it with the message "update changelog".

    We also have a players_changelog, which should contain only changes players are likely to notice, and be less technical than the main technical changelog. There is also a file RELEASE_NOTES which basically gets turned into a forum post when the next release comes around, so if there is a major feature which should be advertised, that should go in there. Additionally, if this is your first commit, the devs will ask you to make an appropriate note about yourself in data/about.cfg
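To make the first two rules of thumb concrete, here is a small sketch of one commit with an email-style message. The file unit.hpp and its contents are invented for the example, and the length check at the end is just one possible way to compare a subject line against the 80-character guideline.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
git -c init.defaultBranch=master init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "initial commit"

# One logical step: a field together with its accessor (hypothetical file).
printf 'int hitpoints;\nint get_hitpoints();\n' > unit.hpp
git add unit.hpp
git diff --cached --stat                  # review exactly what is staged

# Each -m starts a new paragraph: the subject first, then the body.
git commit -q \
    -m "add hitpoints field and accessor to unit" \
    -m "The field and its accessor go in together, so that every commit in the history leaves the code in a usable state."

# Compare the subject line against the 80-character guideline.
subject=$(git log -1 --format=%s)
echo "subject is ${#subject} characters"
```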

Finally, when your branch is ready, push it to your fork.


   upstream
       
       
                    origin
              /-->  
           --/    
    local     


 git push origin great_new_feature


Now, if you navigate to the page for your fork, you will see "recently pushed branch great_new_feature (3 minutes ago)" on your main master page. Click "compare" to compare it against master. You will now see a list of the commits you made with their messages, and a color-highlighted diff of all the files you modified. You can click through the commits in order to see each of the changes you made and understand how your project progressed -- this is exactly what the devs will do when they review your pull request.

In the github interface, you can always see a list of branches by clicking on "branches" in the line near the middle of the page

"10,000+ commits 27 branches 248 releases 85 contributors"

You can select the current branch using the drop down menu, and you can compare branches using the "compare" button.

Cleaning up your branch

Before making your pull request, if you looked at the diffs and the history and saw anything confusing or not quite right, now is your opportunity to change it. From the gitref guide, you know how to add more commits, and you know that you can use reset to go back in time and redo things. Reset is especially good if your last commit was too big and you want to break it up into a series of smaller commits -- the command 'git reset HEAD^' will leave your working directory the same, but jump back in time one commit and unstage all of those changes, so you can add a smaller subset of the files, commit those, then add the others, etc.
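Here is a sketch of that splitting trick in a throwaway repo. The two files are hypothetical; the point is only that after 'git reset HEAD^' the files are still on disk, and can be committed again in smaller steps.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
git -c init.defaultBranch=master init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "initial commit"

# One commit that is too big: two unrelated files changed together.
echo "fix" > constructor.cpp              # hypothetical files
echo "docs" > README
git add constructor.cpp README
git commit -q -m "fix constructor and update readme"

# Step back one commit. The files stay as they are, but nothing is staged.
git reset -q HEAD^

# Commit again, this time in smaller logical steps.
git add constructor.cpp
git commit -q -m "fix broken constructor"
git add README
git commit -q -m "update readme"
git log --oneline
```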

More generally, using 'git reflog' you can see how to reset back any number of commits / commands.

You can also use 'git commit --amend' to edit the commit message of the most recent commit.

There are several more advanced cleanup features available with git, that will allow you to completely rewrite the history of your topic branch exactly as you want it, but for right now in the guide we'll defer any further discussion.

Every time you make changes, you will want to push to your origin to sync them up again. If you rewrote your history, then the push will be rejected, because the histories won't match up. Therefore, you will have to *force push* which will discard the old history and replace it with the new one:


   upstream
         
        
                    origin
              /-->  
           --/    
    local     


 git push --force origin great_new_feature

You can and should do this throughout your development if you want to look at the diffs of your changes as you work -- after all, the wesnoth source code is vast and it is easy to forget exactly where you are in your project. And you can keep cleaning up this way until you are satisfied. In this way, your github fork ends up being sort of like a personal web-based IDE plug-in / extension to help you with development.
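The rejected push and the force push can be seen in a scratch setup like the following sketch, where a local bare repo (fork.git, a made-up name) stands in for your github fork and the history rewrite is a simple 'git commit --amend'.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-ins: fork.git plays your github fork, local is your clone of it.
git -c init.defaultBranch=master init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/local"
cd "$sandbox/local"

git checkout -q -b great_new_feature
echo "feature" > feature.txt              # hypothetical file
git add feature.txt
git commit -q -m "add feeture"            # note the typo in the message
git push -q origin great_new_feature

# Rewrite history: fix the message of the last commit.
git commit -q --amend -m "add feature"

# A plain push is now rejected, because the histories no longer match.
if git push -q origin great_new_feature 2>/dev/null; then
    echo "unexpected: plain push succeeded"
else
    echo "plain push rejected, forcing"
    git push -q --force origin great_new_feature
fi
```

Git also has 'git push --force-with-lease', which refuses to overwrite the remote branch if it has moved since you last fetched it. This guide does not rely on it, but it is a more cautious option if anyone else might push to your branch.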

Making a pull request

When you are finally done, select your branch on your fork, and click the green button to create a pull request. By default, the target will be wesnoth-old:master, on the main repo. That's pretty much all there is to it -- github has a guide here: https://help.github.com/articles/creating-a-pull-request


   upstream
            <--\
                \--
                    origin
 
 
    local     


You'll then see your pull request appear in the "pull requests" section of the main wesnoth repo. At the time of writing this, we also have "travis-ci", a "continuous integration system" configured on our repo. Travis will automatically try to compile wesnoth with your changes, as a reality check before your pull request is merged. It is not mandatory for the travis build to pass though, and additionally you may still make changes to your pull request by pushing more changes to your fork. Travis *should* then try to compile the updated list of changes after this. You can even still force push to erase the content of the PR and replace it with new content -- however you shouldn't make your PR until you are ready for it to be merged.

Now you just have to wait for a member of the development team to review your pull request. You can often find someone on the irc channel, #wesnoth-dev. Additionally devs may comment directly on your PR with questions, so you might check it periodically for this.

Cleaning up at the end

Congratulations, your PR got merged! Now it's time to clean up and resync with the new changes. On the github page for your PR, you will now see a button with the option "delete this branch". Since the content of great_new_feature is now merged into master, this branch no longer needs to exist as you won't work on it anymore, and future work will go onto a different branch. So you should delete it, which will delete the branch on *origin*, your github-associated repo.

Since master has now changed and we want master to be synced up, you should now do step 1 again:


   upstream
      |
      |
      |             origin
      |          
      v      
    local


 git checkout master
 git pull upstream master

At this point git will understand that great_new_feature has been merged into master, so when you ask to delete this branch from local as well, it will do it:

 git branch -d great_new_feature

If you do these steps out of order, i.e. try to delete the branch before syncing master, git will refuse with an error saying that the branch is not fully merged.
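The effect of the ordering can be tried out with the sketch below. It assumes local bare repos as stand-ins for the main repo and your fork, and it simulates the merge button by pushing the branch from the fork onto upstream's master, which is not something you would do by hand on the real repos.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git -c init.defaultBranch=master init -q "$sandbox/seed"
git -C "$sandbox/seed" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$sandbox/seed" "$sandbox/upstream.git"
git clone -q --bare "$sandbox/upstream.git" "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/local"
cd "$sandbox/local"
git remote add upstream "$sandbox/upstream.git"

# A finished topic branch, pushed to the fork.
git checkout -q -b great_new_feature
echo "feature" > feature.txt              # hypothetical file
git add feature.txt
git commit -q -m "add great new feature"
git push -q origin great_new_feature

# Simulate the PR being merged: upstream master now contains the branch.
git -C "$sandbox/fork.git" push -q "$sandbox/upstream.git" great_new_feature:master

git checkout -q master
# Too early: local master does not contain the work yet, so -d refuses.
if ! git branch -d great_new_feature 2>/dev/null; then
    echo "refused: branch is not merged into local master yet"
fi

git pull -q upstream master               # upstream -> local
git branch -d great_new_feature           # now git agrees it is merged
git push -q origin master                 # local -> origin
```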

Finally, sync master on the origin as well:


   upstream
         
         
                    origin
              /-->  
           --/    
    local     


 git push origin master

That's the end of the development cycle, and now you can begin on your next patch. Note that throughout the cycle, information only flowed counter-clockwise:


   upstream
      |     <--\
      |         \--
      |             origin
      |       /-->  
      v    --/    
    local


If for some reason you find yourself doing something where information is flowing the other way, that is a good sign that something is going wrong, at least as far as this guide is concerned! (The exception is the steps in which we set up our repos -- don't overthink this.)


The ultimate cleanup power tool: git rebase

Using the techniques from the gitref guide, you can see how to use reset to undo mistakes you made by backing up and trying again, and for small commits that is probably fine. However, if you have a large and complicated series of changes, the better way to do this is using git rebase. Using git rebase in interactive mode, you can rewrite history by

  • reordering your commits
  • discarding unwanted commits
  • squashing commits together
  • editing the content changes represented by an individual commit
  • rewriting or editing commit messages

For our purposes, you will only need to use the form

 git rebase -i master

You can see some documentation for using this command here: https://help.github.com/articles/interactive-rebase

Using 'git rebase', you can make your commit history very *clean* -- instead of the history saying "I tried a bunch of things, undid some of them, tried some other things, and found something I liked", your commit history can basically be "I took the simplest and most logical route to the solution that was finally the best". When you review your commits, instead of feeling like a nasty blood-and-elbow-grease engineering project, it should feel like folding a perfect origami crane, to the extent possible ;)

But to reiterate, this has a purpose.

  1. Other devs need to be able to understand the history.
  2. It should be possible to jump back in time by checking out one of your commits and find ourselves in a sensible state when we do so.
  3. If something breaks or some feature becomes incompatible, it should be possible to roll back only a piece of the implementation of your feature without removing the whole thing.

If you want to use rebase to clean up your history, it is important to do it *before* it is pulled into wesnoth, as once that happens it is basically no longer possible. If we rewrite history in the main wesnoth repo, then afterwards whenever any dev tries to 'git pull upstream master', their git will give them strange merge errors, stemming from the incompatible history, and then they will become nervous, jump on irc and exclaim some variation of "omg wtf bbq". So except in extreme circumstances, we will never rebase the main wesnoth master, and for the same reason, we will never use "git push --force upstream master". That's why if you want to do these things, it is important to do them on your *fork* before it makes it onto master.

If you feel like it, you might read this, a mailing list email in which Linus explains about shared history of repos and when it is appropriate to use git rebase.

http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg39091.html

When you become comfortable with git rebase -i, it will greatly improve your workflow, as you will know that you can easily review and revise commit messages later, and easily reorder and squash commits. Typically I will now commit every single time I compile, and even if it is a commit which e.g. fixes a compile time error, or for some other reason I know it will be squashed in somewhere else, I give it the commit message "fixup", as a note to myself to mark it thusly in the first git rebase pass. (Actually, while writing this I have just learned about the --autosquash feature of git rebase which takes this idea further, good stuff there.)
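Here is a sketch of the "fixup" habit combined with --autosquash, in a throwaway repo with a made-up file. Normally 'git rebase -i' opens the list of commits in your editor. To keep the example non-interactive, it sets the GIT_SEQUENCE_EDITOR environment variable to 'true', which accepts the list exactly as --autosquash arranged it.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
git -c init.defaultBranch=master init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b great_new_feature
echo "int hitpoints" > unit.hpp           # hypothetical file, missing a semicolon
git add unit.hpp
git commit -q -m "add hitpoints field to unit"

echo "int hitpoints;" > unit.hpp          # the fix for the compile error
git add unit.hpp
git commit -q --fixup=HEAD                # marks this commit as a fixup of the previous one

# --autosquash moves the fixup under its target and folds it in.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master
git log --oneline master..great_new_feature
```

After the rebase the branch holds a single commit with the original message and the corrected file. When you run 'git rebase -i --autosquash master' yourself, without the environment variable, you get the same list in your editor and can still change it before saving.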

As a final note, git rebase can also be used to help resolve merge issues, should they arise. For example, suppose that you modify a file in the same place as someone else, and their change gets merged before yours. If git can't figure out how to merge then someone will have to fix it. One way that you could fix it is to resync your master to get the new changes locally, and then rebase your topic branch so that your changes are applied *after* the most recent changes.

To understand what is happening, we should understand a bit more about how git works. Intuitively, git is all about "snapshots" of the project content. Every commit represents a snapshot, and you can look at this snapshot by checking out the commit. Each commit also records its parent, so git can always compute the diff between a commit and the one before it. When you rebase your topic branch onto master, git takes the diff introduced by each of your commits and re-applies them, one at a time, on top of the current head of master. This updates the "point of departure" where your branch was created, so that in the history, it departs from the current head of master with the most recent changes.

If merging your branch with the current master would create a conflict, then when git rebase gets to the commit that creates the conflict, it will fail to apply that diff, stop, tell you about the problem, and ask you to resolve it. You can open up the offending file and go to the point where the conflict is happening, and git will have left a note for you marking the content it is having trouble with. The note will look similar to what happens in the gitref guide here, under "merge conflicts".

http://gitref.org/branching/#merge

You just have to remove the note, make the file look like it should to make everything compatible, and type

 git add .
 git rebase --continue

Then the rebasing process will continue, and at the end your branch should be compatible with master and ready to be merged in.
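A conflict like this is easy to provoke on purpose, which is a good way to practice. The sketch below uses a throwaway repo and an invented file, and uses plain 'git rebase master' rather than the interactive form so that it can run unattended. GIT_EDITOR=true is only there to accept the existing commit message without opening an editor.

```shell
set -eu
sandbox=$(mktemp -d)                      # scratch area, safe to delete afterwards
git -c init.defaultBranch=master init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo "greeting=hello" > settings.cfg      # hypothetical file
git add settings.cfg
git commit -q -m "add settings"

# Your topic branch changes the line...
git checkout -q -b great_new_feature
echo "greeting=hey" > settings.cfg
git commit -q -am "use a friendlier greeting"

# ...and meanwhile master changes the same line.
git checkout -q master
echo "greeting=hi" > settings.cfg
git commit -q -am "shorten greeting"

git checkout -q great_new_feature
if ! git rebase master >/dev/null 2>&1; then
    cat settings.cfg                      # shows the conflict note git left in the file
    echo "greeting=hey" > settings.cfg    # make the file look like it should
    git add settings.cfg
    GIT_EDITOR=true git rebase --continue # keep the commit message as it is
fi
```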

You can read more about this aspect of git rebase here if you like, although most likely you won't actually need more than we just talked about for wesnoth development.

http://git-scm.com/book/en/Git-Branching-Rebasing

Note that unlike Linus and github, in this guide we aren't thinking about your github fork repo as public -- we prefer to think of it as your personal private IDE/tool as I described earlier. If other wesnoth devs are cloning it or pulling your topic branches, we assume you will work it out with them. With that caveat the philosophies expressed on github and by Linus should generally apply to the wesnoth project as well.

That's all for this guide, have fun hacking on wesnoth!