User:Tephlon

From The Battle for Wesnoth Wiki
Revision as of 23:57, 19 June 2006 by Tephlon (wsync)

Very well then, I'll write something here.


Who, me?

I'm tephlon, or rather, that's my nick. Why tephlon? Well, my wife (since April 14th 2006 :)) seems to think that no problems ever stick to me, so she calls me "teflonmannen", which is "the teflon man" in Swedish. My own interpretation is that I have a non-stick memory. The "ph" instead of "f" is just... Well, I don't know. Leet? Whatever.

My real name is Stefan, and I was born on the very last day of 1976. I'm the maintainer of the Swedish translation and, actually, one of the moderators on the Translations & Internationalization forum, even though I don't usually do or say very much on the forums. I live in Göteborg -- which is "Gothenburg" in, at least, English-speaking countries -- in Sweden, which really is called Sverige. And that's not Switzerland. But you knew that. I hope.

Anyway, I've been the maintainer of the Swedish translation since September 2004, and for some reason, the progress of the Swedish translation has (unfortunately?) been closely coupled with my working career.

In 2004 I started my fourth year as a PhD student in astrophysics, and had for some time been really fed up with it. At some point during spring I was looking through the games at HappyPenguin.org, and I found Wesnoth. I liked it a lot from the start and recommended it to my SO, who, to my great astonishment, didn't hate computers so much that she didn't appreciate a good game. So we started playing.

In August the same year, I looked into the Translations forum and noticed that the translation was pretty much unmaintained, and since I couldn't care less about my project I decided to try and do something about the translation. In a short time, the Swedish translation team, which basically consisted of me and Sanna, managed to do some great work on the translation. In fact, the Swedish translation was the first translation at 100%, and on September 11 we could proudly announce the 0.8.4 release in Swedish.

Late in December 2004 I quit my grad student position, and was without a job. During this period, which lasted for 8 months from the beginning of 2005, I applied for 140 jobs, and in the meantime I worked on the translation.

Despite all the applications, I only got a job as a mailman, in August. As such I could often get home pretty early, so I could get some stuff done before my girlfriend got home from her job. Working as a mailman wasn't really what I wanted of course, but there wasn't much else to do. The Swedish job market really, really, really sucks for people with university degrees. However, at the beginning of September I got a phone call from a company regarding a position I had applied for in February(!). And since the beginning of October I've had a new job at an IT security company, in the one position I really wanted.

Unfortunately, this has proven somewhat bad for the translation. The Swedish translation can barely stay among the top 3-4 translations. Fortunately, there are a few more of us translators now, even though only my wife seems active :) I parry the minor updates on all campaigns and do all proof-reading. We'll see what happens when our first child arrives, sometime around August 27th :)

Thoughts on Translations

I've been meaning to put down my thoughts on translating a game such as this, but this seems a bit harder than I thought.

From the start, I've wanted the translation to be consistent throughout all the text domains. This might seem obvious, but it's harder than it sounds. The msgids in the po-files often come out of context, and when playing through a campaign it's not uncommon to stumble over some dialog which sounds really strange. So, I've come to view the translation process as three intertwined phases, or maybe sub-processes: bulk translation, proof-reading, and consistency checking.


Bulk translation

First, there is the work of getting the "bulk text" down. Choosing the word "bulk" might seem a bit condescending, but it's really not, it's just what it is. It is hard, and often boring, work; not very rewarding. During this phase the aim is just to get the strings translated; just going through one string after another, and translating.


Proof-reading

The second part is the proof-reading. Everything which is committed has been proofread at least once. This can be pretty quick, but sometimes one just gets stuck on some odd passage. At times it can be weeks before a translation of a single sentence, or even word, is finished, because of the three criteria a msgstr has to fulfill:

  • It has to have the same meaning as the msgid.
  • It has to sound good.
  • It has to be consistent with the translation as a whole.

The first point can be discussed forever; is a more or less literal translation the best, or a complete rewrite which in the end conveys the same message? The Swedish translation falls somewhere in between, and really depends on the second point. If a literal translation sounds good, it ought to be used. Sometimes this is not possible, however, so the passage translated has to be reorganized. Then this has to sound good.

What does "sound good" mean, then? Well, firstly, the translation has to use expressions which are actually used in the language one is translating to. Secondly, it has to be written in a way which displays the "rhythm" in what is actually written or perhaps rather -- if it is a dialog -- spoken. If it is a translation of a dialog between two or more characters, the translation has to sound like someone's actually talking.


Consistency checking

The third point is more administrative, since it basically just means looking up how a certain string (for instance a unit name) has been translated before. At times this too can be quite troublesome, for instance when it comes to words like Guard, Guardsman, Warder and Sentinel, since Swedish has a hard time distinguishing between these.
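
A lookup of this kind can be sketched in shell. This is only a minimal sketch and not one of the scripts further down: polookup is a hypothetical helper, it handles only entries whose msgid and msgstr each fit on a single line, and the sample file with its Swedish strings is invented for illustration rather than taken from the actual translation.

```shell
#!/bin/sh
# polookup TERM FILE...: print "msgid => msgstr (file)" for every entry whose
# msgid contains TERM, ignoring case. Hypothetical helper; wrapped multi-line
# entries and plural forms are not handled.
polookup () {
    term="$1"
    shift
    awk -v term="$term" '
        /^msgid "/  { id = substr($0, 8, length($0) - 8) }
        /^msgstr "/ {
            str = substr($0, 9, length($0) - 9)
            if (id != "" && index(tolower(id), tolower(term)) > 0)
                printf "%s => %s (%s)\n", id, str, FILENAME
        }
    ' "$@"
}

# Made-up sample data, only to show the helper at work.
sample=$(mktemp) || exit 1
cat > "$sample" <<'EOF'
msgid ""
msgstr ""

msgid "Guardsman"
msgstr "Gardist"

msgid "Sentinel"
msgstr "Vaktpost"

msgid "Royal Guard"
msgstr "Kunglig vakt"
EOF

# Lists the Guardsman and Royal Guard entries, but not Sentinel.
result=$(polookup guard "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Passing every sv.po of a branch as FILE arguments would show at a glance whether the same English term has ended up with different translations in different text domains.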


Lather, rinse, repeat

As I'm sure you understand, these three phases don't come linearly. They have to be mixed, minced and reiterated. The perfect translation is the one where you can't tell what is the original text and what is the translation; when you can hear a dialog as though someone is speaking inside your head; when you don't even think about it; when you don't notice what you read; when it effortlessly brings an image to mind, and enhances your own imagination of what is actually happening.


Translation tools

During the time I've maintained the translation, I've constantly developed and improved a few scripts that help me do this. I have two main scripts: the first one syncs my local source repository with the main repository, and the second picks out what I've changed since the last time I committed anything.


wsync

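#!/bin/bash
# bash, not sh: the script relies on "echo -e" and "[ a == b ]", which a plain
# POSIX sh need not support.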

ROOTDIR=$HOME/Wesnoth

SRCDIR=$ROOTDIR/Source
WORKDIR=$ROOTDIR/Translations


[ "x$ROOTDIR" == x ] && exit 1
[ "x$SRCDIR" == x ] && exit 1
[ "x$WORKDIR" == x ] && exit 1

[ -d $SRCDIR ] || mkdir -p $SRCDIR



trap cleanup SIGINT



cleanup () {
    echo -en "\nExiting: running svn cleanup for $SRCDIR/$BRANCH... " && ( cd $SRCDIR/$BRANCH && svn cleanup ) && echo "Done."
    exit 9
}



treesync () {
   [ $# -eq 2 ] || exit 1

   REPOS="$1"
   BRANCH="$2"

   echo -e "Syncing \033[01;32m$BRANCH\033[0m with \033[01;33m$REPOS\033[0m:\n"

   if [ -d $SRCDIR/$BRANCH ] ; then
      ( cd $SRCDIR/$BRANCH && svn update ) || ( echo -e "\nRunning svn cleanup and svn update:\n" && cd $SRCDIR/$BRANCH && svn cleanup && svn update )
   else
      ( cd $SRCDIR && svn checkout $REPOS $BRANCH )
   fi

   echo
}



quicksync () {
   [ $# -eq 2 ] || exit 1

   REPOS="$1"
   BRANCH="$2"

   echo -e "Quick-syncing \033[01;32m$BRANCH\033[0m with \033[01;33m$REPOS\033[0m:\n"

   if [ -d $SRCDIR/$BRANCH/po/ ] ; then
      ( cd $SRCDIR/$BRANCH/po/ && svn update ) || ( echo -e "\nRunning svn cleanup and svn update:\n" && cd $SRCDIR/$BRANCH/po/ && svn cleanup && svn update )
   else
      echo -e "\nRun a complete sync first - not a quick one!\n"
      exit 2 
   fi

   echo
}



posync () {
   BRANCH="$1"
   REPLACE="$2"

   for POFILE in $(find $SRCDIR/$BRANCH -name 'sv.po' | sort -s) ; do
      PODIR=${POFILE%*/sv.po}
      POTFILE="$(find $PODIR/ -name '*.pot')"
      DOMAIN="$(echo ${POTFILE##*/} | sed 's|\.pot$||')"

      if [ $(echo $POTFILE | grep -c .) -eq 0 ] ; then
         echo "No pot file found in $PODIR."
      else
         echo -en "Syncing \033[01;32m$BRANCH/$DOMAIN\033[0m: "

         [ -d $WORKDIR/$BRANCH/$DOMAIN ] || mkdir -p $WORKDIR/$BRANCH/$DOMAIN

         cp $POTFILE $WORKDIR/$BRANCH/$DOMAIN/sv.pot

         if $REPLACE ; then
            if [ $WORKDIR/$BRANCH/$DOMAIN/sv.po -nt $POFILE ] && ! cmp -s $WORKDIR/$BRANCH/$DOMAIN/sv.po $POFILE ; then
               tput hpa 50
               echo -n "WARNING! Work po-file was changed more recently than upstream po-file. Replace? [yN]"
               read ANSWER
               if [ "x$ANSWER" != "xy" ] ; then
                  tput hpa 50
                  echo "Skipping!"
                  continue                
               fi
            fi

            tput hpa 50
            echo -n "Replacing... "

            if [ -f $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po ] ; then
               cp $POFILE $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po
            fi

            cp $POFILE $WORKDIR/$BRANCH/$DOMAIN/sv.po

         else
            tput hpa 50
            echo -n "Merging... "

            if [ -f $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po ] ; then
               msgmerge -N -q --update $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po $POTFILE
            fi

            if [ -f $WORKDIR/$BRANCH/$DOMAIN/sv.po ] ; then
               msgmerge -N -q --update $WORKDIR/$BRANCH/$DOMAIN/sv.po $POTFILE
            else
               [ -d $WORKDIR/$BRANCH/$DOMAIN ] || mkdir -p $WORKDIR/$BRANCH/$DOMAIN
               msgmerge -N -q $POFILE $POTFILE > $WORKDIR/$BRANCH/$DOMAIN/sv.po
            fi
         fi

         tput hpa 50
         echo -n "Counting... "
         tput hpa 50
         msgfmt --statistics $WORKDIR/$BRANCH/$DOMAIN/sv.po -o /dev/null
      fi
   done

   echo
}



echo

MODE=treesync
REPLACE=false

while [ "x$1" != "x" ] ; do
    if echo "$1" | grep -q r ; then
        echo -e "WARNING! Will replace po-files!\n"
        REPLACE=true
    fi

    if echo "$1" | grep -q q ; then
        MODE=quicksync
    fi

    shift
done


eval "$MODE http://svn.gna.org/svn/wesnoth/trunk Wesnoth-trunk"
eval "$MODE http://svn.gna.org/svn/wesnoth/branches/1.0 Wesnoth-1.0"
treesync svn://svn.berlios.de/wescamp-i18n Wescamp

posync Wesnoth-trunk $REPLACE
posync Wesnoth-1.0 $REPLACE
posync Wescamp $REPLACE
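
The argument handling in wsync is loose: any argument containing the letter r switches on replacing and any argument containing q selects the quick sync, so "wsync qr" and "wsync -q -r" behave the same, and a stray word such as "replace" also triggers replacing. A stricter variant could look like the sketch below. Note that parse_flags is a hypothetical name that does not exist in the script, and the sketch only prints the resulting settings instead of running a sync.

```shell
#!/bin/sh
# parse_flags ARG...: hypothetical stricter replacement for the grep-based
# loop in wsync. Accepts only the letters q and r (with an optional leading
# dash) and prints the resulting "MODE REPLACE" pair.
parse_flags () {
    mode=treesync
    replace=false
    for arg in "$@"; do
        letters=${arg#-}
        case "$letters" in
            ''|*[!qr]*)
                echo "unknown option: $arg" >&2
                return 1
                ;;
        esac
        case "$letters" in *q*) mode=quicksync ;; esac
        case "$letters" in *r*) replace=true ;; esac
    done
    echo "$mode $replace"
}

parse_flags            # prints "treesync false"
parse_flags q          # prints "quicksync false"
parse_flags -qr        # prints "quicksync true"
parse_flags fast || echo "rejected"
```

Rejecting unknown letters matters mostly for r, since replacing overwrites the work po-files.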

wcommit

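#!/bin/bash
# bash, not sh: the script relies on "echo -e" and "[ a == b ]", which a plain
# POSIX sh need not support.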

ROOTDIR=$HOME/Wesnoth

SRCDIR=$ROOTDIR/Source
WORKDIR=$ROOTDIR/Translations
COMMITDIR=${WORKDIR}/Commit


[ "x$ROOTDIR" == x ] && exit 1
[ "x$SRCDIR" == x ] && exit 1
[ "x$WORKDIR" == x ] && exit 1
[ "x$COMMITDIR" == x ] && exit 1

[ -d $COMMITDIR ] || mkdir -p $COMMITDIR
[ -d $COMMITDIR/Old ] || mkdir -p $COMMITDIR/Old

mv $COMMITDIR/*.tgz $COMMITDIR/Old 2> /dev/null
mv $COMMITDIR/*.tbz2 $COMMITDIR/Old 2> /dev/null

TEMPFILE=$(mktemp -t po-XXXXXX) || exit 2

TODAY=$(date +%Y%m%d)




################################################################################
# The podiff function below is a hideous hack to make sure that the work 
# po-file really differs from the source and also from what was last committed.
# The function returns false if only the header (msgid "" on top of the 
# po-file) and obsolete (#~) entries differ.
#
# A lot of the ugliness below comes from the crippledness of gettext's msg-
# functions...
#
# So, we want to compare the work po-file with what was last committed and the
# source, but not take differing headers or obsolete entries into account. How
# do we do this? Or rather, how do we _have_ to do it?
# 
# To be able to omit the headers we _have_ to use msgcomm as this is the only 
# gettext function that has an option for this. Then, to not take obsolete 
# entries into account, we have to use the msgattrib function, as this is the 
# only gettext function that takes _this_ into account.
################################################################################

podiff () {
    WORKPO="$1"
    SOURCEPO="$2"
    LASTCOMMITPO="$3"

    ################################################################################
    # Check first if the work po-file has not been altered at all. If it hasn't, 
    # it is completely unnecessary to continue. STATE is true if the work po-file 
    # should be committed.
    ################################################################################

    if diff -q $WORKPO $LASTCOMMITPO > /dev/null 2>&1 || diff -q $WORKPO $SOURCEPO > /dev/null ; then
        STATE=false
    else
        CMPDIR=$WORKDIR/.Compare
        ERRORLOG=/dev/null

        WORKCLONE=$CMPDIR/work-clone.po
        SOURCECLONE=$CMPDIR/source-clone.po
        LASTCOMMITCLONE=$CMPDIR/lastcommit-clone.po

        WORKSTRIPPED=$CMPDIR/work-stripped.po
        SOURCESTRIPPED=$CMPDIR/source-stripped.po
        LASTCOMMITSTRIPPED=$CMPDIR/lastcommit-stripped.po

        if [ -d $CMPDIR ] ; then
            rm $WORKCLONE    $SOURCECLONE    $LASTCOMMITCLONE    2> /dev/null
            rm $WORKSTRIPPED $SOURCESTRIPPED $LASTCOMMITSTRIPPED 2> /dev/null
        else
            mkdir -p $CMPDIR
        fi

        ################################################################################
        # First, we create clones of these three files, since msgcomm complains if the 
        # same file is used as input files.
        #
        # Next, we use msgcomm with the --omit-header option on the original po-file and
        # its clone. It is convenient to remove the obsolete entries with msgattrib and
        # --no-obsolete in this step, as this can be done in a pipe.
        # 
        # There is one important fact here. If --omit-header is used, msgcomm can't cope
        # with extended characters. The same is true for msgattrib and --no-obsolete.
        # This can be remedied by using the --escape/-E option on both msgcomm and 
        # msgattrib.
        ################################################################################

        cp $WORKPO       $WORKCLONE
        cp $SOURCEPO     $SOURCECLONE
        cp $LASTCOMMITPO $LASTCOMMITCLONE

        msgcomm -E --omit-header $WORKPO       $WORKCLONE       2> $ERRORLOG | msgattrib -E --no-obsolete -o $WORKSTRIPPED       2> $ERRORLOG
        msgcomm -E --omit-header $SOURCEPO     $SOURCECLONE     2> $ERRORLOG | msgattrib -E --no-obsolete -o $SOURCESTRIPPED     2> $ERRORLOG
        msgcomm -E --omit-header $LASTCOMMITPO $LASTCOMMITCLONE 2> $ERRORLOG | msgattrib -E --no-obsolete -o $LASTCOMMITSTRIPPED 2> $ERRORLOG

        ################################################################################
        # NOW we can do the real comparison. diff -q returns true if the files are 
        # _equal_. STATE is true if the work po-file should be committed.
        ################################################################################

        if diff -q $WORKSTRIPPED $LASTCOMMITSTRIPPED > /dev/null 2>&1 || diff -q $WORKSTRIPPED $SOURCESTRIPPED > /dev/null ; then
            STATE=false
        else 
            STATE=true
        fi

        rm $WORKCLONE    $SOURCECLONE    $LASTCOMMITCLONE    2> /dev/null
        rm $WORKSTRIPPED $SOURCESTRIPPED $LASTCOMMITSTRIPPED 2> /dev/null
    fi

    $STATE
}



pocommit () {
   BRANCH="$1"

   TARBALL="thisshouldnevermatchafilenameandseeificareifitwouldanyway"

   echo -e "Checking \033[01;33m$BRANCH\033[0m:"

   for POFILE in $(find $SRCDIR/$BRANCH -name 'sv.po' | sort -s) ; do
      POTFILE="$(find ${POFILE%*/sv.po}/ -name '*.pot')"
      DOMAIN="$(echo ${POTFILE##*/} | sed 's|\.pot$||')"

      echo -en "    \033[01;32m$DOMAIN\033[0m"
      tput hpa 40

      [ -d $WORKDIR/.LastCommit/$BRANCH/$DOMAIN ] || mkdir -p $WORKDIR/.LastCommit/$BRANCH/$DOMAIN
      [ -f $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po ] || touch $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po

      ################################################################################
      # I have forgotten why I do "-o $TEMPFILE" below instead of "--update". Let's 
      # ponder...
      #
      # The reason for doing this msgmerge in the first place, is to make the 
      # .LastCommit and the work po-files have correct (the same) line-wrapping, so as 
      # to make a proper comparison in the following long if statement.
      #
      # So, my guess is that I - once upon a time - realized that msgmerge doesn't 
      # touch the po-file if it is up-to-date with the pot-file, and I want to correct 
      # the line-wrapping regardless of whether the po-file matches the pot-file or 
      # not. In fact, it is irrelevant if the po- and pot-files are matching. What 
      # _is_ relevant, is to make a proper comparison between the .LastCommit po-file 
      # and the work po-file.
      #
      # What does line-wrapping have to do with anything? Well, kbabel doesn't wrap 
      # the msgstr lines at all, whereas msgmerge does line-wrapping. And every 
      # po-file that is submitted to the SVN is msgmerged - or at least it is bound to 
      # be when the next pot-update is being made.
      ################################################################################

      msgmerge -q -o $TEMPFILE $WORKDIR/$BRANCH/$DOMAIN/sv.po $POTFILE && mv $TEMPFILE $WORKDIR/$BRANCH/$DOMAIN/sv.po || exit 3
      msgmerge -q -o $TEMPFILE $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po $POTFILE && mv $TEMPFILE $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po || exit 3

      if [ -f $WORKDIR/$BRANCH/$DOMAIN/sv.po ] && podiff "$WORKDIR/$BRANCH/$DOMAIN/sv.po" "$POFILE" "$WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po" ; then
         if [ -e "$TARBALL.tar" ] ; then
            TARFLAGS="rf"
         else
            TARFLAGS="cf"

            NUMBER=$(find $COMMITDIR/Old/ -iname "$BRANCH-sv-$TODAY-*.tbz2" -printf '%f\n' | sort -g | tail -1 | rev | cut -f1 -d'-' | rev | cut -f1 -d'.')
            [ x$NUMBER == x ] && NUMBER=0
            NUMBER=$((NUMBER + 1))
            TARBALLNAME="$(echo $BRANCH-sv-$TODAY-$NUMBER | tr '[:upper:]' '[:lower:]')"
            TARBALL="$COMMITDIR/$TARBALLNAME"
         fi

         echo -en "Adding to \033[01;33m$TARBALLNAME.tbz2\033[0m: "
         msgfmt --statistics $WORKDIR/$BRANCH/$DOMAIN/sv.po -o /dev/null

         ( cd $WORKDIR/$BRANCH && tar $TARFLAGS $TARBALL.tar $DOMAIN/sv.po ) && cp $WORKDIR/$BRANCH/$DOMAIN/sv.po $WORKDIR/.LastCommit/$BRANCH/$DOMAIN/sv.po

      else
         echo "No changes."
      fi
   done

   if [ -e "$TARBALL.tar" ] ; then
      bzip2 "$TARBALL.tar" && mv "$TARBALL.tar.bz2" "$TARBALL.tbz2"
   else
      echo "No changes in $BRANCH."
   fi

   echo
}



restyle_wescamp () {
    for TARBALLPATH in $(find $COMMITDIR -maxdepth 1 -name 'wescamp-*') ; do
        TARBALL="${TARBALLPATH##*/}"

        TEMPDIR=$(mktemp -d -t wescamp-XXXXXX) || exit 3
        mkdir $TEMPDIR/Original
        mkdir $TEMPDIR/NewStyle

        tar xj -C $TEMPDIR/Original/ -f $TARBALLPATH

        for POPATH in $(find $TEMPDIR/Original -name 'sv.po') ; do
            DOMAIN="$(echo $POPATH | rev | cut -f2 -d'/' | rev)"
            WDIR=$TEMPDIR/NewStyle/${DOMAIN#wesnoth-*}-po/po/

            mkdir -p $WDIR
            cp $POPATH $WDIR
        done

        ( cd $TEMPDIR/NewStyle/ && tar cjf $TARBALL * )
        mv $TEMPDIR/NewStyle/$TARBALL $TARBALLPATH
        rm -rf $TEMPDIR
    done
}



pocommit Wesnoth-trunk
pocommit Wesnoth-1.0
pocommit Wescamp

restyle_wescamp
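
The idea behind podiff, comparing two po-files while ignoring the header and the obsolete entries, can be illustrated without the gettext tools. The following is a simplified sketch and not a replacement for the function above: strip_po and same_entries are hypothetical names, the header is assumed to be the first blank-line-separated paragraph, obsolete entries are assumed to be the lines starting with "#~", and no line-wrapping is normalised. The sample files, including their dates and Swedish strings, are made up.

```shell
#!/bin/sh
# strip_po FILE: print FILE without its header entry (taken here to be the
# first blank-line-separated paragraph) and without obsolete "#~" lines.
# Simplified stand-in for the msgcomm/msgattrib pipeline.
strip_po () {
    awk '
        !past_header { if ($0 == "") past_header = 1; next }
        /^#~/        { next }
        { print }
    ' "$1"
}

# same_entries A B: succeed if A and B differ at most in their header and
# obsolete entries. Command substitution drops trailing blank lines.
same_entries () {
    a=$(strip_po "$1")
    b=$(strip_po "$2")
    [ "$a" = "$b" ]
}

# Made-up sample data: same live entry, different header, one obsolete entry.
dir=$(mktemp -d) || exit 1
cat > "$dir/work.po" <<'EOF'
msgid ""
msgstr ""
"PO-Revision-Date: 2006-06-19 23:57+0200\n"

msgid "Sentinel"
msgstr "Vaktpost"

#~ msgid "Old string"
#~ msgstr "Gammal text"
EOF
cat > "$dir/source.po" <<'EOF'
msgid ""
msgstr ""
"PO-Revision-Date: 2006-05-01 12:00+0200\n"

msgid "Sentinel"
msgstr "Vaktpost"
EOF

if same_entries "$dir/work.po" "$dir/source.po"; then
    verdict="nothing to commit"
else
    verdict="commit"
fi
echo "$verdict"   # prints "nothing to commit"
rm -rf "$dir"
```

The real function goes through msgcomm and msgattrib instead, which parse the po-files properly rather than treating them as lines of text.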