Translation Maintenance Commands

From The Battle for Wesnoth Wiki

Foreword

This page documents the commands currently used to update translation files that are sent in. It also lists the steps required to add new campaigns (importing from wescamp and changing textdomains) and new languages.

This version reflects what was required as of 1.6.1. It assumes autotools/configure usage, but should be easily adaptable to cmake and scons, too.

Prerequisites

  • svn checkout of Wesnoth with commit privs (eg svn+ssh://USERNAME@svn.gna.org/svn/wesnoth/branches/1.6/ )
  • autotools: autoconf and automake, as required for building with autotools; when relying on a different build system, cmake or scons is required instead
  • gettext: required for the game anyway; some of its msg* tools are also used (exactly which ones is handled by the build system)
  • po4a: required for documentation generation like manual and manpages, not for normal updates
  • docbook: required to generate the manual, not for normal updates; not sure which parts of docbook are needed, these are the ones I have currently installed on my gentoo box:
    • app-text/build-docbook-catalog-1.4
    • app-text/docbook-xml-dtd-4.2-r2
    • app-text/docbook-xml-dtd-4.4-r1
    • app-text/docbook-xml-dtd-4.5
    • app-text/docbook-xsl-stylesheets-1.74.0
  • a terminal, preferably with bash available; the commands posted here work nicely with bash and are easily put into a shell script to automate things.

After getting the svn checkout, run ./autogen.sh in the root dir of the checkout to make sure that all the links and make targets are created. This requires a complete and working build environment.

Folder structure and files relevant for translations

Stuff relevant for translations is basically in po/. In there all the translation catalogs as well as the translation files and some "helper files" are placed.

In po/ each subfolder stands for a different textdomain:

  • the folder wesnoth/ stands for the textdomain wesnoth (includes many basic things)
  • the folder wesnoth-utbs/ stands for the textdomain wesnoth-utbs and includes all the strings only relevant for the campaign Under the Burning Suns
  • ...

In each folder several files are available:

TEXTDOMAIN.pot

Catalog with all the original strings. It includes no translated strings; a copy of this file is what the gettext tools use to start a new translation. This is the file that is updated in a so-called pot-update. All po files are merged against this catalog when they are updated; only the English strings in this file determine what has to be in the po files as "original string". The file is named after the folder it is in.

*.po

These are the "real" translation files. They are named after the language they belong to, based on its "langcode". E.g. the file for the German translation is named de.po. Translators work with these files, and the .mo files are created from them when "compiling". The files consist of a header, an area of "active strings" (starting right after the header) and an area for "old and now unused" strings (commented out using lines starting with #~). All lines starting with a # are comments, though there are various kinds of comments. These kinds are defined by gettext; here is a short explanation of them and their syntax, where the beginning of the line defines the type:

  • #. My Text: My Text is some automatically extracted comment from the sources (translation hints, status stuff, all this foo).
  • #, fuzzy: This string was seen as "reasonably close but not identical" when the file was updated. Translators should check this string and remove this line if the translation is correct. Unless this is done, the string is not shown translated in the program. This is the complete content of the line.
  • #: path/to/file:1: File reference. This gives a hint on where in the files this string was extracted from, using a filename:linenumber syntax. Only one file per line is listed, but several of those lines can be listed per string.
  • # My comment: My comment is a "user-created comment", i.e. a comment the translator added. Anything the translator wants can be in here...
  • #~: Stuff following after #~ is not active in the corresponding .pot file. Those are mainly old strings that were removed and are kept in case they get back into the game.
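The comment markers above make it easy to get a rough overview of a po file with plain grep. The following is only a sketch: the function name po_comment_stats is made up for this page, and it counts lines, not entries (an obsolete entry usually spans at least two #~ lines).

```shell
# po_comment_stats FILE: count fuzzy flag lines and obsolete (#~) lines in a po file.
# Hypothetical helper, not part of the Wesnoth tools.
po_comment_stats() {
    # "#, fuzzy" may be combined with other flags, e.g. "#, fuzzy, c-format"
    fuzzy=$(grep -c '^#,.*fuzzy' "$1" || true)
    obsolete=$(grep -c '^#~' "$1" || true)
    echo "fuzzy=$fuzzy obsolete=$obsolete"
}
```

Called as po_comment_stats po/wesnoth/de.po, it prints both counts on one line.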

In general, a translation entry in the po file consists of (several of) those comments and two other blocks:

  • msgid "Original String": the original string that the translators have to work on.
  • msgstr "Translated Version": the string that is used as replacement when the translation this file belongs to is in use and an occurrence of "Original String" is met (where the original has to be marked as translatable!).

In general each of those strings can span several lines, but to have a newline in the translation (or original) an explicit \n has to be added. The only constant thing is that strings have to be enclosed in " and only the stuff inside the " counts. In addition to those "normal forms" the strings can also exist as plurals; here is an example from the game and a corresponding German translation ([0] means singular, [1] means first plural form, [2] means 2nd plural form and so on):

#: src/actions.cpp:2485
msgid "Friendly unit sighted"
msgid_plural "$friends friendly units sighted"
msgstr[0] "Verbündete Einheit gesichtet"
msgstr[1] "$friends verbündete Einheiten gesichtet"

One more thing users should be aware of: The translated and original strings have to start *and* end with the same number of \n. A mismatch in those numbers is the most common cause of errors.
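A quick check for this kind of mismatch can be sketched with awk. This is a simplified illustration, not what the gettext tools do: the name check_newlines is invented here, only single-line, non-plural entries are handled, and it only tests whether a leading or trailing \n is present at all, not how many there are.

```shell
# check_newlines FILE: report msgid/msgstr pairs whose leading or trailing \n differ.
# Assumption: single-line, non-plural entries only; presence is checked, not the count.
check_newlines() {
    awk '
        /^msgid "/ {
            id = $0; sub(/^msgid "/, "", id); sub(/"$/, "", id); next
        }
        /^msgstr "/ {
            str = $0; sub(/^msgstr "/, "", str); sub(/"$/, "", str)
            if (str == "") next                  # untranslated: nothing to compare
            lead  = (id ~ /^\\n/) != (str ~ /^\\n/)
            trail = (id ~ /\\n$/) != (str ~ /\\n$/)
            if (lead || trail) printf "%d: newline mismatch\n", NR
        }
    ' "$1"
}
```

The number printed is the line of the offending msgstr. Compiling the file as described further below remains the authoritative check.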

LINGUAS

This file includes a space separated list of all langcodes supported. As of 1.6.1 this is the current content of LINGUAS (changes when new langs are added or old ones are removed):

af ar bg ca ca_ES@valencia cs da de el en_GB es eo et eu fi fr fur_IT gl he hr hu id is it ja ko la lt lv mk mr nl nb_NO pl pt pt_BR racv ro ru sk sl sr sr@latin sv tl tr zh_CN zh_TW

A LINGUAS file exists in every subfolder of po/. The one right in po/ is used by scons and cmake, the one in each textdomain dir is used by autotools (no guarantee that this is really correct, but it should be...). Basically this is used to have a list of files that have to be compiled and such when building the game.

FINDCFG

This file in each textdomain dir includes the find term used to find all WML files relevant for this domain. Main task is to reduce time when running a pot-update so that not all files are checked, but only the relevant ones. This is mainly used for campaigns to only check the campaigns folder and stuff like this. Example:

find data/core/units -name '*.cfg' -print
find data/campaigns/*/units -name '*.cfg' -print
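To preview which WML files a textdomain would pick up, the file can be run as a small shell script from the checkout root. This is a sketch under two assumptions that are not confirmed on this page: that FINDCFG contains nothing but plain shell commands, and that its paths are relative to the checkout root. How the build system itself invokes the file is not covered here, and list_domain_cfgs is a made-up name.

```shell
# list_domain_cfgs CHECKOUT TEXTDOMAIN: print the WML files selected by FINDCFG.
# Hypothetical helper; assumes FINDCFG holds plain shell commands with paths
# relative to the checkout root.
list_domain_cfgs() {
    ( cd "$1" && sh "po/$2/FINDCFG" )
}
```

The subshell keeps the cd from affecting the calling shell.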

POTFILES.in

This file in each textdomain dir lists all the C++ files that belong to this domain. It is just a plain list of files and nothing more.

src/config_cache.cpp
src/construct_dialog.cpp
src/filechooser.cpp
src/font.cpp
src/game_preferences.cpp
src/game_preferences_display.cpp

Makevars

This file in each textdomain dir holds some variables used by autotools when generating files. Don't ask what exactly it is for; in general it does not have to be touched by humans at all. The only thing you should make sure of is to add correct entries for textdomains in the DOMAIN = lines.

CMakeLists.txt

Providing information for cmake to generate the required commands. Identical for basically all textdomains.

remove-potcdate.sin

Some strange file available in each textdomain dir. Used by the tools, no idea how it is done and if it is really needed.

Generated files

Basically all the other files not listed here are generated by the build systems (or are specific to documentation/po4a related stuff, described below). That especially includes *.mo or *.gmo files, which contain the "compiled" translations in a form that gettext can use directly ingame.

Documentation/po4a specific stuff

Makefile

po4a based textdomain only (normally autogenerated by autotools)

A hardcoded makefile for po4a stuff. This should eventually be replaced by a generated file from eg cmake. Maybe ettin knows more about this stuff...

TEXTDOMAIN.cfg

po4a based textdomain only

A file with some po4a based config, also duplicating some stuff that normally would be available eg in LINGUAS. This should eventually be changed in some way, too. ATM it is there because, hmm, it is there... Some po4a specific variables are configured here, maybe ettin knows more about this stuff...

manpages

The manpages are placed in doc/man/. Currently the originals are wesnoth.6 and wesnothd.6. Translations are generated from po/wesnoth-manpages/ and placed in doc/man/LANGCODE/ after they are generated using po4a. When new files are generated, they have to be added to svn ( svn add doc/man/*/*.6 ) and also to the autotools based makefile to have them installed (at the top of doc/man/Makefile.am ). Manpages are only created if at least 80% of the strings of the respective manpages are translated.
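The 80% rule is simple integer arithmetic. As a sketch (keeps_manpage is a name invented here, and the counts have to come from somewhere else, e.g. the numbers printed by msgfmt --statistics), the decision looks like this:

```shell
# keeps_manpage TRANSLATED TOTAL: succeed if at least 80% of the strings are translated.
# Hypothetical helper; po4a applies its own threshold when generating the files.
keeps_manpage() {
    [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -ge $(( $2 * 80 )) ]
}
```

For example, keeps_manpage 80 100 succeeds, while keeps_manpage 79 100 fails.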

html manual

The base file for the manual is doc/manual/manual.txt. Using po4a together with docbook, the files doc/manual/manual.LANGCODE.html are generated. After generation, the files have to be added to svn, too ( svn add doc/manual/manual.*.html ). Translations can also come with their own images; those should be placed in doc/manual/images/LANGCODE/ .

pot-update

In a pot-update all the reference files (po/wesnoth*/*.pot) are updated and the respective .po files are updated against those reference files. The commands I use to update all pot files and the corresponding .po files are:

cd $CHECKOUT
svn up
cd po/
make update-po

While doing so look out for any error messages from the gettext tools. Afterwards just commit stuff with a plain "svn ci". The "svn up" is only needed to be sure that all files are up to date (the ones with the original strings as well as the translation files).

regenerating doc files

To update the po files for documentation and generate the according translated manpages and manuals, use these commands:

cd $CHECKOUT
svn up
make update-po4a

Be aware that this requires po4a as well as docbook and takes considerable time. After running this command you should run svn st doc/ to check if any files were added or removed (? and ! markers). Those files have to be added to svn (all with ?) or removed from svn (those with !, meaning that they were removed because they were not complete enough anymore). Just do so and commit afterwards using svn ci doc/ po/ (the po/pot files as well as the doc files can/should change).
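Turning the svn st output into the matching svn commands can be sketched as a small filter. The name svn_doc_actions is made up, the filter only prints the commands instead of running them, and it assumes paths without spaces.

```shell
# svn_doc_actions: read "svn st" output on stdin and print the svn commands
# for new (?) and missing (!) files. Nothing is executed.
# Hypothetical helper; assumes paths contain no whitespace.
svn_doc_actions() {
    awk '$1 == "?" { print "svn add " $2 }
         $1 == "!" { print "svn rm " $2 }'
}
# Typical use from the checkout root:
#   svn st doc/ | svn_doc_actions
```

Review the printed commands before running them, since a ? can also mark stray files that should not be committed.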

It can happen that files contain errors that are only shown when actually trying to regenerate the doc files. Those have to be fixed in the respective .po files to fix the problems. Doing so requires inspection "by hand" to find the real cause of the problem and fix all occurrences (rather often the case for the man pages with their strange syntax for highlighting).

updating po files for committing

When getting po files for committing them you have to ensure some things:

  • make sure that the files are in the correct lineending (since svn:eol-style is set to "native" for all po-files)
  • run pofix.py on the files to make sure that the latest "tiny typo fixes" are included in the po file and no unnecessary fuzzy strings are created
  • update the files using make LANGCODE.po-update in the textdomains dir
  • check that the files really compile using make LANGCODE.gmo in the textdomains dir (in general already the step before should end with an error, but with this step stats are shown about the file as well as an error if the file is broken)
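For the first point, a file with DOS line endings can be spotted before it is copied anywhere. A minimal sketch (has_crlf is a hypothetical name) that only looks for carriage return characters:

```shell
# has_crlf FILE: succeed if FILE contains at least one carriage return.
# Hypothetical helper for spotting DOS line endings.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}
```

Something like has_crlf de.po && echo "de.po needs dos2unix" then tells you whether a conversion is needed.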

I use a rather simple system to handle translation updates. It consists of an extra folder that holds all the po files from po/ in their folder structure and nothing else. Let's call this folder $LANGDIR (absolute path used!) from now on. It contains a folder for each textdomain (wesnoth/, wesnoth-anl/, ...) with all the *.po files for that domain, and serves as an intermediate step for renaming and the like. Besides this there is the checkout dir, let's call it $UPDIR (absolute path used!). When receiving an update, the archive (whatever format it is: .zip, .tar.bz2, .7z or anything else) is extracted and the TEXTDOMAIN/ folders are copied over to $LANGDIR. In the worst case files have to be copied over by hand because translators might use some crude naming scheme. Those following the naming guideline only require a cp -r wesnoth* $LANGDIR/.

Next step is changing into the folder, ensuring the lineendings (unix is native on my box, thus using dos2unix) and copying the files into the "real" dir. This whole stuff is basically dependent on the single langcode you want to update. In a script LANGCODE can be a plain param. When using bash, you can just use this for loop:

cd $LANGDIR
for i in wesnoth* ; do cd $i; dos2unix LANGCODE.po; cp LANGCODE.po $UPDIR/po/$i/LANGCODE.po;  cd .. ; done

When not using bash, just copy over the files for each dir by hand.
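A slightly more defensive variant of the copy loop is sketched below. It is not the script actually in use: copy_po is a made-up name, the three values are passed as parameters, tr stands in for dos2unix, and textdomains that lack the file or the target folder are skipped with a message instead of producing errors.

```shell
# copy_po LANGCODE LANGDIR UPDIR: normalise line endings and copy LANGCODE.po
# from every textdomain folder in LANGDIR into UPDIR/po/.
# Hypothetical helper; tr -d '\r' stands in for dos2unix.
copy_po() {
    for dir in "$2"/wesnoth*/; do
        domain=$(basename "$dir")
        src="$dir$1.po"
        [ -f "$src" ] || { echo "skipping $domain: no $1.po" >&2; continue; }
        [ -d "$3/po/$domain" ] || { echo "skipping $domain: not in checkout" >&2; continue; }
        tr -d '\r' < "$src" > "$3/po/$domain/$1.po"
    done
}
```

Quoting the variables keeps the loop working when $LANGDIR or $UPDIR contain spaces.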

The next step is to update the files with pofix.py, merge against the latest catalog and build for stats/correctness sake ($2 is the LANGCODE here, as it is in the complete script further below):

cd $UPDIR/po/
for i in wesnoth* ; do cd $i; $UPDIR/utils/pofix.py $2.po; make $2.po-update; make $2.gmo; cd .. ; done

When not using bash, do so by hand or write yourself your own script...

When all of the commands were successful, just commit the files using svn ci, but make sure that the language is already added in changelog as well as players_changelog. With these steps the doc files are *not* regenerated, since regenerating doc files takes quite some time. This is only done every now and then for all the doc files (mainly when a pot-update is run, too).

To sum things up, here are all the commands used ($2 is the LANGCODE; commit afterwards if no error occurred and everything is as it should be):

echo "executing update script"
echo "switching to "$LANGDIR
cd $LANGDIR
echo "copying po-files"
for i in wesnoth* ; do cd $i; dos2unix $2.po; cp $2.po $UPDIR/po/$i/$2.po;  cd .. ; done
echo "switching to "$UPDIR"/po"
cd $UPDIR/po/
echo "updating po-files"
for i in wesnoth* ; do cd $i; $UPDIR/utils/pofix.py $2.po; make $2.po-update; make $2.gmo; cd .. ; done
echo "update complete"

adding a new Language

For adding a new language several steps have to be done:

  • add the LANGCODE to the list in all LINGUAS files (po/LINGUAS as well as po/wesnoth*/LINGUAS )
  • edit po/wesnoth-man*/wesnoth-man*.cfg to have the LANGCODE added here, too
  • generate the po files for the language using this command:
cd po/
for i in wesnoth*; do cd $i; msginit -l LANGCODE --no-translator; cd ..; done
  • make sure the files really compile (basically check if the plural forms and the rest of the header is fine by trying to compile everything normally; a problematic case is eg "Project-Id-Version: PACKAGE VERSION\n" in the textdomain wesnoth)
  • add data/languages/langcode_COUNTRYCODE.cfg (should look like the other files, too; required to have the file in the lang selection list ingame)
  • add the language to data/core/about.cfg (can be a dummy entry for the moment; this is mainly for other langs to have this string among their translatable strings as soon as possible)
  • add all the new files to svn and commit (from the checkouts root):
svn add po/wesnoth*/LANGCODE.po data/languages/langcode_COUNTRYCODE.cfg
svn ci po/wesnoth*/LANGCODE.po po/*/LINGUAS po/LINGUAS data/languages/langcode_COUNTRYCODE.cfg data/core/about.cfg
  • update the language list of g.w.o: This requires getting a checkout of branches/resources/gettext.wesnoth.org/public_html/wesnoth-gettext/westats/. Make sure to add the language to langs.php in this checkout, commit and ping someone with ssh access to wesnoth.org to run svn up as the user wesnoth inside $HOME/SOURCE/gettext.wesnoth.org to have the website updated, too.
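The first step, adding the LANGCODE to every LINGUAS file, can be scripted. The following is a sketch only: add_lang is a name invented here, it assumes the space separated format described above, appends the new code at the end instead of sorting it in, and rewrites each file as a single line.

```shell
# add_lang LANGCODE FILE...: append LANGCODE to each space separated LINGUAS file
# unless it is already listed. Hypothetical helper, not part of utils/.
add_lang() {
    code=$1; shift
    for f in "$@"; do
        # already listed? then leave the file alone
        if tr -s ' ' '\n' < "$f" | grep -qxF -- "$code"; then
            continue
        fi
        list=$(tr -s '\n' ' ' < "$f" | sed 's/ *$//')
        printf '%s %s\n' "$list" "$code" > "$f"
    done
}
```

From the checkout root it would be called as add_lang LANGCODE po/LINGUAS po/wesnoth*/LINGUAS; check the result with svn diff before committing.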

adding translation files for a new campaign

This is basically the case when importing a new campaign from wescamp. The commands/steps used are:

  • Make sure you got an up to date checkout of wescamp (search the wiki for info about this).
  • cd into the wesnoth checkout dir
  • run this command to import the textdomain from the wescamp repo (the resulting script directly wants to commit, if this is not wanted, comment the respective part out in import_script.sh):
./utils/wescamp_import wescamp-path campaign-name wescamp-textdomain > import_script.sh
./import_script.sh
  • Due to changes in the build system and addition of other build systems, a little more has to be done:
    • copy over po/wesnoth/CMakeLists.txt to po/TEXTDOMAIN/CMakeLists.txt, and add it to svn
    • add the textdomain to po/CMakeLists.txt in the set TRANSLATION_DIRS
    • commit the changed files

If the campaign was not in wescamp before, some other commands will have to be used. Basically the "base files" (as described a lot above) have to be created. Most will be done via copy&paste. Of course the po files have to be created, too. For this to work (after adding the po/TEXTDOMAIN folder as well as the required hooks in configure.ac and po/Makefile.am) autogen.sh has to be run and a pot-update (using make update-po) has to be done in the new textdomains folder to generate po/TEXTDOMAIN/TEXTDOMAIN.pot. Once the .pot file exists, the po files can easily be created with this command:

cd po/TEXTDOMAIN/
for i in `cat LINGUAS`; do msginit -l $i --no-translator; done

Afterwards add all the new files to svn and commit.
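Before adding the files it can help to confirm that msginit really produced a po file for every language. A minimal sketch (missing_po is a made-up name; it assumes LINGUAS is a whitespace separated list as described above):

```shell
# missing_po DIR: print every langcode from DIR/LINGUAS that has no DIR/<code>.po yet.
# Hypothetical helper; relies on word splitting of the LINGUAS content.
missing_po() {
    for code in $(cat "$1/LINGUAS"); do
        [ -f "$1/$code.po" ] || echo "$code"
    done
}
```

No output from missing_po po/TEXTDOMAIN means every listed language has its file.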

Most campaigns that are added have the wrong textdomain to work with "mainline". It should basically work, but there are ways to make it "better". To do so, use the script utils/change-textdomain. Here are the required commands for switching textdomains:

cd $UPDIR
./utils/change-textdomain campaign-name oldtextdomain newtextdomain

Now edit po/CMakeLists.txt by hand (it did not exist when the script was written) to change the textdomain declaration from the old value to the new one. In general, files in data/campaign-name/ as well as po/ and configure.ac are altered by this script.

After adding all this new stuff to svn the properties for the files (lineendings) as well as ignores for the folders are not set correctly. To change this, run svn propedit svn:ignore po/TEXTDOMAIN and add this list of files:

CMakeCache.txt
CMakeFiles
cmake_install.cmake
Makefile
Makefile.in
Makefile.in.in
POTFILES
stamp-po
remove-potcdate.sed
*.gmo

Besides this, all the files need their svn:eol-style changed. Use these commands to change this:

cd po/TEXTDOMAIN
for i in *.po*; do svn propset svn:eol-style "native" $i; done
svn propset svn:eol-style "native" FINDCFG
svn propset svn:eol-style "native" LINGUAS 
svn propset svn:eol-style "native" Makevars 
svn propset svn:eol-style "native" POTFILES.in 
svn propset svn:eol-style "native" remove-potcdate.sin 
svn propset svn:eol-style "native" CMakeLists.txt

Now the po files for the new campaign should exist and all the stuff be ready for usage. Only thing left is adding the textdomain to gettext.wesnoth.org. This requires getting a checkout of branches/resources/gettext.wesnoth.org/public_html/wesnoth-gettext/westats/. Make sure to add the textdomain to config.php in this checkout in the respective $packages = part, commit and ping someone with ssh access to wesnoth.org to run svn up as the user wesnoth inside $HOME/SOURCE/gettext.wesnoth.org to have the website updated, too. With the next run of the update (done every 30mins) the stats for the new domain should appear on the website.

translation related utils

change-textdomain

This script is meant for several purposes and run from the main checkout dir using ./utils/change-textdomain PARAMS. The result depends on PARAMS:

  • -t: check all .cfg files in data/ for a textdomain declaration. If there is none, add #textdomain wesnoth at the top of the files
  • campaign-name oldtextdomain newtextdomain: switches the textdomain for a specific campaign (the name is as in the foldername of data/campaigns/FOLDERNAME). This includes adding all textdomain related things in the campaigns folder as well as moving the folder for the old textdomain to the new textdomain in po/. Many of the files are adjusted, atm only po/CMakeLists.txt has to be adjusted by hand.
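To see beforehand which files the -t mode would touch, the .cfg files without a declaration can be listed. This is a read-only sketch that does not call change-textdomain: cfgs_without_textdomain is a made-up name, and it assumes the declaration starts at the beginning of a line, which may be stricter or looser than what the real script checks.

```shell
# cfgs_without_textdomain DIR: list .cfg files below DIR with no "#textdomain" line.
# Hypothetical helper; assumes the declaration starts at the beginning of a line.
cfgs_without_textdomain() {
    find "$1" -name '*.cfg' -type f | while IFS= read -r f; do
        grep -q '^#textdomain' "$f" || echo "$f"
    done
}
```

Running cfgs_without_textdomain data/ from the checkout root prints one path per line.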

po2po

A program to merge strings between various textdomains. In general using this script is *dangerous* since plural forms are *not* handled well.

pofix.py

A program to fix tiny spelling mistakes over all textdomains. Ask esr for details (like how to add fixes and such).

sanity-check

Checks if the src files listed in po/wesnoth*/POTFILES.in exist and if all files are listed. For this only the textdomains wesnoth, wesnoth-editor and wesnoth-lib are considered since the other files should be WML only.
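The first half of that check (do the listed files exist?) can be sketched in a few lines of shell. This is an illustration, not the sanity-check script itself: missing_sources is a made-up name and the second half, finding source files that are listed nowhere, is not covered.

```shell
# missing_sources CHECKOUT POTFILES: print entries of a POTFILES.in file
# that do not exist below CHECKOUT. Hypothetical helper.
missing_sources() {
    while IFS= read -r path; do
        case $path in ''|'#'*) continue ;; esac   # skip blank lines and comments
        [ -f "$1/$path" ] || echo "$path"
    done < "$2"
}
```

For example, missing_sources . po/wesnoth-lib/POTFILES.in run from the checkout root prints nothing when all entries are fine.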

wescamp_import

Import the translations of a campaign from wescamp. This requires a wescamp checkout. How this script is used is described further above under adding translation files for a new campaign.

wmlxgettext

This script is used to extract translatable strings from WML files. The file in utils/ is written in perl, and esr is currently working on a replacement written in python. This is available as data/tools/wmlxgettext and requires some more testing. For usage info, ask esr since he probably knows best how to handle wmlxgettext.

Handling Localized Images

Translators may localize any in-game image they want. These should mostly be images with some English text on them, like main logo, maps, and in-game help screenshots. Localized images come in two varieties:

  • standalone localized images -- localized images which totally replace the original image, when the modified area of the image is great enough (e.g. logo);
  • localized overlay images -- images mostly transparent but for some isolated spots, which are composed with the original at runtime, when the modified area is too small to waste space using a standalone localized image (e.g. maps).

For translators' view of the process, see the separate article.

Receiving New Images

When a translator sends in a localized image, the maintainer should locate its original counterpart, and place the localized image into the l10n/LANG subdirectory of the original image's directory (where LANG is the appropriate language code). E.g. if the original image path is:

foo/bar/image.png

then the standalone localized image will have the same name, and be placed as:

foo/bar/l10n/LANG/image.png

and the overlay image will have --overlay suffix (note two hyphens):

foo/bar/l10n/LANG/image--overlay.png
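The placement rule can be written down as a small path computation. The function name l10n_path and its optional third argument are made up for this sketch; it only prints the target path and does not move any file.

```shell
# l10n_path ORIGINAL LANG [overlay]: print where a localized image belongs.
# Hypothetical helper illustrating the naming rule above.
l10n_path() {
    dir=$(dirname "$1")
    file=$(basename "$1")
    if [ "${3:-}" = overlay ]; then
        # insert "--overlay" (two hyphens) before the file extension
        file="${file%.*}--overlay.${file##*.}"
    fi
    echo "$dir/l10n/$2/$file"
}
```

With LANG being sv, l10n_path foo/bar/image.png sv gives foo/bar/l10n/sv/image.png, and adding overlay as third argument gives foo/bar/l10n/sv/image--overlay.png.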

Then, to register the new localized image with the tracking system (see below), the maintainer executes the tracker script in the branch root directory:

$ lbundle-check.py
A         foo/bar/l10n/LANG
A  (bin)  foo/bar/l10n/LANG/image.png
M         l10n-track
--------------------
New 'ok': 1
  ./foo/bar/l10n/LANG/image.png (./foo/bar/image.png 30534)
$

The tracker script will add any new images (and their directories) to version control, and add an entry for this image to the global tracking file l10n-track:

ok        ¦foo/bar/l10n/LANG/image.png¦  5c3e43227fd07391faf441450d623543  30534

Then the maintainer just commits everything (including l10n-track).

Receiving Modified Images

Everything is done as when a new image is received, except that prior to running the tracker script, the line of the image in l10n-track should be manually removed. This will make the tracker script register the modified image as matching the current original, with ok status (it may have been fuzzy before).

In fact, to remove the entry line from l10n-track is strictly necessary only if the old status is fuzzy. But it's good to do it even if status was ok, to get expected feedback when the tracker script is run, signaling proper placement of received image.

Tracking Synchronization to Original

From time to time the original image will have been moved, removed, or modified, and localized images need to follow this change. To detect such conditions, the maintainer periodically (e.g. sufficiently before the impending release) runs the tracker script in root directory of the branch, without any arguments, e.g.:

$ cd $REPOSITORY/trunk/
$ lbundle-check.py

If all localized and original images are in sync, there will be no output.

If an original image was modified, the tracker script will notify about it in output, and change the status of corresponding localized images in l10n-track to fuzzy:

fuzzy     ¦foo/bar/l10n/LANG/image.png¦  5c3e43227fd07391faf441450d623543  30534

The third and fourth fields here are the checksum and revision ID of the old original, to which the localized image was synced. If only some images were fuzzied, the maintainer can just commit the l10n-track file and be done with it. (The old original revision ID is what translators can use later to compare the changes in the original, modify the localized image accordingly, and send in the updated version.) Alternatively, if in a good mood, the maintainer could also check in the repository log whether the original perhaps just got compressed, i.e. there were no visible changes, and manually unfuzzy all the localized images (by removing their lines from l10n-track and rerunning the tracker).

If an original image got moved or removed, the status of corresponding localized images will be set to obsolete. Then the maintainer should examine what happened to the original image: if it got moved, move localized images accordingly, and if it got removed, remove localized images as well. Then, remove obsolete entries from l10n-track, rerun the tracker, and commit moves/removals and modified l10n-track.
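For a quick overview of how much work is pending, the entries in l10n-track can be counted per status. This sketch relies only on the status being the first whitespace separated field of each line, as in the examples above; track_summary is a made-up name and the tracker script itself is not involved.

```shell
# track_summary FILE: count l10n-track entries per status.
# Hypothetical helper; only the first field of each line is inspected.
track_summary() {
    awk '{ n[$1]++ }
         END { printf "ok=%d fuzzy=%d obsolete=%d\n", n["ok"], n["fuzzy"], n["obsolete"] }' "$1"
}
```

Run as track_summary l10n-track in the branch root, it prints the three counts on one line.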

The setup for the tracker script is in the l10n-spec file in the root directory, next to the l10n-track file. It has very few and self-explanatory settings in the form of key=value pairs. Only the first of those settings needs to be periodically updated:

 # List of language codes that have at least one localized file.
 languages = sv fr ...

as languages send in their first localized images (alternatively, just mirror the contents of po/LINGUAS as the value).

This page was last modified on 23 June 2014, at 14:28.