Difference between revisions of "Translation Maintenance Commands"
Revision as of 15:48, 13 April 2009
Foreword
This page documents the commands currently used for updating translation files that are sent in. It also lists the steps required to add new campaigns (importing from wescamp and changing textdomains) as well as new languages.
This version is based on what was required as of 1.6.1. It assumes autotools/configure usage at the moment but should be easily adaptable to cmake and scons, too.
Prerequisites
- svn checkout of Wesnoth with commit privs (eg svn+ssh://USERNAME@svn.gna.org/svn/wesnoth/branches/1.6/ )
- all the autotools foo: as required for building with autotools, too; basically autoconf and automake; when relying on a different build system cmake or scons are required
- gettext: required for the game anyway, but some gettext tools are used like msg*whatever* (don't know exactly which, magically done by the buildsystem...)
- po4a: required for documentation generation like manual and manpages, not for normal updates
- docbook: required to generate the manual, not for normal updates; not sure which parts of docbook are needed, these are the ones I currently have installed on my gentoo box:
- app-text/build-docbook-catalog-1.4
- app-text/docbook-xml-dtd-4.2-r2
- app-text/docbook-xml-dtd-4.4-r1
- app-text/docbook-xml-dtd-4.5
- app-text/docbook-xsl-stylesheets-1.74.0
 
- a terminal, preferably with bash available; the commands posted here work nicely with bash and are easily put into a shell script to automate things.
After getting the svn checkout, run ./autogen.sh in the root dir of the checkout to make sure that all the links and make targets are created. This requires a complete and working build environment.
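Before running ./autogen.sh it can save time to check that the tools from the list above are on the PATH. The following is a minimal sketch, not part of the Wesnoth tree: check_tools is a hypothetical helper, and the tool names in the usage comment (msgmerge, msgfmt) are an assumption about which gettext programs the build system ends up calling.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero
# if at least one tool was not found.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}

# Assumed tool list for an autotools based setup; adjust for cmake or scons:
# check_tools svn autoconf automake msgmerge msgfmt po4a
```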
Folder structure and files relevant for translations
Stuff relevant for translations is basically in po/. In there all the translation catalogs as well as the translation files and some "helper files" are placed.
In po/ each subfolder stands for a different textdomain:
- the folder wesnoth/ stands for the textdomain wesnoth (includes many basic things)
- the folder wesnoth-utbs/ stands for the textdomain wesnoth-utbs and includes all the strings only relevant for the campaign Under the Burning Suns
- ...
In each folder several files are available:
TEXTDOMAIN.pot
Catalog with all the original strings. It includes no translated strings; a copy of this file is basically what the gettext tools use to start a new translation. This is the file that is updated in a so-called pot-update. All po files are merged against this catalog when they are updated, so only the English strings in this file determine what has to be in the po files as "original string". The file is named after the folder it is in.
*.po
These are the "real" translation files. They are named after the language they belong to, based on its "langcode". E.g. the file for the German translation is named de.po. Translators work with these files, and the .mo files are created from them when "compiling". The files basically consist of a header, an area of "active strings" (starting right after the header) and an area for "old and now unused" strings (commented out using lines starting with #~). All lines starting with a # are comments, though there are various kinds of comments. These are defined by gettext; here is a short explanation of them and their syntax. The beginning of the line defines the type:
- #. My Text: My Text is some automatically extracted comment from the sources (translation hints, status stuff, all this foo).
- #, fuzzy: This string was seen as "reasonably close but not identical" when the file was updated. Translators should check this string and remove this line if the translation is correct. Unless this is done, the string is not shown translated in the program. The line consists of nothing but #, fuzzy.
- #: path/to/file:1: File reference. This gives a hint on where in the files this string was extracted from. Basically a filename:linenumber syntax is used. Only one file per line is listed, but several of those lines can be listed per string.
- # My comment: My comment is a "Usercreated comment". Basically a comment the translator added. Anything the translator wants can be in here...
- #~: Stuff following after #~ is not active in the corresponding .pot file. These are mainly old strings that were removed and are kept in case they come back into the game.
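The comment markers above make it easy to get a rough overview of a po file with plain grep. This is only a sketch: po_stats is a hypothetical helper name, and it simply counts lines starting with "#, fuzzy" and obsolete "#~ msgid" lines without parsing the file properly.

```shell
#!/bin/sh
# Count fuzzy and obsolete entries in a .po file.
po_stats() {
    po_file=$1
    # Lines starting with "#, fuzzy" mark entries that need review.
    # grep -c exits non-zero on a count of 0, hence the "|| true".
    fuzzy=$(grep -c '^#, fuzzy' "$po_file" || true)
    # Obsolete entries are commented out with "#~"; count their msgid lines.
    obsolete=$(grep -c '^#~ msgid ' "$po_file" || true)
    echo "fuzzy=$fuzzy obsolete=$obsolete"
}

# Usage: po_stats po/wesnoth/de.po
```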
In general, translations in the po file consist of (several of) those comments and two other blocks:
- msgid "Original String": the original string used that the translators have to work on.
- msgstr "Translated Version": the string that is used as replacement when the translation this file belongs to is in use and an occurrence of "Original String" is met (where the original has to be marked as translatable!).
In general each of those strings can span several lines, but to have a newline in the translation (or original) an explicit \n has to be added. The only constant thing is that strings have to be enclosed in " and only the stuff inside the " counts. In addition to those "normal forms" the strings can also exist as plurals. Here is an example from the game and a corresponding German translation ([0] means singular, [1] means first plural form, [2] means 2nd plural form and so on):
#: src/actions.cpp:2485
msgid "Friendly unit sighted"
msgid_plural "$friends friendly units sighted"
msgstr[0] "Verbündete Einheit gesichtet"
msgstr[1] "$friends verbündete Einheiten gesichtet"
One more thing users should be aware of: the translated and original strings have to start *and* end with the same number of \n. A mismatch in those numbers is the most common cause of errors.
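The \n rule can be checked roughly before a file ever reaches the build. The following awk based sketch is an illustration only and far less thorough than the gettext tools: check_newlines is a hypothetical helper, it looks only at single-line msgid/msgstr pairs, and it compares only whether a leading or trailing \n is present, not how many there are. Multi-line strings and plural forms are skipped.

```shell
#!/bin/sh
# Flag entries whose msgid and msgstr disagree about a leading or trailing \n.
# Prints "FILE:LINE: newline mismatch" (LINE is the msgid line) and returns
# non-zero if at least one mismatch was found.
check_newlines() {
    awk '
        # Strip the keyword and the surrounding quotes from a line.
        function body(line, keyword) {
            sub("^" keyword " \"", "", line)
            sub(/"$/, "", line)
            return line
        }
        /^msgid "/  { id = body($0, "msgid"); id_line = NR; next }
        /^msgstr "/ {
            str = body($0, "msgstr")
            # Skip the header, multi-line strings and untranslated entries.
            if (id == "" || str == "") next
            if ((id ~ /^\\n/) != (str ~ /^\\n/) || (id ~ /\\n$/) != (str ~ /\\n$/)) {
                print FILENAME ":" id_line ": newline mismatch"
                bad = 1
            }
        }
        END { exit bad }
    ' "$1"
}

# Usage: check_newlines po/wesnoth/de.po
```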
LINGUAS
This file includes a space separated list of all langcodes supported. As of 1.6.1 this is the current content of LINGUAS (changes when new langs are added or old ones are removed):
af ar bg ca ca_ES@valencia cs da de el en_GB es eo et eu fi fr fur_IT gl he hr hu id is it ja ko la lt lv mk mr nl nb_NO pl pt pt_BR racv ro ru sk sl sr sr@latin sv tl tr zh_CN zh_TW
A LINGUAS file exists in every subfolder of po/. The one right in po/ is used by scons and cmake, the one in each textdomain dir is used by autotools (no guarantee that this is really correct, but it should be...). Basically this is used to have a list of files that have to be compiled and such when building the game.
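Since LINGUAS is just a whitespace separated list, it is easy to cross-check it against the po files that actually exist in a textdomain dir. A minimal sketch, assuming the file contains nothing but langcodes; missing_po is a hypothetical helper name:

```shell
#!/bin/sh
# List langcodes from DIR/LINGUAS that have no matching DIR/LANGCODE.po.
missing_po() {
    dir=$1
    # LINGUAS is a whitespace separated list, so plain word splitting is enough.
    for lang in $(cat "$dir/LINGUAS"); do
        if [ ! -f "$dir/$lang.po" ]; then
            echo "$lang"
        fi
    done
}

# Usage: missing_po po/wesnoth
```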
FINDCFG
This file in each textdomain dir includes the find term used to find all WML files relevant for this domain. Main task is to reduce time when running a pot-update so that not all files are checked, but only the relevant ones. This is mainly used for campaigns to only check the campaigns folder and stuff like this. Example:
find data/core/units -name '*.cfg' -print
find data/campaigns/*/units -name '*.cfg' -print
POTFILES.in
This file in each textdomain dir lists all the C++ files that belong to this domain. It is just a plain list of files and nothing more. Example:
src/config_cache.cpp
src/construct_dialog.cpp
src/filechooser.cpp
src/font.cpp
src/game_preferences.cpp
src/game_preferences_display.cpp
Makevars
This file in each textdomain dir holds some variables used by autotools when generating files. Don't ask what exactly they are for; in general the file does not have to be touched by humans at all. The only thing you should make sure of is to add correct entries for textdomains in the DOMAIN = lines.
CMakeLists.txt
Provides the information cmake needs to generate the required commands. Identical for basically all textdomains.
remove-potcdate.sin
Some strange file available in each textdomain dir. Used by the tools, no idea how it is done and if it is really needed.
Generated files
Basically all the other files not listed here are generated by the build systems (or are specific to the documentation/po4a related stuff described below). That especially includes *.mo or *.gmo files, which contain the "compiled" translations in a form that gettext can directly use ingame.
Documentation/po4a specific stuff
Makefile
po4a based textdomain only (normally autogenerated by autotools)
A hardcoded makefile for po4a stuff. This should eventually be replaced by a generated file from eg cmake. Maybe ettin knows more about this stuff...
TEXTDOMAIN.cfg
po4a based textdomain only
A file with some po4a based config, also duplicating some stuff that normally would be available eg in LINGUAS. This should eventually be changed in some way, too. ATM it is there because, hmm, it is there... Some po4a specific variables are configured here, maybe ettin knows more about this stuff...
manpages
The manpages are placed in doc/man/. Currently the originals are wesnoth.6 and wesnothd.6. Translations are generated from po/wesnoth-manpages/ using po4a and placed in doc/man/LANGCODE/. When new files are generated, they have to be added to svn ( svn add doc/man/*/*.6 ) and also to the autotools based makefile to have them installed (at the top of doc/man/Makefile.am ). Manpages are only created if at least 80% of the strings of the respective manpage are translated.
html manual
The base file for the manual is doc/manual/manual.txt. Using po4a together with docbook, the files doc/manual/manual.LANGCODE.html are generated. After generation, the files have to be added to svn, too ( svn add doc/manual/manual.*.html ). Translations can also come with their own images; those should be placed in doc/manual/images/LANGCODE/ .
pot-update
In a pot-update all the reference files (po/wesnoth*/*.pot) are updated and the respective .po files are updated against those reference files. The commands I use to update all pot files and the corresponding .po files are:
cd $CHECKOUT
svn up
cd po/
make update-po
While doing so look out for any error messages from the gettext tools. Afterwards just commit stuff with a plain "svn ci". The "svn up" is only needed to be sure that all files are up to date (the ones with the original strings as well as the translation files).
regenerating doc files
To update the po files for documentation and generate the corresponding translated manpages and manuals, use these commands:
cd $CHECKOUT
svn up
make update-po4a
Be aware that this requires po4a as well as docbook and takes considerable time. After running this command you should run svn st doc/ to check if any files were added or removed (? and ! markers). Those files have to be added to svn (all with ?) or removed from svn (those with !, meaning that they were removed because they were not complete enough anymore). Just do so and commit afterwards using svn ci doc/ po/ (the po/pot files as well as the doc files can/should change).
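The ? and ! markers from svn st can be turned into the matching svn commands mechanically. The sketch below only prints the commands so they can be reviewed before anything is changed; svn_fixups is a hypothetical helper name, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Turn "svn st" output (read from stdin) into the matching svn commands.
# "?" marks unversioned files (to be added), "!" marks missing files
# (to be removed). All other status lines are ignored.
svn_fixups() {
    while read -r flag path; do
        case $flag in
            '?') echo "svn add $path" ;;
            '!') echo "svn rm $path" ;;
        esac
    done
}

# Usage: svn st doc/ | svn_fixups
```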
It can happen that files contain errors that are only shown when actually trying to regenerate the doc files. Those have to be fixed in the respective .po files to fix the problems. Doing so requires inspection "by hand" to find the real cause of the problem and fix all occurrences (rather often the case for the man pages with their strange syntax for highlighting).
updating po files for committing
When getting po files for committing them you have to ensure some things:
- make sure that the files use the correct line endings (since svn:eol-style is set to "native" for all po files)
- run pofix.py on the files to make sure that the latest "tiny typo fixes" are included in the po file and no unnecessary fuzzy strings are created
- update the files using make LANGCODE.po-update in the textdomain's dir
- check that the files really compile using make LANGCODE.gmo in the textdomain's dir (in general the previous step should already end with an error, but this step shows stats about the file as well as an error if the file is broken)
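For the line ending check in the first step, a quick test for carriage returns tells whether a file needs converting at all. A minimal sketch; has_crlf is a hypothetical helper name:

```shell
#!/bin/sh
# Succeed if FILE contains at least one carriage return (DOS line endings).
has_crlf() {
    # Command substitution only strips trailing newlines, so the CR survives.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Usage: if has_crlf de.po; then dos2unix de.po; fi
```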
I use a rather simple system to handle translation updates. It consists of an extra folder that mirrors the folder structure of po/ and contains nothing but the po files. Let's call this folder $LANGDIR (absolute path!) from now on. It holds one folder per textdomain (wesnoth/, wesnoth-anl/, ...) with all the *.po files for that domain, and serves as an intermediate step for renaming and similar cleanup. Besides this there is the checkout dir, let's call it $UPDIR (absolute path!). When an update arrives, the archive (whatever format it is: .zip, .tar.bz2, .7z or *whatever*) is extracted and the TEXTDOMAIN/ folders are copied over to $LANGDIR. In the worst case files have to be copied over by hand because translators might use some crude naming scheme. Archives that follow the naming guideline only require a cp -r wesnoth* $LANGDIR/.
The next step is changing into the folder, ensuring the line endings (unix is native on my box, thus using dos2unix) and copying the files into the "real" dir. All of this depends on the single langcode you want to update. In a script LANGCODE can be a plain param. When using bash, you can just use this for loop:
cd $LANGDIR
for i in wesnoth* ; do cd $i; dos2unix LANGCODE.po; cp LANGCODE.po $UPDIR/po/$i/LANGCODE.po; cd .. ; done
When not using bash, just copy over the files for each dir by hand.
The next step is to update the files with pofix.py, merge against the latest catalog and build for stats/correctness sake:
cd $UPDIR/po/
for i in wesnoth* ; do cd $i; $UPDIR/utils/pofix.py LANGCODE.po; make LANGCODE.po-update; make LANGCODE.gmo; cd .. ; done
When not using bash, do so by hand or write yourself your own script...
When all of the commands were successful, just commit the files using svn ci, but make sure that the language is already added in changelog as well as players_changelog. With those steps the doc files are *not* regenerated since regenerating doc files takes quite some time. This is only done every now and then for all the doc files (mainly when a pot-update is run, too).
To sum things up, here are all the commands used ($2 is the LANGCODE; commit afterwards if no error occurred and everything is as it should be):
echo "executing update script"
echo "switching to "$LANGDIR
cd $LANGDIR
echo "copying po-files"
for i in wesnoth* ; do cd $i; dos2unix $2.po; cp $2.po $UPDIR/po/$i/$2.po; cd .. ; done
echo "switching to "$UPDIR"/po"
cd $UPDIR/po/
echo "updating po-files"
for i in wesnoth* ; do cd $i; $UPDIR/utils/pofix.py $2.po; make $2.po-update; make $2.gmo; cd .. ; done
echo "update complete"
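The loops above assume that every textdomain folder in $LANGDIR contains a po file for the language, which is not the case when a translator sends only a few domains. One way around this, sketched here with a hypothetical helper named domains_with_po, is to list only the folders that actually contain LANGCODE.po and loop over those:

```shell
#!/bin/sh
# Print the textdomain folders below LANGDIR that contain LANGCODE.po.
domains_with_po() {
    langdir=$1
    langcode=$2
    for dir in "$langdir"/wesnoth*/; do
        # "$dir" ends with a slash, so the file name can be appended directly.
        if [ -f "$dir$langcode.po" ]; then
            basename "$dir"
        fi
    done
}

# Usage: for i in $(domains_with_po "$LANGDIR" de); do echo "would update $i"; done
```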
adding a new Language
For adding a new language several steps have to be done:
- add the LANGCODE to the list in all LINGUAS files (po/LINGUAS as well as po/wesnoth*/LINGUAS )
- edit po/wesnoth-man*/wesnoth-man*.cfg to have the LANGCODE added here, too
- generate the po files for the language using this command:
cd po/
for i in wesnoth*; do cd $i; msginit -l LANGCODE --no-translator; cd ..; done
- make sure the files really compile (basically check if the plural forms and the rest of the header is fine by trying to compile everything normally; a problematic case is eg "Project-Id-Version: PACKAGE VERSION\n" in the textdomain wesnoth)
- add data/languages/langcode_COUNTRYCODE.cfg (should look like the other files, too; required to have the file in the lang selection list ingame)
- add the language to data/core/about.cfg (can be a dummy entry for the moment; this is mainly for other langs to have this string among their translatable strings as soon as possible)
- add all the new files to svn and commit (from the checkouts root):
svn add po/wesnoth*/LANGCODE.po data/languages/langcode_COUNTRYCODE.cfg
svn ci po/wesnoth*/LANGCODE.po po/*/LINGUAS po/LINGUAS data/languages/langcode_COUNTRYCODE.cfg data/core/about.cfg
- update the language list of g.w.o: This requires getting a checkout of branches/resources/gettext.wesnoth.org/public_html/wesnoth-gettext/westats/. Make sure to add the language to langs.php in this checkout, commit and ping someone with ssh access to wesnoth.org to run svn up as the user wesnoth inside $HOME/SOURCE/gettext.wesnoth.org to have the website updated, too.
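The first step, adding the langcode to every LINGUAS file, is tedious by hand. The sketch below appends a langcode only if it is not listed yet. It assumes each LINGUAS file is a single line of space separated langcodes, it does not re-sort the list, and add_lang is a hypothetical helper name.

```shell
#!/bin/sh
# Append LANGCODE to the space separated list in a LINGUAS file unless it is
# already present.
add_lang() {
    linguas=$1
    langcode=$2
    for existing in $(cat "$linguas"); do
        if [ "$existing" = "$langcode" ]; then
            return 0    # already listed, nothing to do
        fi
    done
    # Read the old content first, then rewrite the file with the new code
    # appended at the end.
    current=$(cat "$linguas")
    printf '%s %s\n' "$current" "$langcode" > "$linguas"
}

# Usage (from the checkout root):
# for f in po/LINGUAS po/wesnoth*/LINGUAS; do add_lang "$f" LANGCODE; done
```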
adding translation files for a new campaign
This is basically the case when importing a new campaign from wescamp. The commands/steps used are:
- Make sure you got an up to date checkout of wescamp (search the wiki for info about this).
- cd into the wesnoth checkout dir
- run this command to import the textdomain from the wescamp repo (the resulting script directly wants to commit, if this is not wanted, comment the respective part out in import_script.sh):
./utils/wescamp_import wescamp-path campaign-name wescamp-textdomain > import_script.sh
./import_script.sh
- Due to changes in the build system and addition of other build systems, a little more has to be done:
- copy over po/wesnoth/CMakeLists.txt to po/TEXTDOMAIN/CMakeLists.txt, and add it to svn
- add the textdomain to po/CMakeLists.txt in the set TRANSLATION_DIRS
- commit the changed files
 
If the campaign was not in wescamp before, some other commands will have to be used. Basically the "base files" (as described a lot above) have to be created. Most will be done via copy&paste. Of course the po files have to be created, too. For this to work (after adding the po/TEXTDOMAIN folder as well as the required hooks in configure.ac and po/Makefile.am) autogen.sh has to be run and a pot-update (using make update-po) has to be done in the new textdomains folder to generate po/TEXTDOMAIN/TEXTDOMAIN.pot. Once the .pot file exists, the po files can easily be created with this command:
cd po/TEXTDOMAIN/
for i in `cat LINGUAS`; do msginit -l $i --no-translator; done
Afterwards add all the new files to svn and commit.
Most campaigns that are added have the wrong textdomain to work with "mainline". It should basically work, but there are ways to make it "better". To do so, use the script utils/change-textdomain. Here are the required commands for switching textdomains:
cd $UPDIR
./utils/change-textdomain campaign-name oldtextdomain newtextdomain
Now edit po/CMakeLists.txt by hand (it did not exist when the script was written) to change the textdomain declaration from the old value to the new one. In general, files in data/campaign-name/ as well as po/ and configure.ac are altered by this script.
After adding all this new stuff to svn the properties for the files (lineendings) as well as ignores for the folders are not set correctly. To change this, run svn propedit svn:ignore po/TEXTDOMAIN and add this list of files:
CMakeCache.txt
CMakeFiles
cmake_install.cmake
Makefile
Makefile.in
Makefile.in.in
POTFILES
stamp-po
remove-potcdate.sed
*.gmo
Besides this, all the files need their svn:eol-style changed. Use these commands to change this:
cd po/TEXTDOMAIN
for i in *.po*; do svn propset svn:eol-style "native" $i; done
svn propset svn:eol-style "native" FINDCFG
svn propset svn:eol-style "native" LINGUAS
svn propset svn:eol-style "native" Makevars
svn propset svn:eol-style "native" POTFILES.in
svn propset svn:eol-style "native" remove-potcdate.sin
svn propset svn:eol-style "native" CMakeLists.txt
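Instead of pasting the list into the editor that svn propedit opens, it can be written to a file and handed to svn propset with -F. A sketch under that assumption; write_ignore_list and the temp file path are hypothetical, and the svn call is left as a comment so that nothing is changed by accident:

```shell
#!/bin/sh
# Write the svn:ignore patterns for a textdomain dir to FILE, one per line.
write_ignore_list() {
    cat > "$1" <<'EOF'
CMakeCache.txt
CMakeFiles
cmake_install.cmake
Makefile
Makefile.in
Makefile.in.in
POTFILES
stamp-po
remove-potcdate.sed
*.gmo
EOF
}

# Usage (from the checkout root):
# write_ignore_list /tmp/po-ignore.txt
# svn propset svn:ignore -F /tmp/po-ignore.txt po/TEXTDOMAIN
```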
Now the po files for the new campaign should exist and all the stuff be ready for usage. Only thing left is adding the textdomain to gettext.wesnoth.org. This requires getting a checkout of branches/resources/gettext.wesnoth.org/public_html/wesnoth-gettext/westats/. Make sure to add the textdomain to config.php in this checkout in the respective $packages = part, commit and ping someone with ssh access to wesnoth.org to run svn up as the user wesnoth inside $HOME/SOURCE/gettext.wesnoth.org to have the website updated, too. With the next run of the update (done every 30mins) the stats for the new domain should appear on the website.
change-textdomain
This script is meant for several purposes and run from the main checkout dir using ./utils/change-textdomain PARAMS. The result depends on PARAMS:
- -t: check all .cfg files in data/ for a textdomain declaration. If there is none, add #textdomain wesnoth at the top of the files
- campaign-name oldtextdomain newtextdomain: switches the textdomain for a specific campaign (the name is as in the foldername of data/campaigns/FOLDERNAME). This includes adding all textdomain related things in the campaigns folder as well as moving the folder for the old textdomain to the new textdomain in po/. Many of the files are adjusted, atm only po/CMakeLists.txt has to be adjusted by hand.
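To see in advance which files the -t mode would touch, a read-only listing of .cfg files without any textdomain declaration is enough. This sketch is not how change-textdomain is implemented; missing_textdomain is a hypothetical helper that merely greps for the string #textdomain anywhere in the file.

```shell
#!/bin/sh
# List .cfg files below DIR that contain no "#textdomain" line at all.
missing_textdomain() {
    find "$1" -name '*.cfg' | while IFS= read -r cfg; do
        if ! grep -q '#textdomain' "$cfg"; then
            echo "$cfg"
        fi
    done
}

# Usage: missing_textdomain data/
```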
po2po
A program to merge strings between various textdomains. In general using this script is *dangerous* since plural forms are *not* handled well.
pofix.py
A program to fix tiny spelling mistakes across all textdomains. Ask esr for details (like how to add fixes and such).
sanity-check
Checks if the src files listed in po/wesnoth*/POTFILES.in exist and if all files are listed. For this only the textdomains wesnoth, wesnoth-editor and wesnoth-lib are considered since the other files should be WML only.
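The first half of that check (do the listed files exist?) is simple to reproduce by hand. The sketch below is not the sanity-check script itself: check_potfiles is a hypothetical helper, and it does not cover the second half, i.e. whether every source file is listed in some POTFILES.in.

```shell
#!/bin/sh
# Report files listed in a POTFILES.in that do not exist below ROOT.
# Returns non-zero if at least one listed file is missing.
check_potfiles() {
    root=$1
    potfiles=$2
    rc=0
    while IFS= read -r src; do
        # Skip blank lines and comment lines.
        case $src in
            ''|'#'*) continue ;;
        esac
        if [ ! -f "$root/$src" ]; then
            echo "not found: $src"
            rc=1
        fi
    done < "$potfiles"
    return $rc
}

# Usage (from the checkout root): check_potfiles . po/wesnoth/POTFILES.in
```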
wescamp_import
Import the translations of a campaign from wescamp. This requires a wescamp checkout. How this script is used is described further above under adding translation files for a new campaign.
wmlxgettext
This script is used to extract translatable strings from WML files. The file in utils/ is written in perl, and esr is currently working on a replacement written in python. This is available as data/tools/wmlxgettext and requires some more testing. For usage info, ask esr since he probably knows best how to handle wmlxgettext.