Difference between revisions of "GettextForWesnothDevelopers"
(→General design of gettext use: Drop obsolete stuff about moving to gettext, which happened before Wesnoth 1.0)
(Reorganise whole text, and add details about generating .po files for UMC)
Revision as of 02:18, 18 May 2019
This page helps Wesnoth developers and authors of user-made content (UMC) work with the internationalization (i18n) system, which is based on GNU gettext.
General design of gettext use
Gettextized programs usually contain the English strings within the source code, with calls like printf (_("Hello world."));, so that the binary can work (in English) when the system does not support i18n.
Some strings look the same in English but should not necessarily look identical in translations. To handle this, those strings can be prefixed with any descriptive string and a ^ character.
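As a rough illustration of the prefix convention, the sketch below shows what a fallback could look like when no translation is found: the descriptive prefix and the ^ are dropped so that only the visible English text remains. The function name strip_context_prefix and the sample string "female^Archer" are made up for this example, and splitting at the first ^ is an assumption; this is not Wesnoth's actual implementation.

```cpp
#include <string>

// Hypothetical helper (not part of Wesnoth's API): returns the text to show
// for an untranslated msgid by dropping an optional "context^" prefix.
// Assumption: the prefix ends at the first '^' in the string.
std::string strip_context_prefix(const std::string& msgid)
{
    const std::string::size_type caret = msgid.find('^');
    if(caret == std::string::npos) {
        return msgid; // no prefix, show the string unchanged
    }
    return msgid.substr(caret + 1); // keep only the part after the '^'
}
```

With this sketch, "female^Archer" and "male^Archer" are two distinct entries for translators, but both would be displayed as "Archer" in English.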
Textdomains
Gettext splits translations into domains. For Wesnoth, the general idea is to use distinct textdomains for each campaign or add-on, so that UMC authors can easily ship translations together with their campaigns. These domains are covered in more depth in GettextForTranslators.
The convention is to name each domain using the name of the add-on, or just its initials. For example, wesnoth-utbs or wesnoth-Son_of_Haldric. For UMC, it probably makes sense to use the full name to ensure that it doesn't clash with another add-on.
UTF-8
For translation, all C++, WML and Lua files should be in UTF-8. As noted in the Typography_Style_Guide, some of the recommended punctuation is outside the ASCII subset.
Marking up strings in C++
In C++, you can mark up strings for translation using the _("A translation") and _n("Translation", "Translations", int) macros. The _n macro is to be used if the string has a singular and plural form.
If the string contains any placeholders, do not use snprintf. Use vgettext instead, or vngettext when the string has singular and plural forms that depend on an int.
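The reason for preferring named placeholders can be sketched with a small stand-alone function: because the values are looked up by name after translation, a translator is free to reorder the placeholders. The function interpolate below is a hypothetical stand-in written for this page, not the real vgettext, and the reordered sample string in the note after it is invented.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the substitution step: replaces each $name in
// `text` with symbols[name]. Unknown names are left as written.
std::string interpolate(const std::string& text,
                        const std::map<std::string, std::string>& symbols)
{
    std::string out;
    std::size_t i = 0;
    while(i < text.size()) {
        if(text[i] != '$') {
            out += text[i++];
            continue;
        }
        // Collect the placeholder name: letters, digits and underscores.
        std::size_t end = i + 1;
        while(end < text.size()
              && (std::isalnum(static_cast<unsigned char>(text[end])) || text[end] == '_')) {
            ++end;
        }
        const auto it = symbols.find(text.substr(i + 1, end - i - 1));
        if(it != symbols.end()) {
            out += it->second;
        } else {
            out.append(text, i, end - i); // keep "$name" untouched
        }
        i = end;
    }
    return out;
}
```

Given the values handfuls=2 and taste=yummy, both "$handfuls handfuls of $taste potatoes" and a reordered string such as "potatoes ($taste): $handfuls" are filled in correctly, which positional snprintf arguments cannot guarantee.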
You can also add comments for translators directly above the string, using the keyword TRANSLATORS: for that. The comment must be placed on the line immediately above the translatable string, like this:
int handfuls = 2;
const std::string translated_text = vngettext(
    // TRANSLATORS: Yum!
    "$handfuls handful of $taste potatoes",
    "$handfuls handfuls of $taste potatoes",
    handfuls,
    utils::string_map({ {"handfuls", handfuls}, {"taste", "yummy"} }));
The following code will not work for including the comment:
int handfuls = 2;
// TRANSLATORS: Yuck!
const std::string translated_text = vngettext(
    "$handfuls handful of $taste potatoes",
    "$handfuls handfuls of $taste potatoes",
    handfuls,
    utils::string_map({ {"handfuls", handfuls}, {"taste", "yucky"} }));
You can also use multiline comments:
int handfuls = 2;
const std::string translated_text = vngettext(
    /* TRANSLATORS: Yum!
       Best potatoes ever! */
    "$handfuls handful of $taste potatoes",
    "$handfuls handfuls of $taste potatoes",
    handfuls,
    utils::string_map({ {"handfuls", handfuls}, {"taste", "yummy"} }));
Marking up strings in WML
The textdomain declaration
First, your add-on must declare a textdomain. To do this, make sure something like the following is inside of your _main.cfg. This is a top-level tag, so it should be outside the [campaign] or [modification] tag. It should probably start on the second line of the file; the next section of this page says what should be on the first line.
[textdomain]
name="wesnoth-Son_of_Haldric"
path="data/add-ons/Son_of_Haldric/translations"
[/textdomain]
For choosing the name, see the #Textdomains section.
The .po (or .mo) files will be loaded from a subdirectory of the translations directory.
The textdomain bindings
All files with translatable strings must declare which textdomain they use, which is normally done by putting #textdomain on the first line of each WML (.cfg) file. See the example below:
#textdomain wesnoth-Son_of_Haldric
[unit_type]
id=Mu
name= _ "Mu"
# ...
[/unit_type]
Note that it is highly recommended that the first textdomain binding be on the first line of the file. Otherwise, strings that appear before the binding may be looked up in the wrong textdomain.
The translatable strings
To mark a string as translatable, just put an underscore ( _ ) in front of the string you wish to be marked as translatable, like the example below:
name= _ "Mu"
Note that there are certain things you should never do. For example, never mark an empty string as translatable, for wmlxgettext (the tool that extracts strings from WML) will abort upon detecting one. Therefore, what is seen below should never be done:
name= _ ""
Also, never put macro arguments in a translatable string, for it will not work. The reason for this is that the preprocessor does its job before gettext, thus gettext will try to replace a string that does not exist. Therefore, what is shown below should not be done:
name= _ "{TYPE} Mu"
To show why it will not work:
#define UNIT_NAME TYPE
name= _ "{TYPE} Mu"
#enddef
{UNIT_NAME ( _ "Sword")}
{UNIT_NAME ( _ "Bow")}
The translation catalogues would contain the string "{TYPE} Mu", but gettext never looks that string up, because after the preprocessor is done the WML actually contains these:
name= _ "Sword Mu"
name= _ "Bow Mu"
Since those are not in the catalogues, they will not get translated.
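The mismatch can be pictured as an exact-match lookup. The sketch below models a catalogue as a plain map from the extracted English string to its translation; the names catalogue and lookup are invented for this illustration and the real gettext machinery is more involved.

```cpp
#include <map>
#include <string>

// Hypothetical model of a translation catalogue: English msgid -> translation.
using catalogue = std::map<std::string, std::string>;

// Returns the translation when the exact msgid is present,
// otherwise falls back to the untranslated msgid.
std::string lookup(const catalogue& cat, const std::string& msgid)
{
    const auto it = cat.find(msgid);
    return it == cat.end() ? msgid : it->second;
}
```

If the catalogue only has an entry for "{TYPE} Mu", then looking up "Sword Mu" or "Bow Mu" finds nothing and the English text is shown as is.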
If you think a translatable string needs additional guidance to be translated properly, you can provide a special comment that will be seen by the translators. Just begin the comment with '#po:' above the string in question:
#po: "northern marches" is *not* a typo for "northern marshes" here.
#po: In archaic English, "march" means "border country".
story=_ "The orcs were first sighted from the north marches of the great forest of Wesmere."
Reusing mainline translations
You can reuse translations for strings in mainline domains by using multiple textdomain bindings:
#textdomain wesnoth-Son_of_Haldric
[unit_type]
id=Mu
name= _ "Mu"
# ...
[attack]
id=sword
#textdomain wesnoth-units
description= _ "sword"
# ...
[/attack]
#textdomain wesnoth-Son_of_Haldric
# ...
[/unit_type]
Of course, if you use bindings for multiple textdomains, make sure the right parts of the file are bound to the right domains. Also, never try to use the mainline campaigns’ domains, for there is no guarantee that the mainline campaigns will be available on all setups. So, only use the core domains: wesnoth, wesnoth-editor, wesnoth-lib, wesnoth-help, wesnoth-test, and wesnoth-units.
The gettext helper file
A gettext helper file is a lovely file that makes reusing mainline translations nice and easy, by having all strings that should use a specific textdomain in a single file. It is also more wmllint-friendly.
Here is an example of a gettext helper file. The macro names start with 'SOH_' to ensure that they don't clash with another add-on's macros (assuming that this add-on is Son_of_Haldric).
#textdomain wesnoth-lib
#define SOH_STR_ICE
_"Ice" #enddef
#textdomain wesnoth-units
#define SOH_STR_SWORD
_"sword" #enddef
A typical name for gettext helper files is mainline-strings.cfg.
To use it, just wire it into your add-on and use the macros:
[attack]
id=sword
name={SOH_STR_SWORD}
# ...
[/attack]
[terrain_type]
id=ice2
name={SOH_STR_ICE}
# ...
[/terrain_type]
Generating the .pot and .po files for UMC
The template (.pot) file contains all of the strings that need to be translated in the .po files, but without the translations.
The .pot is generated from WML and Lua files using a tool called wmlxgettext. On Wesnoth 1.13 and later, it is shipped with Wesnoth itself and can be used from the Maintenance tools GUI. Just go to data/tools, launch GUI.pyw and switch to the wmlxgettext tab.
Pre-1.13 instructions on how to get and use it are in Nobun's forum posting.
Error messages from wmlxgettext
If wmlxgettext reports the error "UTF-8 Format error. Can't decode byte 0x91 (invalid start byte).", and the line in question has a curly quotation mark, that likely means that your text editor is using the Windows-1252 character set. You need to replace the Windows quotes with their Unicode equivalents; see Typography_Style_Guide and your editor's documentation for more info. The same applies if the error message says 0x92, 0x93 or 0x94.
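For reference, the four bytes map to the curly quotes U+2018, U+2019, U+201C and U+201D. The following is a minimal sketch of such a replacement, assuming the text is otherwise ASCII or already valid UTF-8; the function names are hypothetical, the UTF-8 check is deliberately simplified, and changing the encoding in your editor is usually the easier fix.

```cpp
#include <cstddef>
#include <string>

// Length of the UTF-8 sequence introduced by lead byte c, or 0 if c cannot
// start a sequence (simplified: overlong forms are not rejected).
std::size_t utf8_length(unsigned char c)
{
    if(c < 0x80) return 1;
    if(c >= 0xC2 && c <= 0xDF) return 2;
    if(c >= 0xE0 && c <= 0xEF) return 3;
    if(c >= 0xF0 && c <= 0xF4) return 4;
    return 0;
}

// Hypothetical helper: copies `in`, replacing stray Windows-1252 quote bytes
// (0x91-0x94) with the UTF-8 encoding of the matching Unicode quote.
std::string fix_cp1252_quotes(const std::string& in)
{
    std::string out;
    std::size_t i = 0;
    while(i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        const std::size_t len = utf8_length(c);
        bool valid = len != 0 && i + len <= in.size();
        for(std::size_t k = 1; valid && k < len; ++k) {
            // Continuation bytes must look like 10xxxxxx.
            valid = (static_cast<unsigned char>(in[i + k]) & 0xC0) == 0x80;
        }
        if(valid) {
            out.append(in, i, len); // well-formed sequence, copy unchanged
            i += len;
            continue;
        }
        switch(c) {
            case 0x91: out += "\xE2\x80\x98"; break; // U+2018 left single quote
            case 0x92: out += "\xE2\x80\x99"; break; // U+2019 right single quote
            case 0x93: out += "\xE2\x80\x9C"; break; // U+201C left double quote
            case 0x94: out += "\xE2\x80\x9D"; break; // U+201D right double quote
            default:   out += in[i];          break; // other stray bytes are kept
        }
        ++i;
    }
    return out;
}
```

Bytes 0x91 to 0x94 that are part of a well-formed UTF-8 sequence are left alone, so running the sketch on a file that is already correct should not change it.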
Generating the .po files for each language
Continuing with the Son_of_Haldric example, the Swedish translation would be in the file data/add-ons/Son_of_Haldric/translations/wesnoth-Son_of_Haldric/sv.po. Wesnoth 1.14 (but not 1.12) supports reading .po files directly, so this translation should appear as soon as you refresh the cache.
Each .po file can start as a simple copy of the .pot file. Either the author or the translator makes a copy for each language, and then the work of GettextForTranslators happens on those copies.
Some .po editors, for example poedit, will recognise that the .pot is a template, and automatically suggest saving to a different filename. The poedit editor can also update a .po file based on changes to the .pot file.
Generating the .mo files for UMC
For Wesnoth 1.14, it's generally not necessary to compile the .po files to .mo files. The mainline translations still use .mo files for better performance, but UMC authors can skip the .mo compilation stage.
Possibly obsolete parts of this page
How to move strings from one textdomain to another
Warning: this section is very outdated (but I'd like someone else to comment on whether it's still useful).
- run make -C po update-po and commit, to be sure to only commit your own changes
- move the file into the correct po/*/POTFILES.in
- add or change #define GETTEXT_DOMAIN "wesnoth-lib" at top of the file, before the includes
- update the target POT file to include the new strings in its template (eg. make -C po/wesnoth-editor wesnoth-editor.pot-update)
- copy the translations using utils/po2po (eg. ./utils/po2po wesnoth wesnoth-editor)
- update the source POT file to get rid of the old strings (eg. make -C po/wesnoth update-po), then preferably remove the translation from obsolete strings in all languages (to make sure, in case the strings have to move back, that any translation update gets used instead of the current one)
- check cvs diff and commit
How to prepare translation updates for being committed
To ensure that the diffs the version-control system generates are usable and that the po files are actually compilable, it is recommended that i18n and translation managers follow these steps before committing translation updates.
Note that this guide assumes that you are using a Unix-like system and the CMake build system.
1. Run dos2unix on all of the updated po files.
2. Run "make po-update-<locale>" to ensure the updated po files are in sync with their corresponding pot files and to fix the line wrapping.
3. Run "make mo-update-<locale>" to ensure that the updated po files are compilable.
How to update the translation catalogs
Running a .pot update with CMake is documented in Releasing Wesnoth -> General Maintenance.
If you are using scons, run scons pot-update instead.