Difference between revisions of "GettextForTranslators"

From The Battle for Wesnoth Wiki

Revision as of 03:12, 22 February 2024

Gettext for translators

For the engine and mainline campaigns

The target audience of this page is anyone who wants to help with a language that's already in the list on WesnothTranslations. The files for these languages have already been set up, someone is already the maintainer, and the pages linked from that page say where to go and how to introduce yourself to the team. The effort is split with separate teams for each language, and each team can have its own working style.

Because each team can work in different ways, please do not submit translations as pull requests on GitHub. While that seems like a good idea, it causes potential complications, so we ask everyone to talk to their language's maintainer and submit changes in whichever way the maintainer has chosen.

If you're starting a completely new translation, or taking over as the maintainer of a translation, then the next place to read would be the WesnothTranslationsHowTo page.

For add-ons

The target audience is anyone who wants to help translate an add-on, assuming the add-on's maintainer has already set it up for translation. Translations are shipped with the add-on, so it's up to the maintainer to choose how to receive the files and organise the workflow.

To translate an add-on, skip to the Files for add-ons section.

If you are the maintainer of the add-on, the instructions for setting up the translation are on GettextForWesnothDevelopers.

Textdomains and getting the files to translate

The progress for each language is shown on https://www.wesnoth.org/gettext/ . Click on your language, and you'll see a breakdown into sections (textdomains), such as

  • wesnoth
  • wesnoth-editor
  • wesnoth-help
  • wesnoth-units
  • wesnoth-lib (contains strings shared by game and editor)
  • wesnoth-httt (the Heir to the Throne campaign)
  • wesnoth-utbs (the Under the Burning Suns campaign)
  • etc

Each has a separate file with a .po extension. For example, the Swedish translation has abbreviation sv, and its translation of the editor's strings is in wesnoth-editor/sv.po. The page on https://www.wesnoth.org/gettext/?view=langs&version=branch&lang=sv links to the current version in the main Git repository.

Although the .po files contain text, please send the complete files as email attachments (or whichever method your team uses), rather than cutting and pasting lines from the file into an email. The translated strings are very sensitive to formatting and whitespace changes.

Files when running the game

When the game runs it will look for an .mo file, for example translations/sv/LC_MESSAGES/wesnoth-editor.mo. If you want to test your text in-game and you're happy to modify your installation:

  • Some .po editors can automatically generate .mo files.
  • Deleting the .mo file will make the game look for translations/wesnoth-editor/sv.po instead.

However, you can also send your untested .po file to the maintainer, who should check that it looks correct in-game.

Files for add-ons

This assumes the add-on has .po files rather than .mo files. The engine supports both, but only .po are editable. For example, an add-on called Son of Haldric that has a Swedish translation would likely store it in:

  • data/add-ons/Son_of_Haldric/translations/wesnoth-Son_of_Haldric/sv.po .

That comes from:

  • data/add-ons/Son_of_Haldric is where all files from the add-on are stored
  • translations/wesnoth-Son_of_Haldric is configured by the maintainer
  • sv is the language code for Swedish. The codes for each language are given in the big table on https://www.wesnoth.org/gettext/ .
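As a rough illustration, the example path can be assembled from those three parts. The helper name `addon_po_path` and its parameters are hypothetical, and the textdomain directory is whatever the add-on's maintainer configured; this only mirrors the Son of Haldric example.

```python
from pathlib import PurePosixPath


def addon_po_path(addon_dir: str, textdomain: str, lang_code: str) -> PurePosixPath:
    """Build the likely location of an add-on's .po file (hypothetical helper)."""
    # data/add-ons/<add-on directory>/translations/<textdomain>/<language code>.po
    return (
        PurePosixPath("data/add-ons")
        / addon_dir
        / "translations"
        / textdomain
        / f"{lang_code}.po"
    )


print(addon_po_path("Son_of_Haldric", "wesnoth-Son_of_Haldric", "sv"))
# data/add-ons/Son_of_Haldric/translations/wesnoth-Son_of_Haldric/sv.po
```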

If it's been set up for translation but hasn't yet been translated into Swedish, it may instead have a file called wesnoth-Son_of_Haldric.pot. This is a template which you can copy to "sv.po"; however, first try opening the .pot file directly in your translation tool, as it will likely prompt you to create a translation from the template.

After editing the file, either refresh the cache (press F5 on the title screen) or restart Wesnoth to see the changes.

Warning: files in the add-on directory will be overwritten or deleted if it's updated using the in-game Add-on Manager, so keep backups of files in a separate directory.

How to translate

Now that you have the .po file to edit, it can be worked on using any of the programs listed in the Tools section. The general preference in Wesnoth seems to be towards Poedit, but they all work on the same .po file format.

The general concept is that the GUI will show the strings in English, along with a text-box to add the translation.

Hints

In addition to the English text, most strings have some additional information which in Poedit will be shown in the bottom-right of the screen. For example, the tutorial's "... this quintain!" has "[message]: speaker=Delfador", meaning that it's said by the unit with id "Delfador". Sometimes the hint is less obvious: lines will often have "[message]: speaker=unit", which normally means the unit whose move, attack, or death has triggered an event.

Some strings will have additional hints, such as "Addressing Konrad" or "Addressing Li'sar". The level of detail varies a lot between campaigns, so please don't be afraid to point out when something needs to be improved.

The .po files also say which source file each string came from. In Poedit 3.4 this is found in a right-click menu (right-click on a string in the top-left pane of the GUI).

Carets

There are ambiguous strings which should be translated in a different way depending on where they appear. For example, we have "General" in the preferences as "General preferences" and we can also have "a General". These strings can have different translations for a given language, so we use a "context" prefix to solve this. The prefix only gives a hint about the string, and should not be translated, for example:

msgid "Prefs section^General"
msgstr "General"

As another example, these lines in the tutorial both have the hint "[message]: speaker=student"; they're spoken by Konrad and Li'sar respectively:

  • "A quintain? You want me to fight a dummy?"
  • "female^A quintain? You want me to fight a dummy?"

When an English string has a caret (the ^ symbol) in it, everything up to and including the first caret is removed before showing it to the player. In the translated strings, leave out both the prefix and the caret.

These caret hints are very commonly used in Wesnoth for strings where the translation may depend on the gender of the speaker, or the person spoken to. They're also used for strings such as "Prefs section^General", where the string is the label on the "general preferences" tab, not a military rank.

If you forget to strip out the caret hint, it will be displayed to the user unless the entire string is identical to the source text. This means that it should be stripped out even if the text is otherwise identical, to avoid breaking it if it's later determined that the text needs to be changed. For example:

# wrong but seems to work (shows as "Root")
msgid "filesystem_path_system^Root"
msgstr "filesystem_path_system^Root"
# wrong and shows as "filesystem_path_system^Wurzel"
msgid "filesystem_path_system^Root"
msgstr "filesystem_path_system^Wurzel"
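The behaviour described above can be modelled in a few lines. This is a simplified sketch of the rules as stated on this page, not the engine's actual code; the function names `displayed_text` and `has_leftover_hint` are hypothetical.

```python
def displayed_text(msgid: str, msgstr: str) -> str:
    """Approximate what the player sees for one .po entry."""
    if msgstr == "" or msgstr == msgid:
        # Untranslated, or identical to the source: the English text is used
        # and everything up to the first caret is stripped.
        return msgid.split("^", 1)[-1]
    # Any other translation is shown exactly as written, caret hint included.
    return msgstr


def has_leftover_hint(msgid: str, msgstr: str) -> bool:
    """True if the translation still starts with the source's caret hint."""
    prefix, caret, _ = msgid.partition("^")
    return bool(caret) and msgstr.startswith(prefix + "^")


msgid = "filesystem_path_system^Root"
print(displayed_text(msgid, "filesystem_path_system^Root"))    # Root
print(displayed_text(msgid, "filesystem_path_system^Wurzel"))  # filesystem_path_system^Wurzel
print(displayed_text(msgid, "Wurzel"))                         # Wurzel
```

A check like `has_leftover_hint` flags both of the "wrong" entries above, including the one that only seems to work.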

Warnings

Some editors can automatically detect inconsistencies between the English and the translated text, for example if the original ends with a full-stop and the translation ends with an exclamation mark. Poedit defaults to showing these above even the untranslated strings, but they can be false positives; generally someone has already looked at them and decided that the translated text is better as-is.

Fuzzy strings

One of the downsides of Gettext is that spelling and grammar corrections in the English text break the link between the original and the translated text. The tools that generate .po files try to recover from this by using the old translation for the new English text, and marking the string as fuzzy; in Poedit these are sorted below the completely untranslated strings, and shown with the "Needs Work" button lit along with a note about what the previous English text was.

Be wary: this mechanism can also generate incorrect suggestions. For example, it may decide that "Landar left $number troops to guard the council" is a spelling correction of "Kalenz left $number troops to guard the council".
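In the .po file itself, a fuzzy entry carries the standard gettext flag line `#, fuzzy`, and the previous English text is kept on a `#| msgid` line. The sketch below lists the fuzzy entries in a small made-up sample; the function name `fuzzy_msgids` and the sample translation are illustrative assumptions, and the parsing is deliberately naive (it assumes entries are separated by blank lines and ignores multi-line strings), so a real .po tool is the better choice for actual work.

```python
from typing import List

# Made-up sample in .po format; the msgstr of the first entry is a stand-in.
SAMPLE_PO = '''\
#, fuzzy
#| msgid "Kalenz left $number troops to guard the council"
msgid "Landar left $number troops to guard the council"
msgstr "(translation written for the Kalenz sentence)"

msgid "Prefs section^General"
msgstr "General"
'''


def fuzzy_msgids(po_text: str) -> List[str]:
    """Return the msgid lines of entries flagged as fuzzy (naive parser)."""
    found = []
    for entry in po_text.split("\n\n"):
        lines = entry.splitlines()
        # Flag lines look like "#, fuzzy" or "#, fuzzy, c-format".
        flags = [
            flag.strip()
            for line in lines
            if line.startswith("#,")
            for flag in line[2:].split(",")
        ]
        if "fuzzy" in flags:
            found.extend(line for line in lines if line.startswith("msgid "))
    return found


for line in fuzzy_msgids(SAMPLE_PO):
    print(line)
```

Each entry it reports needs a human check of the translation against the new English text before the fuzzy flag is cleared.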

Parts to leave untranslated

Some translatable strings mix text to be translated with placeholders that are handled by the engine before displaying the text to the player. The text of the placeholder needs to be left untranslated.

Dollar signs

Most placeholders start with a dollar sign ($), for example, $number or $gold_amount. The details are in SyntaxWML, but for translation the general rules are:

Characters that appear in placeholders are letters, numbers, square brackets, underscores, full stops and question marks. Square brackets always appear in balanced pairs, and often enclose a secondary (nested) placeholder.

  • After a dollar sign, every character in the list above is part of the placeholder.
  • A space ends the placeholder, and the space is also displayed to the player.
  • A vertical bar (|) ends the placeholder, and the vertical bar isn't displayed to the player.
  • If you need to put a full stop or question mark immediately after a placeholder that doesn't end with a vertical bar, add a vertical bar to separate them. Although it's not always necessary, it never hurts to add the vertical bar to be on the safe side.
  • A dollar sign at the end of the string, or directly followed by a vertical bar ($|), is displayed as a dollar sign.
  • Obscure cases should have a hint about the string.
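The rules above can be approximated with a regular expression, which is handy for checking that a translation kept the same placeholders as the English text. This is a sketch under assumptions: the names `placeholders` and `same_placeholders` are hypothetical, "letters" is taken to mean ASCII letters, nested placeholders inside square brackets are not handled, and it applies the rules literally, so it is stricter than the engine about a trailing full stop.

```python
import re
from typing import List

# A dollar sign, then one or more placeholder characters (letters, numbers,
# square brackets, underscores, full stops, question marks), then an optional
# vertical bar that ends the placeholder and is not displayed.
PLACEHOLDER = re.compile(r"\$([A-Za-z0-9_\[\].?]+)\|?")


def placeholders(text: str) -> List[str]:
    """List placeholder names in order of appearance (simplified rules)."""
    return PLACEHOLDER.findall(text)


def same_placeholders(msgid: str, msgstr: str) -> bool:
    """True if both strings use the same placeholders, in any order."""
    return sorted(placeholders(msgid)) == sorted(placeholders(msgstr))


print(placeholders("Costs $gold_amount gold"))  # ['gold_amount']
print(placeholders("You have $gold_amount|."))  # ['gold_amount']
print(placeholders("You have $gold_amount."))   # ['gold_amount.']
print(placeholders("Price: 5$"))                # []
```

The third call shows why the vertical bar is the safe choice: without it, the full stop is read as part of the placeholder name under the literal rules.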

Markup

Some strings contain formatting markup. This comes in two styles. The most common is Pango markup, which is an HTML-like formatting language. The other is the custom help page markup, which looks superficially similar but is rather different, and is documented here.

Like HTML, help markup consists of tags enclosed in angle brackets (eg <italic>), with a slash added to the closing tag. The difference lies in what comes between the tags, which takes a key=value format where the value is usually enclosed in single quotes. There can be multiple key=value pairs, separated by spaces. Generally, all of that should be left untouched, with the exception of the text key. If you see something like <italic>text='some text'</italic>, the only part that should be translated is "some text". Or if you see <ref>dst='movement' text='zones of control'</ref>, then only "zones of control" should be translated, and the rest should be left untouched.

If you need to include a single quote within text='...', you can add a backslash before it. However, most of the time it's preferred to use typographic punctuation instead.
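To see which parts of a help-markup string are translatable, the text values can be pulled out with a regular expression. This is a sketch based only on the syntax as described on this page, not on the engine's parser; the name `translatable_parts` is hypothetical.

```python
import re
from typing import List

# Match text='...' where the value may contain backslash-escaped characters,
# such as \' for a literal single quote.
TEXT_VALUE = re.compile(r"\btext='((?:\\.|[^'\\])*)'")


def translatable_parts(markup: str) -> List[str]:
    """Return the raw values of every text key in a help-markup string."""
    return TEXT_VALUE.findall(markup)


print(translatable_parts("<italic>text='some text'</italic>"))
# ['some text']
print(translatable_parts("<ref>dst='movement' text='zones of control'</ref>"))
# ['zones of control']
```

Values of other keys, such as dst, are not returned, matching the advice to leave them untouched. A value containing an escaped quote is returned with its backslash still in place.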

FAQ

  • What are "Plural-forms"?
    • Some languages have different word forms for different numbers of things (for example in English we have "1 thing" but "2 things"). The rules are different for different languages. You can find them here.
  • Who can I ask for further information?
    • You can ask in Discord or IRC. Ping Ivanovic in Discord's #development or IRC's #wesnoth-dev channel. If you don't like IRC, send an email to crazy-ivanovic AT gmx DOT net, or PM him (ivanovic) at the forum.
  • Why is the diff from the previous version so huge? I have only made a small change to the .po file with Poedit.
    • When saving a .po file, Poedit unwraps all strings. Usually, all .po files are wrapped at 80 characters, so if you want smaller diffs and fewer merge conflicts you can execute the following commands each time after editing with Poedit:
  msgattrib file.po > file.po1
  mv file.po1 file.po

Tools

There are several tools to work with .po files:

Of course, you can edit .po files with any UTF-8 capable text editor, but the tools listed above have great advantages over a plain text editor for .po translation, such as jumping to the next fuzzy or untranslated string and searching only in specific fields (msgid, msgstr, comment).

See Also