OOS (Out of Sync)

From The Battle for Wesnoth Wiki
Revision as of 04:09, 15 January 2019 by Celtic Minstrel (talk | contribs) (List of Non Mp/Replay safe Wml/Lua functions: Remove duplicated passage from previous edit)

OOS is an error that occurs in a Wesnoth multiplayer scenario or a replay when two machines disagree (or one machine disagrees with a replay file) about the game state, i.e. which units are where, how much HP they have, how much gold each side has, etc. More precisely, OOS is announced when a client reads a command, such as "move Dwarf Fighter 8,4 -> 8,5 -> 8,6" or "recruit Gryphon Rider 9,7", which is illegal or nonsensical based on its understanding of the game state.

OOS could in principle be caused by network issues such as dropped packets, by players having mismatched game files, or even by cheating, but it is commonly caused by scenario designers / add-on makers writing unsafe code. This page is about how to write WML that won't cause OOS. If you want to know what you should do if you get OOS in an online game, see [here].

Why does OOS happen?

Game begins out of sync

At the beginning of an MP game the host sends the content of the [multiplayer] or [scenario] tag to the other clients (via the MP server). All other information, especially [unit]s, [terrain]s and top-level [lua] tags, is not transmitted to the other players. Instead, the other players are expected to have this data available locally, and their data is expected to match the host's. If it does not, this can cause OOS.

This can happen if one of the players has modified their game files, or if the players don't have matching add-on versions.

Even then, the OOS may not appear until a mismatched resource actually shows up in the game and does something.

Game becomes out of sync

During the actual game, every client has its own local gamestate. The clients communicate by sending 'player actions' such as 'move unit at (3,4) to (5,5) with the route ((3,4),(3,5),(4,5),(5,5))'. These actions are evaluated on the other clients, which should lead to the same local gamestate on all clients (assuming that they had the same local gamestate before that action). If at any time the clients don't have the same local gamestate, we call that OOS. The clients usually send these actions as soon as they know that they cannot be undone. Actions that are sent to the other clients are called "synced user actions"; move, recall and recruit are examples. There are also unsynced user actions such as "select", which only happen on one client and are not sent to the other clients. As a consequence, "select" events only run on one client, so changing the gamestate from inside a select event results in OOS.

A complete list of unsynchronized events can be found here: EventWML#Multiplayer_safety.
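The mechanism described above can be illustrated with a small toy model. This is a conceptual sketch in Python, not Wesnoth code: the Client class, its methods and the broadcast helper are all hypothetical names invented for the illustration. It shows two clients staying in sync while they only apply synced actions, and an OOS being detected after one client changes its state from a select-like handler.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    """One client's private copy of the game state (unit name -> hex)."""
    name: str
    units: dict = field(default_factory=lambda: {"Dwarf Fighter": (3, 4)})

    def apply_synced_move(self, unit, route):
        # A synced action is checked against the local state before it is applied.
        if self.units.get(unit) != route[0]:
            raise RuntimeError(f"OOS on {self.name}: {unit} is not at {route[0]}")
        self.units[unit] = route[-1]

    def on_select(self, unit):
        # BAD: a select event runs only on the selecting client, so a state
        # change made here is invisible to every other client.
        self.units[unit] = (9, 9)


def broadcast(clients, unit, route):
    """Deliver one synced action to every client, in order."""
    for client in clients:
        client.apply_synced_move(unit, route)


a, b = Client("A"), Client("B")
broadcast([a, b], "Dwarf Fighter", [(3, 4), (3, 5), (4, 5), (5, 5)])
in_sync_after_move = a.units == b.units

a.on_select("Dwarf Fighter")  # only client A runs this
try:
    broadcast([a, b], "Dwarf Fighter", [(5, 5), (5, 6)])
    oos_message = None
except RuntimeError as err:
    oos_message = str(err)
```

In this model the second move is legal from client B's point of view but nonsensical for client A, which is the situation the OOS error reports.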

The replay works the same way: it begins with the starting gamestate and then evaluates all the synced user actions it finds in the replay. If the resulting gamestate differs from the original gamestate, we have an OOS. That's why replay safety is in most cases the same as multiplayer safety.

Generally speaking a replay file receives the same kind of information about the game that a multiplayer observer of a game would receive, so any technique which would cause OOS for multiplayer will also cause corrupted replays.

If WML/Lua code is invoked by a synced user action and thus runs on all clients, we say "the code runs in a synced context"; otherwise it runs in an unsynced context. (Version 1.13.0 and later only) You can find out whether the current code runs in a synced context by checking wesnoth.current.synced_state.

In order to get the same local gamestate on all clients, you should only make the gamestate depend on deterministic functions that return the same value on all clients. The getter for side.controller and math.random, for example, are not such functions. If you want to call these functions, you should use wesnoth.synchronize_choice, which allows you to run code on one client and then return the result to all clients, so all clients are guaranteed to get the same result. But note that because the code in synchronize_choice only runs on one client, it should only calculate the return value and not change the gamestate.
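The idea behind such a synchronized choice can be sketched as a toy model. This is Python written for illustration only, not the Wesnoth Lua API: the synchronize_choice function below is a hypothetical stand-in, and the dictionaries representing clients are assumptions. One client computes a non-deterministic value, every client receives that same value, and only then does each client change its own state.

```python
import random


def synchronize_choice(clients, chooser_name, choose):
    """Run `choose` on one client only and hand the same result to every client."""
    chooser = next(c for c in clients if c["name"] == chooser_name)
    result = choose(chooser)  # may be non-deterministic; runs exactly once
    return {c["name"]: result for c in clients}  # what each client receives


def roll(client):
    # Only computes a return value; it must not touch the game state.
    return client["rng"].randint(1, 6)


# Each client has its own, differently seeded local generator.
clients = [
    {"name": "A", "rng": random.Random(1), "gold": 100},
    {"name": "B", "rng": random.Random(2), "gold": 100},
]

received = synchronize_choice(clients, "A", roll)
for c in clients:
    c["gold"] += received[c["name"]]  # every client applies the same change
```

Had each client called roll on its own generator instead, the gold totals could differ, which is exactly the kind of divergence that leads to OOS.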

The rng

Wesnoth has a synced random number generator that returns the same value on all clients. This RNG is used by:

  • [set_variable] rand=.. [/set_variable]
  • traits & name generation in [unit]
  • calculation whether an attack hits

To keep the RNG in sync, it is important to call the RNG the same number of times on all clients. Otherwise the random results will be off by one, which causes OOS. Luckily the synced RNG is smart enough to know whether we are in a synced context, and it redirects to an unsynced RNG (like math.random) if it is called from outside a synced context, so that calling this RNG from outside the synced context doesn't affect the RNG used in the synced context.
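The off-by-one effect can be reproduced with any seeded generator. The following is a minimal Python sketch, not Wesnoth code. The shared seed and the run_turn helper are assumptions made for the illustration. Three clients start from the same seed, but one of them draws an extra value before resolving the same three rolls.

```python
import random

SEED = 42  # assumed shared seed, for illustration only


def run_turn(extra_call):
    rng = random.Random(SEED)
    if extra_call:
        rng.randint(1, 100)  # this client consumes a value the others do not
    # All clients now resolve the same three "attack rolls".
    return [rng.randint(1, 100) for _ in range(3)]


client_a = run_turn(extra_call=False)
client_b = run_turn(extra_call=True)
client_c = run_turn(extra_call=False)
```

Clients A and C obtain identical rolls, while client B's rolls are the same sequence shifted by one position, so B disagrees with the others about every outcome from that point on.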

This RNG gets reseeded at the beginning of every "synced user action". The reason is that otherwise players could know which random results they'll get next. Especially in MP this is very bad, since it means players could know whether an attack will hit or not before they command it. In networked MP the clients get their random seeds from the server. This happens lazily: it doesn't happen at the beginning of the synced user action, but as soon as the RNG is used inside the synced context for this action (in particular, it doesn't happen at all if the RNG isn't used). Since sending the "generate seed" request to the server cannot be undone, an action that invoked the synced RNG (for example an attack, but also a move with a [set_variable] rand=.. in a moveto event) cannot be undone. This is usually the intended behaviour, since calling the RNG usually implies an information gain that shouldn't be undoable. If you want a random number but don't want to make the gamestate depend on it (maybe you just use it for visual features), and thus want the action to remain undoable, you should use the unsynced math.random from Lua.
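The lazy seeding rule and its effect on undo can be modelled in a few lines. This is a hedged Python sketch of the behaviour described above, not the engine's implementation: the SyncedAction class, its attributes and the fixed server_seed value are hypothetical.

```python
import random


class SyncedAction:
    """Toy model of one synced user action with a lazily requested seed."""

    def __init__(self, server_seed):
        self._server_seed = server_seed  # what the server would hand out
        self._rng = None                 # no seed has been requested yet

    def synced_random(self, low, high):
        if self._rng is None:
            # First use inside this action: the seed request goes out now.
            self._rng = random.Random(self._server_seed)
        return self._rng.randint(low, high)

    @property
    def can_undo(self):
        # Once a seed has been requested, the action can no longer be undone.
        return self._rng is None


plain_move = SyncedAction(server_seed=7)
move_with_random_event = SyncedAction(server_seed=7)
move_with_random_event.synced_random(1, 10)  # e.g. a random roll in a moveto event
```

In this model plain_move stays undoable because it never touched the generator, while move_with_random_event does not.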

Found dependent command

It's possible to get an OOS error with the message "Found (in-)dependent command" or similar. Here is what this means: during the game the clients send each other packages ([command], see http://wiki.wesnoth.org/ReplayWML). There are two types of these packages: the 'normal' commands, which contain new user actions like attack, recruit, recall, move, and so on, and the 'dependent' packages, which contain answers to questions asked to specific clients (the results of local choices), for example advancement choices. New random seeds (questions asked to the server) and get_global_variable are also in this category. If a client receives an answer to a local choice when none was expected, the game gives a "Found dependent command when is_synced=false" OOS error message. The opposite situation, when a client expects an answer to a local choice but receives a new user action, also gives an OOS message.
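The two failure cases can be pictured as a small check on the incoming command stream. The following Python sketch is a simplified model, not the engine's actual logic. The dictionary keys "dependent" and "asks_question", the OutOfSync exception and the wording of the messages are all hypothetical.

```python
class OutOfSync(Exception):
    pass


def process(commands):
    """Consume a command stream, tracking whether an answer is awaited."""
    awaiting_answer = False
    for cmd in commands:
        if cmd["dependent"]:
            if not awaiting_answer:
                raise OutOfSync("found dependent command when none was expected")
            awaiting_answer = False
        else:
            if awaiting_answer:
                raise OutOfSync("found independent command while waiting for an answer")
            # Hypothetical flag: this user action makes the client ask a question.
            awaiting_answer = cmd.get("asks_question", False)
    return "ok"


def outcome(commands):
    try:
        return process(commands)
    except OutOfSync as err:
        return str(err)


good = [{"dependent": False, "asks_question": True}, {"dependent": True}, {"dependent": False}]
extra_answer = [{"dependent": False}, {"dependent": True}]
missing_answer = [{"dependent": False, "asks_question": True}, {"dependent": False}]
```

Here outcome(good) succeeds, while the other two streams each trigger one of the two error cases described above.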

Some examples

If Player A changes the White Mage's magic attack from 9-3 to 10-3 while Player B did not, that would result in an OOS since as the game progresses each client would see a different amount of damage being done.

Say that the White Mage attacks a Spearman with 30 HP left and hits all 3 times. To Player A, it would appear that the Spearman died. However Player B would see that unit as having 3 HP left. As a result...

  • What if Player A then tried to move another unit to where the Spearman was? To Player A's client, it would work fine. To Player B's client, that wouldn't make any sense, since it's not possible to have multiple units on the same hex.
  • What if Player B tried to attack one of Player A's units? To Player B's client it seems like a normal move, but to Player A's client it seems like an empty hex is trying to attack him.
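The arithmetic of this example can be written out as a short Python sketch. It is a deliberately simplified model that ignores resistances and every other combat rule. The attack and can_move_onto helpers are hypothetical names.

```python
def attack(defender_hp, damage, strikes, hits):
    """Return remaining HP after `hits` successful strikes (never below 0)."""
    assert hits <= strikes
    return max(0, defender_hp - damage * hits)


# The same attack, evaluated with each player's local unit data.
spearman_on_a = attack(30, damage=10, strikes=3, hits=3)  # Player A's modified 10-3
spearman_on_b = attack(30, damage=9, strikes=3, hits=3)   # Player B's stock 9-3


def can_move_onto(occupant_hp):
    # A hex can be entered only if no living unit stands on it.
    return occupant_hp == 0
```

With these numbers the Spearman has 0 HP on Player A's client and 3 HP on Player B's client, so a move onto its hex is legal for A and illegal for B.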

Hints To Debug OOS Errors

As noted before, OOS happens when two clients disagree about the current gamestate. So if, in a game with clients A, B and C, client A gets an OOS during the turn of client B, it (often, not always) indicates that clients A and B disagree about the gamestate. In this situation client C can often be used to decide which client did the wrong calculations: if client C also got an OOS error message during the turn of client B, it is probably client B that did the wrong calculations; otherwise it is probably client A. The most common sources of OOS are:

1. Bad add-ons: in particular, you need to be aware that add-ons can change the gamestate and cause OOS even if they are not currently active (for example as a modification or an MP scenario). Add-ons can cause OOS just by being installed, even if they seem to be completely unrelated to the current game.

2. Users changing the game/add-on files: it is often unclear to users which parts of the data can be changed without causing OOS and which cannot. For example, some MP scenarios include Lua code via macro inclusion, code={myluafile.lua}, while others include Lua code via dofile/require, wesnoth.dofile("myluafile.lua"). In the former case it is usually safe to change the Lua files, because the host sends the other clients its version of the Lua file along with the other scenario content, while in the latter case it will usually lead to OOS.

3. Outdated Wesnoth versions: while we try to keep the Wesnoth versions MP compatible, it is still important to keep your Wesnoth version up to date, in particular to avoid OOS errors caused by engine bugs that were fixed in newer versions. You can use the MP server command /q version <playername> to figure out which Wesnoth version another player is using.

4. Cheating players: not much to say. Players can intentionally change Wesnoth game files in a misguided way, hoping it will give them an advantage, when it will actually just cause OOS.

5. Engine bugs: Wesnoth is not always bug-free. The main difficulty in investigating engine OOS errors is that most of the OOS errors reported actually come from one of the other possibilities above, so it is somewhat hard to find the 'real' OOS engine bugs in the wild. Some features, in particular 'delayed shroud updates' and 'multiplayer campaigns', have often caused OOS in the past, though, and it is possible that they will break again in the future due to their complex nature.

How to make your code safe

  • Make the WML run at a synchronized time instead, e.g. in a moveto event.
  • Use helper.rand instead of math.random, unless you know exactly what you are doing.
  • Use Lua wesnoth.synchronize_choice when gathering information to make sure that all clients match. Note that this function only works correctly in synchronized events.

List of Non Mp/Replay safe Wml/Lua functions

These functions/values might return different values on different clients or in replays. To prevent OOS you must use wesnoth.synchronize_choice or (Version 1.13.0 and later only) [sync_variable] to query these values when you want to change the gamestate depending on them.

  • Using functions that depend on other installed addons, for example #wesnoth.unit_types will usually be different on each client.
  • Using translatable strings for gamestate calculations; obviously the value of these strings depends on each client's language setting.
  • Unsafe events, see https://wiki.wesnoth.org/EventWML#Multiplayer_safety
  • Unit attributes (accessible via stored units, Lua proxy units or unit filters ([filter_wml]))
    • unit.goto_x and unit.goto_y (used by the ai and by multi turn moves internally)
    • unit.facing (in some cases like when unstoring units the unit facing might be set randomly)
    • unit.name (it is possible for players to rename units; also, in MP the leader's name changes whenever a side's controller changes. See also the point below about attributes that change the visual appearance.)
    • any attribute describing the visual appearance of that unit (unit.overlays, unit.profile etc.). In default Wesnoth they may be the same on all clients, but people usually assume that they can change the visuals of Wesnoth by modifying the cfg files or via [modification]s in add-ons without causing OOS.
  • Variables/wml tags
    • side.controller (gained by [store_side] or wesnoth.sides) this variable will be different for each client. An exception is when controller is "null" which happens if and only if it is "null" on all other clients as well.
    • [set_variable] time=stamp - obviously the result of this operation will be different for all clients
  • Lua functions:
    • math.random()
    • wesnoth.game_config.debug, .version have (possibly) different values
    • any dialog which queries input from a client (wesnoth.show_dialog)
    • wesnoth.sides[i].controller (same as with [store_side]), wesnoth.sides[i].is_local
    • any non mp-safe wml tag called by lua

Additional tricks and tips

  • You can modify some things. So while you shouldn't change the damage of a unit's attack, there's nothing wrong with changing its portrait.

See Also

MultiplayerContent