WesnothdDesign
Contents
Overview
This page documents the design of wesnothd, Wesnoth's multiplayer server. This document was written after I (Dave) re-designed wesnothd to try and make it more efficient.
Wesnothd is a network server written in C++. It shares some of Wesnoth's code, in particular the networking code. Wesnothd manages requests from Wesnoth, that are sent in WML. It manages a lobby of players who can chat with each other, create games, join games, observe games, and so forth. Wesnothd maintains a list of games, users, and so forth. Wesnothd largely operates by relaying messages to the appropriate clients. For instance, if a player in a game makes a move, wesnothd will relay this to all clients who are participating in the game. Wesnothd may also perform appropriate filtering and so forth, and allows administrators to administer a game by kicking or banning users and so forth.
simple_wml
Wesnothd originally used the same WML module, the config object, and serialization code as Wesnoth. This allowed code reuse and easy interoperability. However, Wesnoth's WML module is designed for ease of use by coders, and supports rich functionality, parsing user-written WML. This approach proved to be inefficient for wesnothd. config objects store all name/value pairs in attributes and element names as std::string objects. This typically means a dynamically allocated buffer for each and every name and value and element name. This leads to slow parsing as well as high memory use.
Since wesnothd uses WML in a relatively limited way, a specialized WML parser and in-memory representation was developed, called simple_wml. Simple_wml can parse only a subset of WML: it assumes that all attribute values are surrounded by quotes, and that attributes are in lexicographical (alphabetical) order. This allows very quick parsing.
Additionally, the in-memory representation has some limitations. It cannot perform near the same number of operations as config objects can.
Documents vs nodes
A config object is both a WML document and a node within a document. Simple_wml is different: there is a document object and a node object. A document automatically has a root node, and a node may create child nodes. Nodes may not be transferred between documents (except by swapping documents; see below).
Nodes all track their parents, as well as the document that owns them. This approach allows for several important optimizations.
The string_span class
Simple_wml defines a string_span class. A string_span is a portion of a string: a pointer to a buffer, and the size of the buffer. A string_span is not seen as owning the string it points into. Rather, it is an immutable 'view' of a part of a string.
All attribute names and values and element names in simple_wml nodes are stored as string_spans. This often allows allocation to be minimized or avoided.
Ownership semantics
A document contains a vector with a list of all buffers it is considered to 'own'. These are buffers its nodes depend on (i.e. string_spans within the nodes refer to within these buffers). When the document is destroyed it will free these buffers.
A document may be constructed by a text buffer or a gzip compressed buffer. The caller may transfer control of the buffer to the document. If it does not, the document will make a copy of the buffer. If the buffer is compressed, the document will make an uncompressed version. The buffer will be added as an owned buffer.
The document will then parse the text buffer, creating nodes. All the nodes will initialize their string_spans for element and attribute values pointing into the text buffer being parsed.
This means that when a WML document is parsed, it will only allocate the node objects. All strings will simply point into the buffer for the document.
Of course, nodes within a WML document might be modified programmatically. Nodes have several methods for setting an attribute, which each have different ownership semantics for the buffers being set:
set_attr(const char* name, const char* value): this function will take two C-style strings, and refer to them with string_spans. The strings are NOT copied. Rather, it is the responsibility of the caller to ensure that these buffers will remain valid throughout the lifetime of the document. This is most often ensured by using data in static memory as input. For instance, you might call some_node.set_attr("has_object, "1"); Because you pass string literals in, you know the memory is guaranteed to be valid through the life of the program.
This method is very efficient, since no allocation takes place
set_attr_dup(const char* name, const char* value): this function will take two C-style strings. The name will be treated as in set_attr, but a copy will be made of the value. The buffer the copy is made in will be stored as an owned buffer of the document, and so will be destroyed when the document is destroyed.
A copy of the name is not made, since having a dynamically generated name is not common, and usually one can use a string literal.
set_attr_dup_name_and_value(const char* name, const char* value): This function works as above, but duplicates both the name and value.
set_child(const char* name): this function creates a new child node and returns a reference to it. The name will be treated as in set_attr(), with no copy being made; it should be in static memory.
It is also possible to duplicate a buffer and make it owned by the document manually. The document class contains a dup_string() method which duplicates a string buffer and stores it as owned by the document.
Cool simple_wml optimizations
Simple_wml allows for some very nice optimizations. When a document is constructed by parsing, every node contains a string_span referring to the portion of the buffer that it was parsed from. Every node also keeps track of whether it has been modified since parsing or not. When a node is modified, it walks up its parents and marks each one as 'dirty' -- i.e. modified.
When a simple_wml document is output, it will look and see if the root node is dirty. If the root node is not dirty, then the document has not been modified since parsing. We still have access to the buffer that the document was parsed from, so we can simply spit it back out, and do no work in emitting the document.
Additionally, each node contains the portion of the buffer it was parsed from. If the document was modified, then nodes that are dirty will output themselves, but nodes that are not dirty will simply output the portion of the buffer they were parsed from.
Also, when outputting, we generate a full, compact text document. The string_spans in the nodes will, after outputting themselves to the output buffer, be reassigned to point into the output buffer, instead of where they were pointing before. This means that after output is complete, all of our string_spans will be pointing into the output buffer. This means that all other buffers are no longer needed, and will be released by the document.
A simple_wml document can also be queried for its compressed (gzipped) form, ready for sending over the network. If it is parsed from a gzipped buffer, it will keep a reference to this. If the document is not dirty, it will return this gzipped buffer. If it does not have the buffer, or is dirty, then it will generate the text output, compress it, and also cache the gzipped version.
Documents support a compress operation which will store a gzipped buffer of the document, and delete the text buffer and all the nodes and any other buffers. The document will automatically re-construct the nodes if the root node is ever queried for in code. This is very useful if one wants to store a document and perhaps transmit it over the network but it's unlikely that further operations will be performend on the document.
Documents support a swap() operation that allows the contents of two documents to be swapped. This is relatively inexpensive since it simply twiddles pointers. It is very useful, for instance, to populate a long-held document containing level data for a game with the game's definition sent by a client.
This all implies that simple_wml is very efficient at being used to store WML portions sent by clients, and then re-transmit to other clients.
simple_wml vs regular WML
Performance tests show simple_wml to be MUCH faster than WML used in Wesnoth. Parsing a WML document is between 10 and 30 times faster with simple_wml. Of course, benefits of simple_wml include smart optimizations to try and avoid parsing and outputting at all when unnecessary.
Update Dec 2020
All times were averaged over 1000 runs using a debug (-O0) build. simple_wml's clone()'s results were identical to creating a new simple_wml document.
Parsing from WML text:
- config - 1619 ms
- simple_wml::document; INIT_STATIC - 269 ms
- simple_wml::document; INIT_COMPRESSED - 353 ms
Config move constructor:
- 431 ms
Config move constructor:
- 131 ms
It should be noted though that simple_wml is 'simple' in that it is a stripped down, simplified version of WML. It itself is NOT 'simple' to use, and should never be considered for use in main Wesnoth code. It is very complicated to use, and should not be touched at all except by experienced C++ developers.
The network layer
The Wesnoth network layer originally transmitted only config objects. One called send_data with a config object, and the network layer would serialize it and send it to the given socket. Unfortunately this is rather inefficient from a server's perspective, not least because it means that the config object will have to be serialized and compressed for each client. Additionally, of course, it ties one to config objects.
The network layer was thus given additional 'raw' send functions, which allow one to send a raw buffer. Wesnothd uses this to get the gzipped version of documents and send them. Additionally, a flag was added to the network layer to make it receive in 'raw' mode too, allowing one to get a vector<char> containing a buffer which would then be used to construct a simple_wml document.
Sharding the network layer
The network layer was also refactored to make it scale better. The network layer uses blocking socket operations to send and receive data. Of course, it is intractable for the server to block when communicating with a client, so these operations are performed in worker threads. When one sends data, one places the data in a queue and sends a signal to a worker thread which wakes up and processes the queue. A pool of worker threads operate on the queue. Similarly this pool of threads handles incoming requests, using select() to block until data is available.
This approach created performance problems with the queue that would have to be accessed from the main thread and from this pool of worker threads. A mutex would have to be locked, and many times threads would block on this mutex, all having to access the same queue.
This problem was alleviated by sharding the system: A #define NUM_SHARDS was introduced. The recommended value for this macro is 7 when compiling wesnothd. Instead of one queue and mutex, an array of NUM_SHARDS queues and mutexes is kept. Every socket is hashed into one of these shards based on its memory address. There are also NUM_SHARDS pools of worker threads available. When the main thread queues data, it only has to lock the mutex for the shard it is accessing, meaning worker threads on other shards are not blocked. Similarly for receiving data. This greatly reduces contention amongst threads.