GrammarWML

From The Battle for Wesnoth Wiki
Revision as of 08:35, 21 November 2016 by Celtic Minstrel (talk | contribs) (Separate out preprocesor grammar)

This page contains a formal grammar of the Wesnoth domain-specific languages, including WML and its preprocessor. It does not attempt to capture the ways in which the Wesnoth engine may interpret a string, such as WML variable substitution. Nor does it fully capture the potential consequences of macros, for example the use of unbalanced WML tags. The syntax used is regular-expression-like (which is not quite the same as regex-like!), with the following conventions:

  • Literal values are enclosed in either 'single quotes' or "double quotes".
  • Square brackets enclose character classes, with an initial ^ inverting them.
  • Whitespace within an expression (unless quoted) is used only for readability or to separate non-terminals
  • The meta-characters * + ? | have the same meaning as is typical in regular expressions
  • The sequence «tab» represents a tab character, and «nl» represents an end-of-line character or character sequence
  • Multiple definitions of a non-terminal are equivalent to alternation (i.e., x:=4 and x:=7 combine to produce x:=4|7).
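
The conventions above map fairly directly onto ordinary regular expressions. The following is a minimal Python sketch of that mapping, not anything taken from the Wesnoth code base: the helper `combine` and the names `TAB`, `NL`, `x`, `ws` and `not_nl` are hypothetical, and reading «nl» as "LF or CRLF" is an assumption.

```python
import re

# Assumption: «nl» is read here as either LF or CRLF.
TAB = "\t"
NL = r"(?:\r\n|\n)"


def combine(*definitions: str) -> str:
    """Join several definitions of one non-terminal by alternation."""
    return "(?:" + "|".join(definitions) + ")"


# x := 4 and x := 7 combine to x := 4|7
x = re.compile(combine("4", "7"))

# ws := ' ' | «tab»
ws = re.compile(combine(" ", TAB))

# [^«nl»]* : an inverted character class, as used by the comment rules
not_nl = re.compile(r"[^\r\n]*")

print(bool(x.fullmatch("4")), bool(x.fullmatch("7")), bool(x.fullmatch("47")))
# prints: True True False
```

Whitespace between the pieces of a grammar rule is not significant, so it has no counterpart in the patterns; only quoted literals and character classes become part of the regular expression.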

WML Preprocessor

The WML preprocessor knows little of the grammar of the WML language itself; it is primarily a text-substitution engine. This grammar is currently a draft and may not be entirely accurate.

preproc_doc := (preproc_directive | preproc_line)*
preproc_directive := simple_directive | macro_definition | if_block
preproc_line := (preproc_text | '<<' macro_free_text '>>' | macro_inclusion)* comment? «nl»
preproc_text := (preproc_char | '<' preproc_char)* '<'?
preproc_char := [^<{#«nl»]
macro_free_text := (macro_free_char | '>' macro_free_char)*
macro_free_char := [^>]
macro_inclusion := '{' ([^}]+ | macro_function) '}'
macro_function := macro_name_char+ (macro_argument)*
macro_name_char := [^} «tab»]
macro_argument := (macro_name_char | macro_inclusion)+
macro_argument := '(' preproc_doc? ')'
macro_argument := '_'? '"' ([^}"] | '""' | macro_inclusion)* '"'
macro_argument := '<<' macro_free_text '>>'
comment := '#' [^«nl»]+ «nl»
ws := ' ' | «tab»
simple_directive := '#undef' ws+ macro_name_char+ ws* «nl»
simple_directive := ('#warning' | '#error') ws+ [^«nl»]* «nl»
macro_definition := '#define' ws+ macro_name_char+ (ws+ macro_name_char+)* «nl» (simple_directive | if_block | preproc_line)+ '#enddef'
if_block := (ifdef_header | ifver_header | ifhave_header) «nl» preproc_doc ('#else' «nl» preproc_doc)? '#endif'
ifdef_header := ('#ifdef' | '#ifndef') ws+ macro_name_char+
ifver_header := ('#ifver' | '#ifnver') ws+ macro_name_char+ ws* comparison_op ws* version_string
ifhave_header := ('#ifhave' | '#ifnhave') ws+ [^«nl»]+
comparison_op := '<' | '<=' | '==' | '!=' | '>=' | '>'
version_string := integer ('.' integer)*
integer := [0-9]+
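
As an illustration of how the rules above compose, the following Python sketch transcribes `ifver_header`, `comparison_op` and `version_string` into one regular expression. It is a reading of this draft grammar only, not the engine's actual parser; the function `parse_ifver_header` and the constant names are hypothetical, and the header strings in the usage note are merely example inputs.

```python
import re
from typing import Optional, Tuple

WS = r"[ \t]"                      # ws := ' ' | «tab»
NAME = r"[^} \t]+"                 # macro_name_char+
OP = r"(?:<=|>=|==|!=|<|>)"        # comparison_op, two-character forms first
VERSION = r"[0-9]+(?:\.[0-9]+)*"   # version_string := integer ('.' integer)*

# ifver_header := ('#ifver' | '#ifnver') ws+ macro_name_char+
#                 ws* comparison_op ws* version_string
IFVER_HEADER = re.compile(
    rf"(#ifver|#ifnver){WS}+({NAME}){WS}*({OP}){WS}*({VERSION})"
)


def parse_ifver_header(
    line: str,
) -> Optional[Tuple[str, str, str, Tuple[int, ...]]]:
    """Return (directive, name, operator, version parts), or None if no match."""
    m = IFVER_HEADER.fullmatch(line)
    if m is None:
        return None
    directive, name, op, version = m.groups()
    return directive, name, op, tuple(int(part) for part in version.split("."))
```

For example, `parse_ifver_header("#ifver WESNOTH_VERSION >= 1.13.0")` returns `("#ifver", "WESNOTH_VERSION", ">=", (1, 13, 0))`. Because `macro_name_char` does not exclude `<`, `>`, `=` or `!`, a header written without spaces, such as `#ifnver WESNOTH_VERSION<1.12`, is only split correctly through regex backtracking; the grammar as written is ambiguous on this point.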

WML

ws := ' ' | «tab»
ws_eol := ws* comment? «nl»
id := [a-zA-Z0-9_]+
wml_document := (wml_attribute | wml_tag | macro_inclusion | ws* ws_eol)*
wml_attribute := ws* wml_key_sequence ws* '=' ws* wml_attribute_value ws_eol
wml_key_sequence := id (ws* ',' ws* id)*
wml_attribute_value := (text | ('_' ws*)? string) (ws* '+' ws_eol wml_attribute_value)?
text := ([^ «tab»+"]+ ws*)*
string := '"' ([^"] | '""')* '"'
wml_tag := ws* '[' '+'? id ']' ws_eol wml_document ws* '[' '/' id ']'
comment := '#' [^«nl»]* «nl»
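
The `string` and `wml_key_sequence` rules can be checked in isolation with a short Python sketch. This is an illustration under stated assumptions rather than the engine's behaviour: the helpers `unquote` and `split_keys` are hypothetical, and collapsing `""` into a single quote character is an interpretation step that the grammar itself does not describe.

```python
import re
from typing import List, Optional

ID = r"[a-zA-Z0-9_]+"   # id := [a-zA-Z0-9_]+
WS = r"[ \t]"           # ws := ' ' | «tab»

# string := '"' ([^"] | '""')* '"'
STRING = re.compile(r'"((?:[^"]|"")*)"')

# wml_key_sequence := id (ws* ',' ws* id)*
KEY_SEQUENCE = re.compile(rf"{ID}(?:{WS}*,{WS}*{ID})*")


def unquote(literal: str) -> Optional[str]:
    """Return the contents of a WML string literal, or None if malformed.

    Assumption: a doubled quote inside the literal stands for one quote.
    """
    m = STRING.fullmatch(literal)
    if m is None:
        return None
    return m.group(1).replace('""', '"')


def split_keys(text: str) -> Optional[List[str]]:
    """Split a key sequence such as 'x, y' into its ids, or return None."""
    if KEY_SEQUENCE.fullmatch(text) is None:
        return None
    return [key.strip(" \t") for key in text.split(",")]
```

With these definitions, `unquote('"say ""hi"""')` yields `say "hi"`, `split_keys("x, y")` yields `["x", "y"]`, and malformed inputs such as `"a"b"` or `x,,y` yield `None`.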