This page gives an in-depth specification of the DomTool language. Most members would probably prefer the more informal presentation in DomTool/UserGuide.
Contents
1. Source code
For a complete, precise, and accurate grammatical specification, see the lexer and parser specifications src/domtool.lex and src/domtool.grm in the DomTool source code. See src/tycheck.sml for the type-checker implementation. DomTool/Building has information on obtaining the source.
2. Token conventions
In the grammars that follow, we use these lexical token class names:
Name |
Description |
Int |
Integer constant |
String |
String constant (enclosed in double quotes) |
Symbol |
Identifier starting with a lowercase letter |
CSymbol |
Identifier starting with a capital letter |
3. Predicates
DomTool uses predicates to describe in what contexts an action may occur. For instance, web-related actions should only occur inside the scope of a virtual host directive. Predicates are built up following the grammar in the table below, using the letter P as the non-terminal for predicates.
Meanings are given as statements that must hold about the context where an action is found. The context is represented as a stack of context IDs which have been declared with context declarations.
Syntax |
Description |
Meaning |
Root |
Root |
The stack is empty. |
CSymbol |
Context ID |
CSymbol is on the top of the stack. |
^P |
Suffixes |
Some (not necessarily strict) suffix of the stack matches P. |
!P |
Not |
The stack doesn't match P. |
P1 & P2 |
And |
The stack matches both P1 and P2. |
(P) |
Grouping |
Identical to P |
4. Types
Types describe expressions. As is standard in statically-typed programming languages, they are used only for validation purposes and have no real effect on the "output" of a program. The following table gives the grammar of types T. The section on expressions will give the meanings of types in terms of which expressions have which types.
Syntax |
Description |
Symbol |
Extern type |
[T] |
List of Ts |
T1 -> T2 |
Function from T1 to T2 |
[P] |
Action allowed only when P is satisified; requires no environment variables on input and writes none of its own |
[P] {CSymbol1 : T1, ..., CSymbolN : TN} |
Action that requires environment variables CSymbol1, ..., CSymbolN to have the given types when run |
[P] {CSymbol1_1 : T1_1, ..., CSymbol1_N : T1_N} => {CSymbol2_1 : T2_1, ..., CSymbol2_M : T2_M} |
Like the last case, but the second set of typed environment variables describes what the action will write |
P => T |
A nested action that requires that its nested configuration satisfy P; T should be some action type |
(T) |
Grouping |
5. Expressions
Here is the grammar of expressions E. As is standard in ML-family languages and Haskell, juxtaposition is used to represent function application, with application associating to the left.
Syntax |
Description |
Typing |
Int |
Integer constant |
G |- Int : int |
String |
String constant |
G |- String : string |
[E1, ..., EN] |
List |
If G |- Ei : T for each Ei, then G |- [E1, ..., EN] : [T]. |
Symbol |
Variable |
G1, Symbol : T, G2 |- Symbol : T. |
E1 E2 |
Application |
If G |- E1 : T1 -> T2 and G |- E2 : T1, then G |- E1 E2 : T2. |
\ Symbol -> E |
Abstraction (inferred domain type) |
If G, Symbol : T1 |- E : T2, then G |- \ Symbol -> E : T1 -> T2. |
\ Symbol : (T1) -> E |
Abstraction (explicit domain type) |
If G, Symbol : T1 |- E : T2, then G |- \ Symbol : (T1) -> E : T1 -> T2. |
CSymbol = E |
Environment variable set |
See subsection on actions |
Symbol <- CSymbol; E |
Environment variable get |
See subsection on actions |
E1; E2 |
Sequencing |
See subsection on actions |
E1 where E2 end |
Local bindings |
See subsection on actions |
let E1 in E2 end |
Local bindings |
See subsection on actions |
E1 with E2 end |
Nested action |
See subsection on actions |
E1 with end |
Empty nested action |
See subsection on actions |
E1 where E2 with E3 end |
Nested action with local bindings |
See subsection on actions |
E1 where E2 with end |
Empty nested action with local bindings |
See subsection on actions |
\\ Symbol : P -> E |
Nested action abstraction |
See subsection on actions |
(E) |
Grouping |
Same as E |
if E then E1 else E2 end |
Conditional |
G |- E : bool; G |- E1 : T1; G |- E2 : T2; T1 is a subtype of T2 or vice-versa |
5.1. Actions
The DomTool language is purely functional. Like Haskell, it uses a monad to inject effectful operations into its pure core. For DomTool, this is the action monad. This monad merges two potentially separate features that tend to occur together.
First, actions are used to run code that will affect the outside world and lead to changes in the configuration of real daemons. Primitive actions like domain and vhost are the building-blocks here. They are defined by plugins. The other action forms in the table above are there just to allow the proper composition and sequencing of applications of primitive actions, which do the real work. See DomTool/Implementation for how the code to run for a particular primitive action is registered with the implementation.
Second, the action monad provides the functionality of environment variables. These are similar to UNIX environment variables, but DomTool maintains its own environment where each variable has a static type. The rationale for including environment variables in the language is that, while many actions are highly configurable, you usually only want to tweak a few of their options at a time. DomTool allows an ambient environment of default variable settings, and it provides language constructs for modifying certain variables both globally and locally.
5.1.1. Effects of the action expressions on the environment
Syntax |
Effect |
CSymbol = E |
Environment variable CSymbol is set to the value of E, with E's type. |
Symbol <- CSymbol; E |
Environment variable CSymbol is read into normal variable Symbol, which inherits its type/value. |
E1; E2 |
The effect of E1 followed by the effect of E2 |
E1 where E2 end |
The effect of E2 followed by E1, afterward erasing any environment variable alterations by E2 |
let E1 in E2 end |
Same as E2 where E1 end |
E1 with E2 end |
The effect of E1 followed by E2 |
E1 with end |
Same as E1 |
E1 where E2 with E3 end |
The effect of E2 followed by E1 followed by E3, afterward erasing any environment variable alterations by E2 |
E1 where E2 with end |
The effect of E2 followed by E1, afterward erasing any environment variable alterations by E2 |
\\ Symbol : P -> E |
No effect until called |
E1 with E2 end, when E1 is a nested abstraction function |
The effect of E1 followed by E2 followed by the effect of the action obtained by substituting the value of E2 in the body of the abstraction to which E1 evaluates. |
5.1.2. Nested action functions
Sometimes it is convenient to be able to write new nested actions that call primitive nested actions as subroutines. For instance, the dom helper function uses domain as a subroutine. Standard functions aren't good enough for these purposes, since they don't allow us to take into account the different environment effects that different nested action arguments might have. The nested function type P => T is the solution to this problem.
You can define a nested action function with the \\ Symbol : P -> E form. Such a function has type P => T when assuming that Symbol has type [P] implies that E has type T. Any call to this function will be typed taking into account that we play the argument action's effect before playing the effect of the function's body.
5.2. Extern types
Extern types (which are also used to implement "primitive types" like int and string) have something of the flavor of dependent types or refinement types. Registering appropriate handlers from a plugin can create an extern type whose values are controlled by an arbitrary predicate. Plugin implementers are supposed to maintain the invariant that the predicate controlling an extern type is never observably inconsistent in the course of a single type-checking session. That is, it never once declares a value to belong to type T and also declares it not to belong to T in the same type-checker invocation. A good example is the your_domain type, which consists of those strings naming domains that the current user is allowed to configure. Within a single type-checking, the user remains constant, and so your_domain's predicate returns consistent decisions. On the other hand, across different sessions by different users, the predicate will of course make different decisions. This approach allows extern types to be used for flexible type-level enforcement of security policies.
It would be a pain for users to have to mark exactly which extern type a certain expression is meant to belong to. Instead, the DomTool language implementation uses slightly non-compositional type-checking to make use of rich extern types more convenient. By "non-compositional", I mean that the value of an expression, not just its type, may matter in type-checking a larger expression that it is found within. This non-compositionality is only used to infer when a value was given a base type like string when it really ought to have been given some rich extern type. For instance, when a function is called that expects an argument of type your_domain, the argument is first type-checked. If its inferred type is your_domain, then all is well. If not, then we try again, passing the actual argument expression to your_domain's controlling predicate. Only if that predicate also rejects the expression do we signal a type error. No simplifications are performed on an expression before checking its form; for instance, a global variable's definition won't be unfolded to check if the value satisfies a required predicate.
6. Declarations
Declarations D add new symbols to the typing environment.
Syntax |
Description |
Effect on typing environment |
extern type Symbol |
Extern type |
Register Symbol as a new extern type that is either defined in a plugin or treated purely syntactically. |
extern val Symbol : T |
Extern value |
Register Symbol as a new variable that is either defined in a plugin or treated purely syntactically. |
val Symbol = E |
Expression synonym |
Define Symbol as an abbreviation for E; a specific type for the binding is inferred and used at future occurrences. |
val Symbol : T = E |
Expression synonym |
Define Symbol as an abbreviation for E of type T. |
context CSymbol |
Context ID |
Declare a new context ID CSymbol. |
7. Source files
A source file is an optional documentation string, followed by a sequence of semicolon-terminated declarations with optional trailing documentation strings, followed by an optional expression. A documentation string is any text between {{ and }} delimiters and may contain HTML.
Documentation strings are used in automatic HTML documentation generation. A documentation string that starts a file is used to describe that file in the module index, and it's also included at the start of that file's page. A documentation string after a declaration is used to describe it in the detail section of its file's page. The standard library documentation shows an example output.