DomTool/LanguageReference

This page gives an in-depth specification of the DomTool language. Most members would probably prefer the more informal presentation in DomTool/UserGuide.

1. Source code

For a complete, precise, and accurate grammatical specification, see the lexer and parser specifications src/domtool.lex and src/domtool.grm in the DomTool source code. See src/tycheck.sml for the type-checker implementation. DomTool/Building has information on obtaining the source.

2. Token conventions

In the grammars that follow, we use these lexical token class names:

Name

Description

Int

Integer constant

String

String constant (enclosed in double quotes)

Symbol

Identifier starting with a lowercase letter

CSymbol

Identifier starting with a capital letter

3. Predicates

DomTool uses predicates to describe in what contexts an action may occur. For instance, web-related actions should only occur inside the scope of a virtual host directive. Predicates are built up following the grammar in the table below, using the letter P as the non-terminal for predicates.

Meanings are given as statements that must hold about the context where an action is found. The context is represented as a stack of context IDs which have been declared with context declarations.

Syntax

Description

Meaning

Root

Root

The stack is empty.

CSymbol

Context ID

CSymbol is on the top of the stack.

^P

Suffixes

Some (not necessarily strict) suffix of the stack matches P.

!P

Not

The stack doesn't match P.

P1 & P2

And

The stack matches both P1 and P2.

(P)

Grouping

Identical to P

4. Types

Types describe expressions. As is standard in statically-typed programming languages, they are used only for validation purposes and have no real effect on the "output" of a program. The following table gives the grammar of types T. The section on expressions will give the meanings of types in terms of which expressions have which types.

Syntax

Description

Symbol

Extern type

[T]

List of Ts

T1 -> T2

Function from T1 to T2

[P]

Action allowed only when P is satisified; requires no environment variables on input and writes none of its own

[P] {CSymbol1 : T1, ..., CSymbolN : TN}

Action that requires environment variables CSymbol1, ..., CSymbolN to have the given types when run

[P] {CSymbol1_1 : T1_1, ..., CSymbol1_N : T1_N} => {CSymbol2_1 : T2_1, ..., CSymbol2_M : T2_M}

Like the last case, but the second set of typed environment variables describes what the action will write

P => T

A nested action that requires that its nested configuration satisfy P; T should be some action type

(T)

Grouping

5. Expressions

Here is the grammar of expressions E. As is standard in ML-family languages and Haskell, juxtaposition is used to represent function application, with application associating to the left.

Syntax

Description

Typing

Int

Integer constant

G |- Int : int

String

String constant

G |- String : string

[E1, ..., EN]

List

If G |- Ei : T for each Ei, then G |- [E1, ..., EN] : [T].

Symbol

Variable

G1, Symbol : T, G2 |- Symbol : T.

E1 E2

Application

If G |- E1 : T1 -> T2 and G |- E2 : T1, then G |- E1 E2 : T2.

\ Symbol -> E

Abstraction (inferred domain type)

If G, Symbol : T1 |- E : T2, then G |- \ Symbol -> E : T1 -> T2.

\ Symbol : (T1) -> E

Abstraction (explicit domain type)

If G, Symbol : T1 |- E : T2, then G |- \ Symbol : (T1) -> E : T1 -> T2.

CSymbol = E

Environment variable set

See subsection on actions

Symbol <- CSymbol; E

Environment variable get

See subsection on actions

E1; E2

Sequencing

See subsection on actions

E1 where E2 end

Local bindings

See subsection on actions

let E1 in E2 end

Local bindings

See subsection on actions

E1 with E2 end

Nested action

See subsection on actions

E1 with end

Empty nested action

See subsection on actions

E1 where E2 with E3 end

Nested action with local bindings

See subsection on actions

E1 where E2 with end

Empty nested action with local bindings

See subsection on actions

\\ Symbol : P -> E

Nested action abstraction

See subsection on actions

(E)

Grouping

Same as E

if E then E1 else E2 end

Conditional

G |- E : bool; G |- E1 : T1; G |- E2 : T2; T1 is a subtype of T2 or vice-versa

5.1. Actions

The DomTool language is purely functional. Like Haskell, it uses a monad to inject effectful operations into its pure core. For DomTool, this is the action monad. This monad merges two potentially separate features that tend to occur together.

First, actions are used to run code that will affect the outside world and lead to changes in the configuration of real daemons. Primitive actions like domain and vhost are the building-blocks here. They are defined by plugins. The other action forms in the table above are there just to allow the proper composition and sequencing of applications of primitive actions, which do the real work. See DomTool/Implementation for how the code to run for a particular primitive action is registered with the implementation.

Second, the action monad provides the functionality of environment variables. These are similar to UNIX environment variables, but DomTool maintains its own environment where each variable has a static type. The rationale for including environment variables in the language is that, while many actions are highly configurable, you usually only want to tweak a few of their options at a time. DomTool allows an ambient environment of default variable settings, and it provides language constructs for modifying certain variables both globally and locally.

5.1.1. Effects of the action expressions on the environment

Syntax

Effect

CSymbol = E

Environment variable CSymbol is set to the value of E, with E's type.

Symbol <- CSymbol; E

Environment variable CSymbol is read into normal variable Symbol, which inherits its type/value.

E1; E2

The effect of E1 followed by the effect of E2

E1 where E2 end

The effect of E2 followed by E1, afterward erasing any environment variable alterations by E2

let E1 in E2 end

Same as E2 where E1 end

E1 with E2 end

The effect of E1 followed by E2

E1 with end

Same as E1

E1 where E2 with E3 end

The effect of E2 followed by E1 followed by E3, afterward erasing any environment variable alterations by E2

E1 where E2 with end

The effect of E2 followed by E1, afterward erasing any environment variable alterations by E2

\\ Symbol : P -> E

No effect until called

E1 with E2 end, when E1 is a nested abstraction function

The effect of E1 followed by E2 followed by the effect of the action obtained by substituting the value of E2 in the body of the abstraction to which E1 evaluates.

5.1.2. Nested action functions

Sometimes it is convenient to be able to write new nested actions that call primitive nested actions as subroutines. For instance, the dom helper function uses domain as a subroutine. Standard functions aren't good enough for these purposes, since they don't allow us to take into account the different environment effects that different nested action arguments might have. The nested function type P => T is the solution to this problem.

You can define a nested action function with the \\ Symbol : P -> E form. Such a function has type P => T when assuming that Symbol has type [P] implies that E has type T. Any call to this function will be typed taking into account that we play the argument action's effect before playing the effect of the function's body.

5.2. Extern types

Extern types (which are also used to implement "primitive types" like int and string) have something of the flavor of dependent types or refinement types. Registering appropriate handlers from a plugin can create an extern type whose values are controlled by an arbitrary predicate. Plugin implementers are supposed to maintain the invariant that the predicate controlling an extern type is never observably inconsistent in the course of a single type-checking session. That is, it never once declares a value to belong to type T and also declares it not to belong to T in the same type-checker invocation. A good example is the your_domain type, which consists of those strings naming domains that the current user is allowed to configure. Within a single type-checking, the user remains constant, and so your_domain's predicate returns consistent decisions. On the other hand, across different sessions by different users, the predicate will of course make different decisions. This approach allows extern types to be used for flexible type-level enforcement of security policies.

It would be a pain for users to have to mark exactly which extern type a certain expression is meant to belong to. Instead, the DomTool language implementation uses slightly non-compositional type-checking to make use of rich extern types more convenient. By "non-compositional", I mean that the value of an expression, not just its type, may matter in type-checking a larger expression that it is found within. This non-compositionality is only used to infer when a value was given a base type like string when it really ought to have been given some rich extern type. For instance, when a function is called that expects an argument of type your_domain, the argument is first type-checked. If its inferred type is your_domain, then all is well. If not, then we try again, passing the actual argument expression to your_domain's controlling predicate. Only if that predicate also rejects the expression do we signal a type error. No simplifications are performed on an expression before checking its form; for instance, a global variable's definition won't be unfolded to check if the value satisfies a required predicate.

6. Declarations

Declarations D add new symbols to the typing environment.

Syntax

Description

Effect on typing environment

extern type Symbol

Extern type

Register Symbol as a new extern type that is either defined in a plugin or treated purely syntactically.

extern val Symbol : T

Extern value

Register Symbol as a new variable that is either defined in a plugin or treated purely syntactically.

val Symbol = E

Expression synonym

Define Symbol as an abbreviation for E; a specific type for the binding is inferred and used at future occurrences.

val Symbol : T = E

Expression synonym

Define Symbol as an abbreviation for E of type T.

context CSymbol

Context ID

Declare a new context ID CSymbol.

7. Source files

A source file is an optional documentation string, followed by a sequence of semicolon-terminated declarations with optional trailing documentation strings, followed by an optional expression. A documentation string is any text between {{ and }} delimiters and may contain HTML.

Documentation strings are used in automatic HTML documentation generation. A documentation string that starts a file is used to describe that file in the module index, and it's also included at the start of that file's page. A documentation string after a declaration is used to describe it in the detail section of its file's page. The standard library documentation shows an example output.

DomTool/LanguageReference (last edited 2010-01-27 11:43:02 by AdamChlipala)