
Diff for "DomTool/Implementation"

Differences between revisions 2 and 6 (spanning 4 versions)
Revision 2 as of 2006-12-17 00:20:31
Size: 6331
Editor: AdamChlipala
Comment: The language interpreter
Revision 6 as of 2007-10-27 19:04:44
Size: 11961
Editor: AdamChlipala
Comment: Typo

This page describes the implementation of the DomTool language interpreter and other tools. Most members would probably be better served visiting DomTool/UserGuide.


1. Languages

DomTool is implemented mostly in [http://en.wikipedia.org/wiki/Standard_ML Standard ML] (SML), with teeny tiny bits of C and shell script. Standard ML is a [http://en.wikipedia.org/wiki/Statically_typed statically-typed] [http://en.wikipedia.org/wiki/Functional_programming_language functional programming language] with much to recommend it, including a [http://portal.acm.org/citation.cfm?id=549659 language standard] (with formal semantics), [http://mlton.org/ one of the best open source optimizing compilers ever for any language], and open development models and communities associated with the major implementations (out of about 10 total language implementations floating around today).

But really, why choose a programming language that "nobody's ever heard of"? The answer is simple. With SML, you can program at a high level of abstraction without having to worry about performance penalties and other historical undesirables.

In the following sections, I'll often refer to SML modules by name, instead of giving source file paths. A module named Name will be defined in either domtool2/src/name.sml or domtool2/src/plugins/name.sml, depending on whether it's part of core DomTool or of a plugin. You'll also find signature NAME defined in domtool2/src/name.sig or domtool2/src/plugins/name.sig. I readily point the reader to the source code itself, and the signature files in particular, as the best sources of detailed documentation on the implementation. Readers coming from backgrounds outside of statically-typed functional programming may be pleasantly surprised at how well ML code documents itself!

Information about obtaining and building the DomTool tools is found on DomTool/Building.

2. Build process

MLton and SML/NJ take different inputs to drive their build processes. The main Makefile is responsible for building src/domtool.cm (the input to SML/NJ) and src/domtool-*.mlb (the tool-specific inputs to MLton) from src/sources and some other compiler-specific files. When adding a new source file to the system, include it in src/sources, not any of the generated files, and take care to insert it in dependency order relative to the sources already in the file.

A C library openssl_sml.so is built to provide a cleaner (but spartan) interface to the OpenSSL library. The Makefile uses the [http://ttic.uchicago.edu/~blume/papers/nlffi.pdf NLFFI] tools shipped with MLton and SML/NJ to build compiler-specific SML interfaces to this library, and then compiler-agnostic code takes over and defines the visible OpenSSL structure based on the common interface supported by all NLFFI tools. Code specific to compiler $COMPILER lives in domtool2/openssl/$COMPILER.

3. Configuration

As is more and more the fashion lately, DomTool supports many tweakable configuration variables, and the particular settings of those variables are conveyed via program source code. In particular, the various pieces of the DomTool implementation look for configuration in different members of a Config module in an SML source file config.sml in the domtool2 base directory. When building the standalone tools with MLton, these configuration settings will be inlined into the places where they're used in the resulting binary, possibly triggering opportunities for further optimization. Isn't compilation technology wonderful?

Any particular installation of DomTool is unlikely to want to set custom values for all or even most of the available variables. Thus, the implementation takes modest advantage of SML's module system to allow inheritance of default settings via the open declaration, while maintaining the possibility for piecemeal setting of custom values.

DomTool involves a number of distinct plugins and sources of functionality, all of which have some configuration parameters. The implementation uses Makefile-driven concatenation of files following a certain convention to build the overall default configuration module from files associated with the separate plugins. In particular, in domtool2/configDefault, you will find a set of .cfg, .cfs, and .csg files. All the .cfs files are concatenated together to form the definition of the signature CONFIG, while .csg files are concatenated together to form supporting definitions of sub-signatures. The .cfg files are concatenated together to form the definition of a structure ConfigDefault ascribing opaquely to CONFIG. Your custom configuration structure Config also ascribes to CONFIG and may open ConfigDefault.
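
The inheritance pattern described above can be sketched in a few lines of SML. This is only an illustration under stated assumptions: the members exampleDir and exampleLimit are hypothetical names, and the real CONFIG and ConfigDefault are generated from the files in domtool2/configDefault rather than written by hand.

```sml
(* Hypothetical members; the real CONFIG signature is assembled
 * from the .cfs files. *)
signature CONFIG = sig
    val exampleDir : string
    val exampleLimit : int
end

(* Stand-in for the generated default structure. *)
structure ConfigDefault :> CONFIG = struct
    val exampleDir = "/tmp/example"
    val exampleLimit = 10
end

(* A site's config.sml: inherit every default, then override one member. *)
structure Config :> CONFIG = struct
    open ConfigDefault
    val exampleLimit = 25
end
```

Because the later binding of exampleLimit shadows the one brought in by open, Config keeps the default for exampleDir and uses the custom value for exampleLimit.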

4. The language interpreter

The process of reading, checking, and running a DomTool source file goes like this:

  1. The lexer breaks the textual input into tokens. It's embodied by the DomtoolLexFn functor, built by ml-lex from domtool.lex.

  2. The parser converts the stream of tokens into an abstract syntax tree (AST). It's embodied by the DomtoolLrValsFn functor, built by ml-yacc from domtool.grm. The Parse module ties together the lexer and parser.

  3. The Tycheck module type-checks the AST.

  4. The Reduce module applies familiar lambda calculus-style reduction rules to simplify the AST as much as possible.

  5. For input files that request configuration rather than just add definitions, the Eval module executes the resulting configuration value.

Every piece of this pipeline is independent of the distributed configuration aspect of DomTool described on DomTool/ArchitectureOverview, though every stage after the parser provides hooks that can be used to conscript the language implementation for use in that and other applications.

One important hook of this kind in Tycheck takes the form of its members allowExterns and disallowExterns. Call the appropriate one of these functions to set whether extern type and extern val declarations are allowed in the source file being checked.
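
A toggle of this kind might look roughly like the following. This is a toy model, not the real Tycheck: the structure name ToyTycheck, the decl datatype, and checkDecl are all hypothetical, and I'm assuming the two functions take unit.

```sml
structure ToyTycheck = struct
    (* Whether extern declarations are currently accepted. *)
    val externsAllowed = ref false
    fun allowExterns () = externsAllowed := true
    fun disallowExterns () = externsAllowed := false

    (* Toy declarations; the real AST is much richer. *)
    datatype decl = ExternType of string
                  | ExternVal of string
                  | Val of string

    (* Ordinary declarations always pass; extern ones only when allowed. *)
    fun checkDecl d =
        case d of
            Val _ => true
          | _ => !externsAllowed
end
```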

As DomTool/LanguageReference explains, all configuration takes place through the configuration monad, which has a lot in common with the [http://www.haskell.org/ Haskell] [http://www.nomaware.com/monads/html/iomonad.html IO monad]. Haskell newcomers often have trouble understanding how the IO monad enables the use of imperative code within a pure functional language. My favorite explanation for this is that values in the IO monad are runtime representations of programs in an embedded imperative language, which you hope will be run by some entity outside the scope of the Haskell language. In the DomTool implementation, this idea appears quite literally. The Reduce module handles the "pure functional" aspects of the language semantics, reducing input programs into first-order imperative programs, in the form of configuration values. Eval is the component that actually runs the resulting configuration, like the mythical top-level IO-meister in Haskell.
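
To make the division of labor concrete, here is a toy model of the two phases. It is a sketch under heavy assumptions: the exp datatype and its constructors are hypothetical and far simpler than DomTool's real AST, substitution assumes closed arguments, and "running" an action just records its name.

```sml
(* Toy AST: a pure lambda-calculus layer over primitive actions. *)
datatype exp =
    Var of string
  | Lam of string * exp
  | App of exp * exp
  | Act of string          (* primitive action *)
  | Seq of exp * exp       (* one configuration, then another *)

(* Substitute closed expression v for variable x in e.
 * Since v is assumed closed, no renaming is needed. *)
fun subst (x, v) e =
    case e of
        Var y => if x = y then v else e
      | Lam (y, b) => if x = y then e else Lam (y, subst (x, v) b)
      | App (f, a) => App (subst (x, v) f, subst (x, v) a)
      | Act _ => e
      | Seq (a, b) => Seq (subst (x, v) a, subst (x, v) b)

(* "Reduce": eliminate function applications by beta reduction.
 * This toy version does not reduce under binders. *)
fun reduce e =
    case e of
        App (f, a) =>
        (case reduce f of
             Lam (x, b) => reduce (subst (x, reduce a) b)
           | f' => App (f', reduce a))
      | Seq (a, b) => Seq (reduce a, reduce b)
      | _ => e

(* "Eval": run a first-order program, collecting action names in order. *)
fun eval e =
    case e of
        Act name => [name]
      | Seq (a, b) => eval a @ eval b
      | _ => raise Fail "not a first-order configuration"

val program = App (Lam ("x", Seq (Var "x", Seq (Act "b", Var "x"))), Act "a")
val trace = eval (reduce program)   (* ["a", "b", "a"] *)
```

The point of the split is that eval never sees a function or a variable: by the time it runs, reduce has turned the program into a plain sequence of actions.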

5. Plugin architecture

DomTool provides a number of ways to request callbacks when certain events occur or when certain add-on features are used. Plugins work by calling these hook functions, typically many times per plugin. By convention, a plugin is a module defined in domtool2/src/plugins/ that registers some callbacks as a side-effect of its definition.
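
The "registration as a side-effect" convention can be sketched as follows. ToyHooks and ToyPlugin are hypothetical stand-ins, not real DomTool modules; the only thing taken from the text is the shape of the idea.

```sml
(* Hypothetical hook registry standing in for the real DomTool modules. *)
structure ToyHooks = struct
    val callbacks : (string * (unit -> string)) list ref = ref []
    fun register (name, f) = callbacks := (name, f) :: !callbacks
    (* Names of registered callbacks, in registration order. *)
    fun names () = rev (map #1 (!callbacks))
end

(* A plugin is just a module whose top-level declarations call the hooks.
 * Merely defining the structure performs the registrations. *)
structure ToyPlugin = struct
    val () = ToyHooks.register ("hello", fn () => "hello from plugin")
    val () = ToyHooks.register ("bye", fn () => "goodbye from plugin")
end
```

One consequence of this style is that a plugin only takes effect if its source file is actually part of the build, which is why the order and contents of src/sources matter.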

The following subsections summarize the hooks that are available for DomTool plugins. There are other hooks that are only of interest when using the DomTool language implementation in a different application.

5.1. Extern functions

Declared extern val functions can be implemented in two different ways. One hardly counts as implementation: you can leave them unimplemented and just treat them as purely syntactic entities, since some of the later callbacks that we'll cover are passed general DomTool ASTs as arguments. The second option is to register an extern function handler. Env.registerFunction is the hook for this.

5.2. Actions

Actions are the connection between functional DomTool programs and "real-world" configuration. Call Env.registerAction to register the actual code that should be run when an action is encountered during Eval, giving the action's name and a function for transforming an environment variable mapping and a list of argument ASTs into a new environment variable mapping. These are DomTool, not UNIX, environment variables.

There is a family of convenience functions Env.action_none, Env.action_one, etc., for registering actions taking argument lists of fixed length with known types. Values of type Env.arg are used to encapsulate methods for extracting native SML values from DomTool ASTs of known types.
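
Here is a rough, self-contained model of how such a registry and a one-argument convenience wrapper could fit together. Everything in ToyEnv is hypothetical, including the argument order of action_one and the names stringArg and run; consult the ENV signature for the real interface.

```sml
structure ToyEnv = struct
    (* Toy argument ASTs and a toy environment-variable mapping. *)
    datatype exp = EString of string | EInt of int
    type env_vars = (string * exp) list
    type action = env_vars * exp list -> env_vars

    val actions : (string * action) list ref = ref []
    fun registerAction (name, f : action) = actions := (name, f) :: !actions

    (* An "arg" extracts a native SML value from an AST, or gives NONE. *)
    type 'a arg = exp -> 'a option
    val stringArg : string arg = fn EString s => SOME s | _ => NONE
    val intArg : int arg = fn EInt n => SOME n | _ => NONE

    (* Convenience wrapper for actions taking exactly one typed argument. *)
    fun action_one name (argName, extract : 'a arg) (f : 'a -> unit) =
        registerAction
            (name,
             fn (env, [e]) =>
                (case extract e of
                     SOME v => (f v; env)
                   | NONE => raise Fail ("Bad argument " ^ argName ^ " to " ^ name))
              | (_, _) => raise Fail ("Wrong number of arguments to " ^ name))

    (* Look up and run a registered action. *)
    fun run name (env, args) =
        case List.find (fn (n, _) => n = name) (!actions) of
            SOME (_, f) => f (env, args)
          | NONE => raise Fail ("Unknown action " ^ name)
end

(* Usage: an action that records its string argument. *)
val seen : string list ref = ref []
val () = ToyEnv.action_one "greet" ("who", ToyEnv.stringArg)
                           (fn s => seen := s :: !seen)
val env' = ToyEnv.run "greet" ([], [ToyEnv.EString "world"])
```

The wrapper keeps plugin code free of AST pattern matching: the plugin author supplies only a function on native SML values.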

5.3. Containers

Containers are actions that take actions as additional arguments, like domain and vhost. Their handlers are registered very similarly to other actions, with the addition that containers have associated callbacks that are run after all nested configuration has been processed. When a container is encountered during Eval, its action handler is run, then all of its nested configuration is evaluated, and finally the container's "afterward" callback is run. There are functions Env.container_none, Env.container_one, etc., that correspond to the convenience functions for regular actions.
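
The evaluation order for containers can be illustrated with a small model. The container record and evalContainer below are hypothetical; only the ordering (handler, nested configuration, "afterward" callback) comes from the description above.

```sml
(* Log of events, newest first, so the order can be inspected afterwards. *)
val events : string list ref = ref []
fun note s = events := s :: !events

(* Hypothetical container: a handler plus an "afterward" callback. *)
type container = {handler : unit -> unit, afterward : unit -> unit}

(* Evaluate a container: handler, then nested configuration, then afterward. *)
fun evalContainer ({handler, afterward} : container)
                  (nested : (unit -> unit) list) =
    (handler ();
     List.app (fn act => act ()) nested;
     afterward ())

val vhostLike : container =
    {handler = fn () => note "open", afterward = fn () => note "close"}

val () = evalContainer vhostLike
                       [fn () => note "nested 1", fn () => note "nested 2"]
val order = rev (!events)   (* ["open", "nested 1", "nested 2", "close"] *)
```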

5.4. Extern types

Types declared with extern type are treated as refinement types. That is, each should have an associated simple type to which an additional filtering predicate is applied. Env.type_one is the hook to register a new extern type by giving its name, an Env.arg for converting its values to native SML, and a boolean predicate for deciding which values of the base type are allowed in the new type. This predicate can be arbitrary SML code. It may rely on imperativity, but it should never be visibly inconsistent in its decisions within a single type-checking run. For example, our use of the DomTool language for distributed configuration has extern type handlers that use imperativity to determine the current user, which domains that user may configure, etc., but this information is set before type-checking begins and doesn't change until it's over.
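
As a sketch of the refinement-type idea, the following toy registry pairs an extractor with a predicate. The names and argument order here are assumptions made for illustration (the local type_one merely echoes the real hook's name), and the exp datatype is a stand-in for DomTool's AST.

```sml
(* Toy base values; the real DomTool AST is richer. *)
datatype exp = EString of string | EInt of int

(* Registry of refinement types: name -> membership test on ASTs. *)
val types : (string * (exp -> bool)) list ref = ref []

(* Register a type from an extractor to a native value and a predicate
 * on that value. A value of the wrong base type is never a member. *)
fun type_one name (extract : exp -> 'a option) (pred : 'a -> bool) =
    types := (name, fn e => case extract e of
                                SOME v => pred v
                              | NONE => false) :: !types

(* Check whether an AST belongs to a registered type. *)
fun hasType name e =
    case List.find (fn (n, _) => n = name) (!types) of
        SOME (_, p) => p e
      | NONE => false

(* Usage: integers restricted to the valid TCP port range. *)
val () = type_one "port" (fn EInt n => SOME n | _ => NONE)
                  (fn n => n > 0 andalso n < 65536)
```

With this registration, hasType "port" (EInt 80) holds, while EInt 70000 and EString "80" are both rejected.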

5.5. Environment variable defaults

Call Defaults.registerDefault to provide a default value for an environment variable that should be set before type-checking begins. You must provide the variable's name, its type, and a (possibly impure) function for generating its initial expression value.
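
In miniature, a defaults registry could look like this. The typ and exp datatypes, the variable names, and initialEnv are all hypothetical; the sketch only shows the name/type/generator triple and the fact that generators run before type-checking starts.

```sml
(* Toy types and expressions standing in for DomTool's own. *)
datatype typ = TString | TInt
datatype exp = EString of string | EInt of int

val defaults : (string * typ * (unit -> exp)) list ref = ref []
fun registerDefault (name, t, gen) = defaults := (name, t, gen) :: !defaults

(* Build the initial environment by running each generator once,
 * in registration order, before any type-checking happens. *)
fun initialEnv () =
    map (fn (name, _, gen) => (name, gen ())) (rev (!defaults))

val () = registerDefault ("ExampleUser", TString, fn () => EString "nobody")
val () = registerDefault ("ExampleTTL", TInt, fn () => EInt 3600)
val env0 = initialEnv ()
```

Taking a function rather than a plain value is what lets a default depend on state that is only known at run time.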

5.6. Reset handlers

When an admin runs domtool-admin regen, we need a way to revert to a pristine configuration where everything users have added is gone, before we build it all back up again from scratch. Domain.registerResetGlobal registers a function to perform this clean-up on global (i.e., AFS) configuration, while Domain.registerResetLocal registers a similar function to be run on each node before regeneration. For example, the Webalizer plugin uses registerResetGlobal to delete all Webalizer configuration files, and the Apache plugin uses registerResetLocal to clear the contents of /var/domtool/vhosts.

5.7. Before/after domains

Call Domain.registerBefore and Domain.registerAfter to register callbacks to be called before and after a domain directive's nested configuration is run.

5.8. File change handlers

Call Slave.registerFileHandler to register a callback to call whenever a file's status in $DOMTOOL/nodes changes. See DomTool/ArchitectureOverview for more information on when such callbacks would be triggered.

5.9. Pre/post-handlers

Call Slave.registerPreHandler and Slave.registerPostHandler to register functions to be called before and after a DomTool configuration session, which might include arbitrarily many domains and source files.

DomTool/Implementation (last edited 2010-03-18 14:05:35 by AdamChlipala)