
4   The lexical analyzer

Let the name for the parser given in the %name declaration be denoted by {n} and the specification file name be denoted by {spec name}. The parser generator creates a functor named {n}LrValsFun for the values needed for a particular parser and places it in {spec name}.sml. This functor contains a structure Tokens, which lets you construct terminals from the appropriate values. The structure has one function for each terminal. Each function takes a tuple consisting of the value for the terminal (if there is any), a leftmost position for the terminal, and a rightmost position for the terminal, and constructs the terminal from these values.

A signature for the structure Tokens is created and placed in the ".sig" file created by ML-Yacc. This signature is {n}_TOKENS, where {n} is the name given in the parser specification. A signature {n}_LRVALS is created for the structure produced by applying {n}LrValsFun.

Use the signature {n}_TOKENS to create a functor for the lexical analyzer which takes the structure Tokens as an argument. The signature {n}_TOKENS will not change unless the %term declaration in a specification is altered by adding terminals or changing the types of terminals. You do not need to recompile the lexical analyzer functor each time the specification for the parser is changed if the signature {n}_TOKENS does not change.
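The arrangement described so far can be sketched in plain Standard ML. This is only an illustration under stated assumptions: the terminal set (NUM of int, PLUS, EOF) is hypothetical, the helper tokenOf is invented for the example, and MockTokens is a stand-in that does not reflect the representation ML-Yacc actually generates. The signature follows the general shape ML-Yacc writes to the ".sig" file.

```sml
(* Shape of the signature for a hypothetical declaration
   %term NUM of int | PLUS | EOF in a parser named Calc. *)
signature Calc_TOKENS =
sig
  type ('a,'b) token
  type svalue
  val NUM  : int * 'a * 'a -> (svalue,'a) token
  val PLUS : 'a * 'a -> (svalue,'a) token
  val EOF  : 'a * 'a -> (svalue,'a) token
end

(* A hand-written lexer functor that depends only on the signature,
   so it need not be recompiled while Calc_TOKENS stays the same. *)
functor CalcLexFun (structure Tokens : Calc_TOKENS) =
struct
  type pos = int
  type svalue = Tokens.svalue
  type ('a,'b) token = ('a,'b) Tokens.token
  type lexresult = (svalue,pos) token

  (* Hypothetical helper: map one already-split lexeme at position p
     to a terminal, using p as both left and right position. *)
  fun tokenOf (lexeme : string, p : pos) : lexresult =
    case lexeme of
        "+" => Tokens.PLUS (p, p)
      | ""  => Tokens.EOF (p, p)
      | s   => (case Int.fromString s of
                    SOME n => Tokens.NUM (n, p, p)
                  | NONE   => raise Fail ("bad lexeme: " ^ s))
end

(* Stand-in for the generated Tokens structure, for illustration only. *)
structure MockTokens =
struct
  datatype svalue = VOID | INT of int
  datatype ('a,'b) token = TOKEN of string * 'a * 'b * 'b
  fun NUM (n, l, r) = TOKEN ("NUM", INT n, l, r)
  fun PLUS (l, r)   = TOKEN ("PLUS", VOID, l, r)
  fun EOF (l, r)    = TOKEN ("EOF", VOID, l, r)
end

structure CalcLex = CalcLexFun (structure Tokens = MockTokens)

(* A terminal carrying the value 42 with left and right position 1. *)
val t = CalcLex.tokenOf ("42", 1)
```

In a real build, the argument to CalcLexFun would be the Tokens structure obtained by applying {n}LrValsFun rather than MockTokens.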

If you are using ML-Lex to create the lexical analyzer, you can turn the lexer structure into a functor using the %header declaration. %header allows the user to define the header for a structure body.

If the name of the parser in the specification were Calc, you would add this declaration to the specification for the lexical analyzer:

%header (functor CalcLexFun(structure Tokens : Calc_TOKENS))

You must define the following in the user definitions section:

type pos

This is the type of position values for terminals. This type must be the same as the one declared in the specification for the grammar. Note, however, that this type is not available in the Tokens structure that parameterizes the lexer functor.

You must include the following code in the user definitions section of the ML-Lex specification:

type svalue = Tokens.svalue
type ('a,'b) token = ('a,'b) Tokens.token
type lexresult = (svalue,pos) token

These types are used to give lexers signatures.
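Put together, a small ML-Lex specification for the Calc parser might look like the sketch below, loosely modeled on the calc example distributed with ML-Yacc. It assumes the grammar declares %pos int and has terminals NUM of int, PLUS, and EOF. Those terminals, the use of a line counter as the position, and the error helper are assumptions made for the example.

```sml
(* User definitions section *)
structure Tokens = Tokens

type pos = int                       (* must match %pos in the grammar *)
type svalue = Tokens.svalue
type ('a,'b) token = ('a,'b) Tokens.token
type lexresult = (svalue,pos) token

val pos = ref 0                      (* current line, used as the position *)
fun eof () = Tokens.EOF (!pos, !pos) (* ML-Lex calls eof at end of input *)
fun error msg = TextIO.output (TextIO.stdOut, msg ^ "\n")

%%
%header (functor CalcLexFun(structure Tokens : Calc_TOKENS));
digit=[0-9];
ws=[\ \t];
%%
\n       => (pos := (!pos) + 1; lex());
{ws}+    => (lex());
{digit}+ => (Tokens.NUM (valOf (Int.fromString yytext), !pos, !pos));
"+"      => (Tokens.PLUS (!pos, !pos));
.        => (error ("ignoring bad character " ^ yytext); lex());
```

Each rule that returns a terminal calls the matching function in Tokens with the value (if any) followed by the left and right positions, so every action has type lexresult.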

You may use a lexer constructed using ML-Lex with the %arg declaration, but you must follow special instructions for tying the parser and lexer together.