## Background

The interaction between polymorphism and side-effects has always been a troublesome problem for ML. Here is an example of a program illustrating the danger:val r = ref(fn x => x); r := (fn x => x+1); !r true;Under the basic rules of ML type inference (Hindley-Milner type inference), we would give the local variable

`r`

the polymorphic typer: ('a -> 'a) refwhere the type variable`'a`

here is implicitlyquantified, meaning the type might be more explicitly written (in an extended SML type notation) as:r: (All 'a).('a -> 'a) refWe will use this explicit quantifier notation when we want to be particularly clear that we are dealing with a polymorphic type.Given that r has this polymorphic type, the occurrence of

`r`

in the assignment expression would have the type`(int -> int) ref`

, while in the third line`r`

would have the type`(bool -> bool) ref`

. This typing of the program is clearly unsound, since it allows a function on integers to be applied to a boolean value.What is going on here is that a mutable value (here a ref cell

`r`

) is given a polymorphic type. This allows the mutable value's type to be instantiated differently at the points where the contents are updated and where they are fetched, with the consequence that the value stored and retrieved changes type in the process. The ref cell is acting as a communication channel between two points in the program that do not agree on the type of the value being communicated.Besides ref cells and arrays, another mechanism that can serve as communication channel between statically uncoordinated parts of a program are first-class continuations. These can give rise to the same sort of type unsoundness illustrated above.

This example shows that the simple rule for introducing polymorphism for let-bound variables, which suffices for purely functional languages, is not sound in a language with side effects like SML. Somehow the rule for typing let-bound variables has to be restricted to prevent this unsoundness.

SML '90 solved this problem by introducing a special, restricted form of polymorphism expressed through

imperative types. An imperative type is a polymorphic type whose bound type variable is of a special form, called an imperative type variable. Imperative type variables are distinguished by an initial underscore in the type variable name, for example`'_a`

.Notation:I'll use type variables names with lower-case names near the beginning of the alphabet like`'a`

,`'b`

and`'_a`

for implicitly generalized type variables in polymorphic types, while capitalized names near the end of the alphabet like`'X`

or`'_X`

will be used forfree, or nongeneralized, type variables. Such free type variables occur during the type checking process and are sometimes called typemetavariables. They are not allowed to occur in the top-level types produced by type checking.In the SML '90 scheme, imperative types are initially associated with certain primitive functions, such as

`ref`

, which has the imperative polymorphic type:ref: '_a -> '_a refThe imperative attribute of a type variable is propagated by substitution during type checking, so the expression`ref(fn x => x)`

has the typingref(fn x => x) : ('_X -> '_X) refwhere`'_X`

is a free imperative type variable inheriting its imperative nature from the`'_a`

in the type of`ref`

.The purpose of imperative type variables is to limit the introduction of polymorphism, i.e. to prevent certain type variables from being generalized in certain circumstances. In a value declaration like

val y = ref(fn x => x)we have seen that the type of the right hand side is('_X -> '_X) ref [1]We will only be allowed to generalize the imperative type variable`'_X`

and assign`y`

a polymorphic type if the defining expression on the right is in some sensesafe, meaning that it cannot create new ref cells that could behave polymorphically and cause the kind of unsoundness illustrated above. In this case, the right hand side clearly isnotsafe since it directly creates a ref cell. So in this case the type will not be generalized and`y`

will not have a polymorphic type. In fact, if the declaration is at top level,`y`

would not be given a type at all, because types like [1] containing free type variables such as`'_X`

are not allowed for top-level bindings.Determining exactly which expressions are safe, and may therefore have their types generalized without jeopardizing type soundness, is a very difficult problem. So we approximate safety with a simple syntactic notion, that of a

nonexpansiveexpression. Nonexpansive expressions areThe evaluation of expressions of any of these forms is trivial, and in particular cannot entail the creation of new data structures (other than the benign case of a function closure for function expressions). All other expressions, including function applications, let expressions, conditionals, etc., are by definition

- constants, including nullary constructors (e.g.
`3`

,`nil`

),- variables (e.g.
`x`

,`A.x`

), or- function expressions (e.g.
`fn x => x`

).expansive, implying that their evaluation may entail nontrivial computations thatmightlead to the creation of ref cells or other mutable data structures.So to summarize the rule used in SML '90: in a value declaration, we generalize those free type variables that do not occur free in the context environment, except that free

imperativetype variables can only be generalized (i.e. transformed into bound type variables of a polymorphic type) if the expression on the right hand side of the declaration is nonexpansive.Note:In SML/NJ, the notions of imperative type variables and nonexpansive expressions were refined into a system ofweaktype variables with different degrees of weakness indicated by a leading integer in the type variable name, e.g.`'1a, '2b`

. This scheme moderately extended the power of the type system by allowing the types of certain expansive expressions to be generalized (e.g. partial applications of polymorphic curried functions).One obvious drawback of both imperative types in SML '90 and the related weak types of SML/NJ (0.93) is that they require a special kind of type variable. The rules governing their behavior were also relatively subtle and hard to explain to novices (especially SML/NJ's weak types).

Another major problem is that imperative types often "infect" interfaces. A given signature might have two semantically equivalent implementations written in different styles, one purely functional and the other using imperative features. The types of the values in the functional implementation would be normal polymorphic types. However, even though their behavior is equivalent, the types of the corresponding values in the imperative implementation might be imperative polymorphic types. This happens because once a type variable acquires the imperative attribute through interaction with a ref cell, the attribute sticks and gets propagated through to the final type. For example, here are two equivalent implementations of the identity function:

fun id x = x (* pure *) idFun : 'a -> 'a fun id x = !(ref x) (* imperative *) idImp : '_a -> '_aA single signature that is meant to describe the common interface of both implementations must therefore specify the imperative type.signature S = sig val id : '_a -> '_a end;If the imperative implementation is introduced after the functional version already existed, it becomes necessary to go back and revise the earlier signature to change nonimperative polymorphic types to imperative types.Let's consider a slightly more subtle example demonstrating the need for restricted polymorphism. Under the naive typing rules, (without imperative types or other restrictions on type generalization), the following definitions result in the same polymorphic type being assigned to functions

`f1`

and`f2`

.fun f1 () = let val r = ref nil fun put x = r := x::nil fun get () = hd(!r) in (put, get) endAgain under the naive typing rules, we then haveval f1 = fn : unit -> ('a -> unit) * (unit -> 'a)fun f2 () : ('a -> unit) * (unit->'a) = ((fn x => ()), (fn () => raise Fail "f2"));val f2 = fn : unit -> ('a -> unit) * (unit -> 'a)val (g1,h1) = f1 ()Here the polymorphic types forval g1 = fn : 'a -> unit val h1 = fn : unit -> 'aval (g2,h2) = f2 ()val g2 = fn : 'a -> unit val h2 = fn : unit -> 'a`g1`

and`h1`

lead to trouble, because the the last value to which`g1`

is applied will be the value returned by the next call of`h1`

:g1 3; (* stores 3 in the shared ref *)Because their function bodies both refer to the local variableval it = () : unitnot(h1()); (* apply boolean operation to 3! *)val it = <> : bool `r`

,`g1`

and`h1`

are tied together by hidden shared state, and this is how calls of`g1`

and`h1`

communicate. If`g1`

and`h1`

have independent polymorphic types, the the value communicated between them can change types arbitrarily.On the other hand, the behavior of

`g2`

and`h2`

is benign:g2 3; (* 3 is just discarded *)becauseval it = () : unitnot(h2()); (* no problem, h1 does not return *)uncaught exception Fail: f2`g2`

and`h2`

are independent, and there is no hidden communication between them.We note two things about this example: (1) The polymorphic type that

`f1`

and`f2`

share does not mention the`ref`

type constructor, illustrating that there is no simple way to determine whether shared state and side effects are involved by looking at the type. (2) Typing iscompositional, meaning that we when we type declarations likeval (g1,h1) = f1 () val (g2,h2) = f2 ()we have only the previously determined types of the component elements (`f1, f2, ()`

) to work with. Since the types of`f1`

and`f2`

are the same, these two declarations must be typed the same way, regardless of what may be going on inside of`f1`

and`f2`

.In SML '90, the problem is solved by causing

`f1`

and`f2`

to have different types. The use of`ref`

in the definition of`f1`

introduces an imperative type variable, and the type checker assigns`f1`

the type:Note thatval f1 = fn : unit -> ('_a -> unit) * (unit -> '_a)`f1`

is still polymorphic because the declaration of`f1`

is equivalent toval f1 = (fn () => ...)so the right hand side of the declaration is a function expression and therefore nonexpansive.Note: I've oversimplified slightly here, since a`fun`

declaration is actually equivalent to a`val rec`

, but`val rec`

declarations are by convention always treated as nonexpansive, since the right hand sides must all be function expressions.Now when we type the suspect declaration

val (g1,h1) = f1 ()the expression`f1()`

on the right gets the type('_X -> unit) * (unit -> '_X)and since the right hand side is expansive, the imperative type variable`'_X`

will not be generalized. This will lead to an error (at top level), thus preventing the unsound use of`g1`

and`h1`

. The typing of`f2`

,`g2`

, and`h2`

proceeds as in the naive case, since no imperative type variables arise in those declarations.

## Value Polymorphism

In SML '97, we use a different technique to avoid unsoundness. We drop the distinction between imperative type variables and ordinary type variables, so once again there is only one flavor of type variable (ignoring equality type variables). This means that`f1`

and`f2`

have the same polymorphic type, as with the naive rule. We therefore cannot distinguish between dangerous (`f1()`

) and safe (`f2()`

) expressions based on type information alone. Also, without imperative type variables we can't distinguish which type variables are unsafe to generalize because they are involved in the type of a ref value. So we conservatively treat all function applications as dangerous and we do not allowanyof the type variables in their types to be generalized. In effect, we have decided to treatalltype variables as imperative. This new scheme is calledvalue polymorphism, and it is also known as thevalue restriction.As in SML '90, we approximate the class of safe-to-generalize expressions conservatively by the simple syntactic notion of a nonexpansive expression, also known as a

value expressionor simply avalue. SML '90 defined nonexpansive expressions to include the following atomic forms:It is clear that the evaluation of any of these forms is safe because it cannot entail the creation of new data structures (other than the benign case of creating function closures for function expressions). To partially compensate for the fact that we are being more restrictive in our creation of polymorphism, SML '97 enlarges the class of nonexpansive expressions to include some simple compound expressions:

- constants (e.g.
`3`

),- nullary constructors (e.g.
`nil`

),- variables (e.g.
`x`

,`A.x`

),- function expressions (e.g.
`(fn x => x)`

).All other expressions, including function applications, let expressions, conditionals, etc., are by definition

- records and tuples with nonexpansive fields (e.g.
`(3,x,nil)`

),- constructors (except
`ref`

) applied to nonexpansive arguments (e.g.`x::nil`

).expansive, implying that their evaluation may entail nontrivial computations thatmightlead to the creation of ref cells or other mutable data structures.The

value polymorphismrule for typing value declarations states that in a declaration such asvalthe types of the variables appearing in pat can be closed by generalizing those type variables appearing free in their types but not in the context if, andpat=exponly if, the expression on the right hand side is nonexpansive.Using the value restriction to control the interaction of polymorphism and side-effects had been considered during the design of Standard ML, but it was generally thought to sacrifice too much polymorphism. However, after several years of investigation of more elaborate schemes that attempted to weaken the restriction (like the weak types in SML/NJ), Andrew Wright undertook an empirical investigation of the consequences of the value restriction ("Simple Imperative Polymorphism", LISP and Symbolic Computation, 8, 343-355, 1995). He implemented the value restriction and recorded the changes that were necessary to convert a large body of existing code (the SML/NJ compiler and tools, HOL90 and Isabelle theorem provers, etc.). The conclusion was that surprisingly few changes to the code were needed, and these changes were mostly fairly simple and local. It turns out that the imperative types originally adopted were a good example of

premature optimizationin language design!Based on this revelation, we decided to adopt the value restriction in the revised SML '97 version of the language. This is one of the most evident and important changes in the language, and it is important to understand the consequences, and in particular how to modify old SML code to conform to the new restriction. We will illustrate below some typical situations where adjustments are required to conform to the value restriction.

## What does it mean in practice?

The value polymorphism scheme is more restrictive than the imperative type scheme of SML '90 and the weak types of SML/NJ 0.93, so when converting code from SML '90 to SML '97 you will occasionally need to rewrite code to satisfy the value restriction. In converting the 125K lines of SML code in the SML/NJ compiler, there were about 80 cases where code had to be modified for value polymorphism.

The consequences of a failure of the value restriction differ according to whether the declaration is at top level (in the interactive system or in the body of a structure), or local to a

`let`

expression or`local`

declaration. How top-level failures are treated is implementation dependent, but SML/NJ eliminates any nongeneralized free type variables by instantiating them to fresh dummy variables. For exampleval x = ref nil;The fresh dummy types that are generated for this purpose are named X1, X2, X3, etc. Since these dummy types are not the types of any values, the result of such a declaration is likely to be of little use. For instance, in the above example, the only value we will be able to assign to x is nil, and if we fetch the contents of x, the only thing we will be able to do with this (empty) list is calculate its length, append it to itself, etc.stdIn:6.1-6.8 Warning: type vars not generalized because ofvalue restriction are instantiated to dummy types (X1,X2,...)val x = ref [] : ?.X1 list refIf the declaration has local scope (as in a

`let`

expression or`local`

declaration), then all occurrences of the declared variable must have the same type. The free type variables left ungeneralized because of the value restriction may be instantiated by the context in which the declared variable is used, as inlet val x = ref nil (* x : 'X list ref *) in x := [3]; (* x : int list ref *) hd(!x) end;Here when the declaration ofval it = 3 : int`x`

is type checked,`x`

gets the type`'X list ref`

for some free type metavariable`'X`

, but the way`x`

is used in the body instantiates`'X`

to be`int`

.If the type variables are not instantiated locally, they may be propagated to outer scopes where they are eventually eliminated either by instantiation or generalization as in the following examples:

fun f y = let val x = ref nil in x end; (* generalized *)In the first of these examples the type variable in the type ofval f = fn : 'a -> 'b list reflet val y = let val x = ref nil in x end (* instantiated *) in y := [true]; if hd(!y) then 1 else 2 end;val it = 1 : int`x`

was generalized to`'b`

when the type of`f`

was generalized. Note that all "`fun`

" declarations automatically satisfy the value restriction, so type variables introduced when typing a "`fun`

" declaration can always be generalized. In the second example the ungeneralized (therefore free) type variable shared by the types of`x`

and`y`

is instantiated to`bool`

when the body of the`let`

-expression is type checked.What should you do when the value restriction applies and a formerly polymorphic declaration fails to be polymorphic? This depends on whether the polymorphism was actually exploited or not.

## Polymorphism not required

If the declaration is local, and the ungeneralized type variables do not escape (as in the second example above), then nothing need be done. In fact, one may not even realize that there was a value restriction failure in such a case unless one has turned on the message flag`Compiler.Control.valueRestrictionLocalWarn`

, which causes a warning message to be printed for value restriction failures in embedded declarations (i.e. declarations local to expressions or declarations).If the declaration was at top-level and only a monotype was required, a type constraint can be added to eliminate the ungeneralized (escaping) type variable:

val x : int list ref = ref nil;

## Polymorphism required

If polymorphism was required, and is no longer achieved because of the value restriction, then we need to modify the code. A very common case involves an expansive expression that computes a function, such as the partial application of a curried function to an incomplete list of arguments.fun f x y = y;In SML '90,val f = fn : 'a -> 'b -> 'bval g = f 3;stdIn:7.1-7.12 Warning: type vars not generalized because ofvalue restriction are instantiated to dummy types (X1,X2,...)val g = fn : ?.X1 -> ?.X1`g`

would have the type`'a -> 'a`

because there are no imperative type variables involved, but now in SML '97 the value restriction comes into play, forcing`g`

to have a monomorphic type (containing a dummy type`X1`

to boot!). We can fix the definition of`g`

to restore the polymorphic type by converting the expression`f 3`

into a value expression usingeta-expansion. Eta-expansion transforms an function valued expression`e`

into`(fn x => (e) x)`

(choosing the new variable`x`

to make sure that it is not free in`e`

). In this case, we eta-expand the definition of`g`

to:val g = (fn y => (f 3) y);or, equivalently:val g = fn : 'a -> 'afun g y = f 3 y;In this case, the eta-expanded definition ofval g = fn : 'a -> 'a`g`

is entirely equivalent to the original one. But in general we have to be careful, because the eta-expansion of an expression is not always semantically equivalent to that expression.One problem is termination. It may be that the expression defining

`g`

loops or raises an exception:fun f (x::l) = (fn y => y);With this eta-expanded definition ofstdIn:14.1-14.27 Warning: match nonexhaustivex :: l => ...val f = fn : 'a list -> 'b -> 'bval g = f nil;stdIn:15.1-15.14 Warning: type vars not generalized because ofvalue restriction are instantiated to dummy types (X1,X2,...)uncaught exception nonexhaustive match failureraised at: stdIn:14.16fun g y = f nil y;val g = fn : 'a -> 'a`g,`

the raising of the nonexhaustive match exception is delayed from when`g`

is defined to when`g`

is applied to an argument.Another situation where eta-expansion does not preserve semantics is where partial application causes side-effects, as the following example (borrowed from Andrew Wright) shows.

val counter = ref(ref 0);When applied to a function,val counter = ref (ref 0) : int ref reffun mkCountF f = let val x = ref 0 val f2 = fn z => (x := !x + 1; f z) in counter := x; (* call counter returned via global counter *) f2 end;val mkCountF = fn : ('a -> 'b) -> 'a -> 'b`mkCountF`

returns a new function that behaves like the argument except that a counter records the number of times it is called. The counter is returned via the global reference`counter`

. If we define a function by partially applying`mkCountF`

:fun pair x = (x,x);the functionval pair = fn : 'a -> 'a * 'aval cpair = mkCountF pair;val cpair = fn : ?.X1 -> ?.X1 * ?.X1val cpairCalls = !counter (* obtain the counter for cpair *)val cpairCalls = ref 0 : int ref`cpair`

is not polymorphic. Recovering polymorphism by eta-expansion changes the behavior:val cpair' = (fn y => mkCountF pair y)Nowval cpair' = fn : 'a -> 'a * 'a`cpair'`

is polymorphic, but no counter is produced when cpair' is defined. Instead, a new counter is produced every time cpair' is applied, which is not what we want.Another problem with eta-expansion is the potential for severe performance penalties.

fun id x = xHere the definition ofval id = fn : 'a -> 'afun g f = (heavy(); f)val g = fn : 'a -> 'aval id1 = g idval id1 = fn : ?.X1 -> ?.X1val id2 = (fn x => g id x)val id2 = fn : 'a -> 'a`id1`

causes`heavy`

to be called once, but`id1`

is not polymorphic, while the function`id2`

defined by eta-expansion is polymorphic, but`heavy`

is called every time`id2`

is applied. If`heavy`

is a very expensive function, this will probably not be acceptable.In the examples given here, the algorithmic differences between the original function and its eta-expanded version are obvious, but in real programs, the difference may be rather subtle and easily overlooked (as I found on at least one occasion when converting the SML/NJ compiler to obey the value restriction).

So eta-expansion is not a universal solution for restoring polymorphism to computed functions. When eta-expansion is not appropriate, one can sometimes rewrite the code more globally to work around the problem; e.g. by lifting a side-effect or expensive computation out of the function definition. In the very rare cases where this fails, and one can't achieve a polymorphic definition, the last resort is to duplicate code and have multiple definitions. At least one is no worse off than one would be with a language like Pascal that does not support polymorphism!

## Computed Polymorphic data

There are cases where in SML '90 one might use a non-value expression to compute a polymorphic data structure. For instance, in SML '90 the declaration`val x = rev []`

yields the polymorphic typing`x : 'a list`

. Since`rev []`

is not a value expression,`x`

will not get a polymorphic type in SML '97. If the polymorphic type of such a data structure declaration was needed, we cannot restore it by the technique of eta expansion that usually works for function declarations. As for computed function declarations where eta expansion is not appropriate, some global rewriting of the code may be required, including possibly duplicating the declaration. Fortunately this situation seems to rarely arise in practice.

## Treatment of local vs top level declarations

When the type of a variable fails to generalize in a local declaration because of the value restriction, the ungeneralized free type variables in its type may be instantiated later on when the variable is used in the body. In the following expression,let val r = ref(fn x => x) in !r 5 end;the local variable`r`

initially gets the type:r : ('X -> 'X) refThen when the body expression`!r 5`

is type checked`'X`

is unified with`int`

and thereby eliminated. Thus`r`

ends up with the type`int -> int`

and the whole let expression has type`int`

. Since`r`

has only one type in its scope, it cannot be used inconsistently, solet val r = ref(fn x => x) in !r 5; !r true end;will cause a type error in the expression`!r true`

(assuming left to right expression traversal by the type checker).It is also possible that a free type variable in a local variable's type will be generalized in a more global scope, as illustrated by the declaration

fun f () = let val r = ref(fn x => x) in !r end;where the free type variable`'X`

in the type`r : ('X -> 'X) ref`

gets eliminated by polymorphic generalization in the typing of the declaration of`f`

, with the resulting typing:val f : unit -> 'a -> 'aBut how do we deal with type variables that remain free in the type of top-level declarations because of the value restriction? We could just leave the free type variables in the type as we do for local declarations, but this would lead to inscrutable behavior at best or type unsoundness at worst, depending on how these free type variables are treated. In SML/NJ, our solution is to eliminate residual free type variables in top-level declarations by instantiating them to new dummy types. We also print a warning message indicating that type generalization failed.

val x = ref(fn x => x);The new dummy typestdIn:1.1-2.18 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...) val x = ref fn : (?.X1 -> ?.X1) ref`X1`

introduced for this purpose will not match the type of any value, so there is very little one can do with the value of`x`

. Nevertheless, we consider it to be more informative to produce a dummy typing rather than a type error. There are also situations where the dummy typing doesn't matter, such asOS.Process.exit(OS.Process.failure);wherestdIn:1.1-19.12 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...)OS.Process.exit: OS.Process.status -> 'aIn this case, having the value restriction generate a static error would prevent the execution of the expression, thus preventing the exit command from taking effect.The warning message for top-level value restriction violations can be suppressed by setting the appropriate compiler control flag to false (the default value is true):

Compiler.Control.valueRestrictionTopWarn := false;On the other hand, if you also want to see warning messages when local declarations fail to generalize because of the value restriction, you can set another flag to true:Compiler.Control.valueRestrictionLocalWarn := true;However, these additional warning messages for local declarations are not very useful in our experience.

## Old imperative or weak type variable notation in SML '97

How are the old "imperative" (`'_a`

) or "weak" (`'1a`

) type variables treated? The special annotation of an underscore or digit following the apostrophe has no significance in SML '97, so these will be treated as ordinary type variables; the value restriction means that all type variables are effectively treated like imperative type variables in SML '90. However, it is a good idea to remove these annotations from your SML '97 code to avoid possible confusion.

## Nonrefutable pattern restriction

A declaration such as:val [x] = nil::nilis known as a refutable pattern binding since an exception can potentially be raised depending on the value of the right hand side. In SML/NJ, the value restriction rule has been tightened to disallow generalization for such bindings, even though the defining expression for`x`

can be statically determined to be a value.This is a minor deviation from the SML '97 definition. The main justification is the simplicity it provides in the type-theoretic interpretation. The potential raising of the Bind exception can be regarded as a side-effect, and a strict version of the value restriction will suppress type generalization in these cases too.

Formally, one can associate polymorphic generalization with the introduction of an explicit type abstraction. The type checker would implicitly transform the declaration

val I = fn x => xinto the explicitly typed versionval I = (FN 'a => fn (x: 'a) => x) : (All 'a). 'a -> 'awhere "`FN 'a`

" is an explicit type abstraction. If such an explicit type abstraction issuspending(i.e. suspends evaluation of its body like a normal variable abstraction) then the abstracted expression should have no computational effect, including the potential of raising an exception. Considerval [x] = nilwhich is equivalent toval x = hd(nil) [1]which could translate into the explicit formval x = (FN 'a => hd(nil)) [2]If the type abstraction is suspending, then versions [1] and [2] are not equivalent, because [1] raises the Empty exception while [2] does not. There are technical advantages for this interpretation, which corresponds closely with the typed internal language used in SML/NJ (FLINT), so we have adopted this stricter version of the value restriction.## Nonreturning expressions

It would be possible (that is, we believe it would be sound) to weaken the value restriction by allowing generalization of types that consist of a single free type variable. For instance, in the declaration

val x = raise Fail "foo";the type of the right hand side is`'X`

(where`'X`

again represents a free type variable). This can safely be generalized to the polymorphic type`(All 'a). 'a`

, because the expression does not return a value, and hence the variable`x`

is not bound to a value and will not be used. This would also apply to applications of functions likefun loop () = loop ()or OS.Process.exit of type`(All 'a).`

for some domain typet-> 'a`.`

tWe do not admit this weakening of the value restriction in SML '97, partly because it is not consistent with the suspending type abstraction model discussed in the previous point.

SML '97 supports multiple precisions of integers (`Int31.int`

,`Int32.int`

) and words (`Word8.word`

,`Word31.word`

,`Word32.word`

). It is convenient to overload integer literals (e.g.`3`

), and word literals (e.g.`0w3`

, so SML '97 supports this.3;As this illustrates, if the context does not explicitly or implicitly determine the type of a literal, it is assigned a default type:val it = 3 : int (* int = Int.int = Int31.int *)3 : Int31.int;val it = 3 : int3 : Int32.int;val it = 3 : Int32.int0w3;val it = 0wx3 : word (* word = Word.word = Word31.word *)0w3 : Word8.word;val it = 0wx3 : Word8.word0w3 : Word31.word;val it = 0wx3 : word0w3 : Word32.word;val it = 0wx3 : Word32.word0w3 :Word.word;val it = 0wx3 : word`Int.int`

for integer literals and`Word.word`

for word literals.Although SML/NJ currently only implements one precision of floating point numbers (64 bits), the Basis definition allows for multiple floating point precisions. When multiple floating point precisions are supported in an implementations, floating point literals would also be overloaded.

Like SML '90, SML '97 provides a fixed set of overloaded functions, including arithmetic operators such as`+`

and relational operators such as`>`

. In SML '90, the context has to provide enough type information to resolve the overloaded operator (i.e. to assign it exactly one of the possible types that the operator is capable of assuming). If the overloaded operator is not resolved, an SML '90 compiler generates an error message:fun f x = x + x;In SML '97, an overloaded function that is not resolved by its context is assigned a default type (normally the integer version of the operator).std_in:10.13 Error: overloaded variable cannot be resolved: +fun f x = x + x;If you don't want the default type, you must override it by adding explicit type constraints, as in:val f = fn : int -> intfun f (x: real) = x + x;val f = fn : real -> real

When a programmer uses a type variable in a type constraint in an expression or declaration, there is a natural, but fairly complicated, rule that determines where that type variable is implicitly bound.For instance, in the code

val x = (let val id : 'a -> 'a = fn z => z in Id Id end, fn y => y);the type variable`'a`

used in the type of`id`

is implicitly bound at the declaration`val id`

, so`id`

has a polymorphic type. However, if we add another occurrence of`'a`

, as inval x = (let val id : 'a -> 'a = fn z => z in id id end, fn (y: 'a) => y);now`'a`

is implicitly bound at the`val x`

declaration. This means that`id`

is no longer polymorphic, because`'a`

acts like a monomorphic type identifier within its binding scope, and so the later declaration will not type check.Since the application of the scoping rule for such explicit type variables is a bit tricky, it would be preferable to be able to explicitly determine where the type variable is bound. SML '97 provides a notation for this. Here are some examples:

fun 'a id(z: 'a) = z val 'a id : 'a -> 'a = fn z => z fun ('a, 'b) pair (x: 'a) (y: 'b) = (x,y)So using this notation, the first example above becomesval x = (let val 'a id : 'a -> 'a = fn z => z in Id Id end, fn y => y);while the second, which still doesn't type check, becomesval 'a x = (let val id : 'a -> 'a = fn z => z in id id end, fn (y: 'a) => y);Note:There is a discrepancy between the syntax supported by SML/NJ and that of the Definition in the case of explicit type variable bindings in`val rec`

declarations. SML/NJ supports the notationval rec 'a f = (fn x: 'a => x);while the Definition would haveval 'a rec f = (fn x: 'a => x);

In the SML '90 formal definition, there were deficiencies in the way that new typenames(i.e. internal type identifiers) were managed. These deficiencies would allow a generative type name to be reused for another type while values of the original type were still in the environment. Thus according to the Definition, an expression such aslet datatype A = C of bool in fn (C x) = not x (* : A -> bool *) end let datatype B = C of int in C 2 (* : B *) endcould be successfully type checked because the internal type name for`A`

could be reused for`B`

, meaning that types`A`

and`B`

would agree and the application would be accepted. This is unsound of course, because it is equivalent to`not(2)`

.This problem is an artifact of the way the formal definition was designed; compilers (e.g. SML/NJ) typically implement generative type declarations (datatypes and abstypes) by ensuring that unique names are assigned to each declared type, and they never recycle these names. Since the SML '97 Definition is still vulnerable to this problem, the language has been restricted to ensure soundness of the Definition. The language restrictions ensure that a locally declared datatype cannot escape from the static scope of its declaration. Thus in a

`let`

expressionletthe type of the let expression, which is the type ofdecinexpend`exp`

, cannot contain any generative types declared in`dec`

. This means that an expression likelet datatype t = C in C endshould not type check in SML '97.Another way that a local generative type could escape from its scope is through type unification, as in the following expression.

fn x => let datatype t = C val _ = if true then x else C in 5 endBecause of the conditional expression,`x`

gets the type`t`

, and the entire function expression gets the type`t -> int`

, even though it extends outside the scope of the declaration of`t`

. So this sort of expression is also not allowed in SML '97.

SML/NJ Deviation:Currently, in SML/NJ both the above expressions still type check, so the new restriction is not currently enforced. As we noted before, this relates to an unsoundness in the Definition that does not affect implementations (unless they adhere too closely to the Definition's technique for managing unique type names!).

In SML '90, there was no direct way for one module to inherit a datatypeas a datatypefrom another module. For instance, suppose I have a datatype`t`

in a structure`A`

:structure A = struct datatype t = C | D of t endI can define a new structure`B`

that contains an alias of`t`

as follows:structure B = struct type t = A.t endbut then`B`

does not provide access to the data constructors of`t`

. If I define`B`

asstructure B = struct type t = A.t val C = A.C val D = A.D endI gain access to the constructors`C`

and`D`

as values, but I still can't use them in patterns.In SML '90, you can work around this problem by using the

`open`

declaration:structure B = struct open A endand if`A`

has additional components that`B`

is not supposed to export you can use a signature constraint on`B`

to filter them out. This is not an ideal solution, because it unnecessarily pollutes the name space within the body of`B`

with all the component names in`A`

.To support the sharing of datatypes between modules directly, without the use of

`open`

, SML '97 introduces a new form of datatype declaration: thedatatype replicationdeclaration. Its syntax iswhere (as the the Definition),datatypetycon=datatypelongtycon`is an identifier naming a type constructor (in the`

tyconTyConname space), and`is a symbolic path (e.g.`

longtycon`A.t`

) ending in a type constructor name. A datatype replication makes the`be an alias for the datatype named by`

tycon`, and in addition implicitly declares the data constructor names for`

longtycon`. Thus`

longtyconstructure B = struct datatype t = datatype A.t endis equivalent to the version of the declaration of`B`

using`, so`

open`B.C`

and`B.D`

are defined as constructors of`B.t`

and can be used in patterns.A common error with datatype replication declarations is to leave out the

`keyword on the right hand side, as in`

datatypestructure B = struct datatype t = A.t endThis causes`A.t`

to be treated as a constructor declaration, and an error results because a qualified name is illegal in this case. But if I had writtenstructure B = struct open A datatype t = t endThere would be no complaint from the compiler, because the`t`

on the right hand side would be treated as a valid data constructor name having nothing to do with the datatype`t`

in`A`

.As we will see in the chapter on modules, there is an analogous datatype replication

specificationfor use in signatures.

`real`

not an equality typeSML '97 uses the IEEE semantics for floating point operations. Equality for floating point numbers in the IEEE standard does not behave the way equality is expected to behave in SML. For instance,`x = x`

is not always true for floating point numbers, because`nan = nan`

is false.The function

`Real.==`

implements IEEE equality for reals, but because of its unusual behavior we no longer treat`Real.real`

as an equality type. For consistency, we also reject real literals as constants in patterns. This means that an expression likex = 5.6has to be rewritten asReal.==(x,5.6),and a case expression likecase x of 5.6 => 0 | 3.2 => 1 | ~1.7 => 2 | _ => 3has to be rewritten using conditionals:if Real.==(x,5.6) then 0 else if Real.==(x,3.2) then 1 else if Real.==(x,~1.7) then 2 else 3

Dave MacQueen Last modified: Tue Apr 4 15:59:39 EDT 2000