BackgroundThe interaction between polymorphism and side-effects has always been a troublesome problem for ML. Here is an example of a program illustrating the danger:val r = ref(fn x => x); r := (fn x => x+1); !r true;
Under the basic rules of ML type inference (Hindley-Milner type inference), we would give the local variable
rthe polymorphic typer: ('a -> 'a) refwhere the type variable
'ahere is implicitly quantified, meaning the type might be more explicitly written (in an extended SML type notation) as:r: (All 'a).('a -> 'a) refWe will use this explicit quantifier notation when we want to be particularly clear that we are dealing with a polymorphic type.
Given that r has this polymorphic type, the occurrence of
rin the assignment expression would have the type
(int -> int) ref, while in the third line
rwould have the type
(bool -> bool) ref. This typing of the program is clearly unsound, since it allows a function on integers to be applied to a boolean value.
What is going on here is that a mutable value (here a ref cell
r) is given a polymorphic type. This allows the mutable value's type to be instantiated differently at the points where the contents are updated and where they are fetched, with the consequence that the value stored and retrieved changes type in the process. The ref cell is acting as a communication channel between two points in the program that do not agree on the type of the value being communicated.
Besides ref cells and arrays, another mechanism that can serve as communication channel between statically uncoordinated parts of a program are first-class continuations. These can give rise to the same sort of type unsoundness illustrated above.
This example shows that the simple rule for introducing polymorphism for let-bound variables, which suffices for purely functional languages, is not sound in a language with side effects like SML. Somehow the rule for typing let-bound variables has to be restricted to prevent this unsoundness.
SML '90 solved this problem by introducing a special, restricted form of polymorphism expressed through imperative types. An imperative type is a polymorphic type whose bound type variable is of a special form, called an imperative type variable. Imperative type variables are distinguished by an initial underscore in the type variable name, for example
'_a.Notation: I'll use type variables names with lower-case names near the beginning of the alphabet like
'_afor implicitly generalized type variables in polymorphic types, while capitalized names near the end of the alphabet like
'_Xwill be used for free, or nongeneralized, type variables. Such free type variables occur during the type checking process and are sometimes called type metavariables. They are not allowed to occur in the top-level types produced by type checking.
In the SML '90 scheme, imperative types are initially associated with certain primitive functions, such as
ref, which has the imperative polymorphic type:ref: '_a -> '_a refThe imperative attribute of a type variable is propagated by substitution during type checking, so the expression
ref(fn x => x)has the typingref(fn x => x) : ('_X -> '_X) refwhere
'_Xis a free imperative type variable inheriting its imperative nature from the
'_ain the type of
The purpose of imperative type variables is to limit the introduction of polymorphism, i.e. to prevent certain type variables from being generalized in certain circumstances. In a value declaration likeval y = ref(fn x => x)we have seen that the type of the right hand side is('_X -> '_X) ref We will only be allowed to generalize the imperative type variable
ya polymorphic type if the defining expression on the right is in some sense safe, meaning that it cannot create new ref cells that could behave polymorphically and cause the kind of unsoundness illustrated above. In this case, the right hand side clearly is not safe since it directly creates a ref cell. So in this case the type will not be generalized and
ywill not have a polymorphic type. In fact, if the declaration is at top level,
ywould not be given a type at all, because types like  containing free type variables such as
'_Xare not allowed for top-level bindings.
Determining exactly which expressions are safe, and may therefore have their types generalized without jeopardizing type soundness, is a very difficult problem. So we approximate safety with a simple syntactic notion, that of a nonexpansive expression. Nonexpansive expressions are
The evaluation of expressions of any of these forms is trivial, and in particular cannot entail the creation of new data structures (other than the benign case of a function closure for function expressions). All other expressions, including function applications, let expressions, conditionals, etc., are by definition expansive, implying that their evaluation may entail nontrivial computations that might lead to the creation of ref cells or other mutable data structures.
- constants, including nullary constructors (e.g.
- variables (e.g.
- function expressions (e.g.
fn x => x).
So to summarize the rule used in SML '90: in a value declaration, we generalize those free type variables that do not occur free in the context environment, except that free imperative type variables can only be generalized (i.e. transformed into bound type variables of a polymorphic type) if the expression on the right hand side of the declaration is nonexpansive.Note: In SML/NJ, the notions of imperative type variables and nonexpansive expressions were refined into a system of weak type variables with different degrees of weakness indicated by a leading integer in the type variable name, e.g.
'1a, '2b. This scheme moderately extended the power of the type system by allowing the types of certain expansive expressions to be generalized (e.g. partial applications of polymorphic curried functions).
One obvious drawback of both imperative types in SML '90 and the related weak types of SML/NJ (0.93) is that they require a special kind of type variable. The rules governing their behavior were also relatively subtle and hard to explain to novices (especially SML/NJ's weak types).
Another major problem is that imperative types often "infect" interfaces. A given signature might have two semantically equivalent implementations written in different styles, one purely functional and the other using imperative features. The types of the values in the functional implementation would be normal polymorphic types. However, even though their behavior is equivalent, the types of the corresponding values in the imperative implementation might be imperative polymorphic types. This happens because once a type variable acquires the imperative attribute through interaction with a ref cell, the attribute sticks and gets propagated through to the final type. For example, here are two equivalent implementations of the identity function:fun id x = x (* pure *) idFun : 'a -> 'a fun id x = !(ref x) (* imperative *) idImp : '_a -> '_aA single signature that is meant to describe the common interface of both implementations must therefore specify the imperative type.signature S = sig val id : '_a -> '_a end;If the imperative implementation is introduced after the functional version already existed, it becomes necessary to go back and revise the earlier signature to change nonimperative polymorphic types to imperative types.
Let's consider a slightly more subtle example demonstrating the need for restricted polymorphism. Under the naive typing rules, (without imperative types or other restrictions on type generalization), the following definitions result in the same polymorphic type being assigned to functions
f2.fun f1 () = let val r = ref nil fun put x = r := x::nil fun get () = hd(!r) in (put, get) end val f1 = fn : unit -> ('a -> unit) * (unit -> 'a) fun f2 () : ('a -> unit) * (unit->'a) = ((fn x => ()), (fn () => raise Fail "f2")); val f2 = fn : unit -> ('a -> unit) * (unit -> 'a)Again under the naive typing rules, we then haveval (g1,h1) = f1 () val g1 = fn : 'a -> unit val h1 = fn : unit -> 'a val (g2,h2) = f2 () val g2 = fn : 'a -> unit val h2 = fn : unit -> 'aHere the polymorphic types for
h1lead to trouble, because the the last value to which
g1is applied will be the value returned by the next call of
h1:g1 3; (* stores 3 in the shared ref *) val it = () : unit not(h1()); (* apply boolean operation to 3! *) val it = <Because their function bodies both refer to the local variable
> : bool
h1are tied together by hidden shared state, and this is how calls of
h1have independent polymorphic types, the the value communicated between them can change types arbitrarily.
On the other hand, the behavior of
h2is benign:g2 3; (* 3 is just discarded *) val it = () : unit not(h2()); (* no problem, h1 does not return *) uncaught exception Fail: f2because
h2are independent, and there is no hidden communication between them.
We note two things about this example: (1) The polymorphic type that
f2share does not mention the
reftype constructor, illustrating that there is no simple way to determine whether shared state and side effects are involved by looking at the type. (2) Typing is compositional, meaning that we when we type declarations likeval (g1,h1) = f1 () val (g2,h2) = f2 ()we have only the previously determined types of the component elements (
f1, f2, ()) to work with. Since the types of
f2are the same, these two declarations must be typed the same way, regardless of what may be going on inside of
In SML '90, the problem is solved by causing
f2to have different types. The use of
refin the definition of
f1introduces an imperative type variable, and the type checker assigns
f1the type:val f1 = fn : unit -> ('_a -> unit) * (unit -> '_a)Note that
f1is still polymorphic because the declaration of
f1is equivalent toval f1 = (fn () => ...)so the right hand side of the declaration is a function expression and therefore nonexpansive.Note: I've oversimplified slightly here, since a
fundeclaration is actually equivalent to a
val rec, but
val recdeclarations are by convention always treated as nonexpansive, since the right hand sides must all be function expressions.
Now when we type the suspect declarationval (g1,h1) = f1 ()the expression
f1()on the right gets the type('_X -> unit) * (unit -> '_X)and since the right hand side is expansive, the imperative type variable
'_Xwill not be generalized. This will lead to an error (at top level), thus preventing the unsound use of
h1. The typing of
h2proceeds as in the naive case, since no imperative type variables arise in those declarations.
Value PolymorphismIn SML '97, we use a different technique to avoid unsoundness. We drop the distinction between imperative type variables and ordinary type variables, so once again there is only one flavor of type variable (ignoring equality type variables). This means that
f2have the same polymorphic type, as with the naive rule. We therefore cannot distinguish between dangerous (
f1()) and safe (
f2()) expressions based on type information alone. Also, without imperative type variables we can't distinguish which type variables are unsafe to generalize because they are involved in the type of a ref value. So we conservatively treat all function applications as dangerous and we do not allow any of the type variables in their types to be generalized. In effect, we have decided to treat all type variables as imperative. This new scheme is called value polymorphism, and it is also known as the value restriction.
As in SML '90, we approximate the class of safe-to-generalize expressions conservatively by the simple syntactic notion of a nonexpansive expression, also known as a value expression or simply a value. SML '90 defined nonexpansive expressions to include the following atomic forms:
It is clear that the evaluation of any of these forms is safe because it cannot entail the creation of new data structures (other than the benign case of creating function closures for function expressions). To partially compensate for the fact that we are being more restrictive in our creation of polymorphism, SML '97 enlarges the class of nonexpansive expressions to include some simple compound expressions:
- constants (e.g.
- nullary constructors (e.g.
- variables (e.g.
- function expressions (e.g.
(fn x => x)).
All other expressions, including function applications, let expressions, conditionals, etc., are by definition expansive, implying that their evaluation may entail nontrivial computations that might lead to the creation of ref cells or other mutable data structures.
- records and tuples with nonexpansive fields (e.g.
- constructors (except
ref) applied to nonexpansive arguments (e.g.
The value polymorphism rule for typing value declarations states that in a declaration such asval pat = expthe types of the variables appearing in pat can be closed by generalizing those type variables appearing free in their types but not in the context if, and only if, the expression on the right hand side is nonexpansive.
Using the value restriction to control the interaction of polymorphism and side-effects had been considered during the design of Standard ML, but it was generally thought to sacrifice too much polymorphism. However, after several years of investigation of more elaborate schemes that attempted to weaken the restriction (like the weak types in SML/NJ), Andrew Wright undertook an empirical investigation of the consequences of the value restriction ("Simple Imperative Polymorphism", LISP and Symbolic Computation, 8, 343-355, 1995). He implemented the value restriction and recorded the changes that were necessary to convert a large body of existing code (the SML/NJ compiler and tools, HOL90 and Isabelle theorem provers, etc.). The conclusion was that surprisingly few changes to the code were needed, and these changes were mostly fairly simple and local. It turns out that the imperative types originally adopted were a good example of premature optimization in language design!
Based on this revelation, we decided to adopt the value restriction in the revised SML '97 version of the language. This is one of the most evident and important changes in the language, and it is important to understand the consequences, and in particular how to modify old SML code to conform to the new restriction. We will illustrate below some typical situations where adjustments are required to conform to the value restriction.
What does it mean in practice?
The value polymorphism scheme is more restrictive than the imperative type scheme of SML '90 and the weak types of SML/NJ 0.93, so when converting code from SML '90 to SML '97 you will occasionally need to rewrite code to satisfy the value restriction. In converting the 125K lines of SML code in the SML/NJ compiler, there were about 80 cases where code had to be modified for value polymorphism.
The consequences of a failure of the value restriction differ according to whether the declaration is at top level (in the interactive system or in the body of a structure), or local to a
localdeclaration. How top-level failures are treated is implementation dependent, but SML/NJ eliminates any nongeneralized free type variables by instantiating them to fresh dummy variables. For exampleval x = ref nil; stdIn:6.1-6.8 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...) val x = ref  : ?.X1 list refThe fresh dummy types that are generated for this purpose are named X1, X2, X3, etc. Since these dummy types are not the types of any values, the result of such a declaration is likely to be of little use. For instance, in the above example, the only value we will be able to assign to x is nil, and if we fetch the contents of x, the only thing we will be able to do with this (empty) list is calculate its length, append it to itself, etc.
If the declaration has local scope (as in a
localdeclaration), then all occurrences of the declared variable must have the same type. The free type variables left ungeneralized because of the value restriction may be instantiated by the context in which the declared variable is used, as inlet val x = ref nil (* x : 'X list ref *) in x := ; (* x : int list ref *) hd(!x) end; val it = 3 : intHere when the declaration of
xis type checked,
xgets the type
'X list reffor some free type metavariable
'X, but the way
xis used in the body instantiates
If the type variables are not instantiated locally, they may be propagated to outer scopes where they are eventually eliminated either by instantiation or generalization as in the following examples:fun f y = let val x = ref nil in x end; (* generalized *) val f = fn : 'a -> 'b list ref let val y = let val x = ref nil in x end (* instantiated *) in y := [true]; if hd(!y) then 1 else 2 end; val it = 1 : intIn the first of these examples the type variable in the type of
xwas generalized to
'bwhen the type of
fwas generalized. Note that all "
fun" declarations automatically satisfy the value restriction, so type variables introduced when typing a "
fun" declaration can always be generalized. In the second example the ungeneralized (therefore free) type variable shared by the types of
yis instantiated to
boolwhen the body of the
let-expression is type checked.
What should you do when the value restriction applies and a formerly polymorphic declaration fails to be polymorphic? This depends on whether the polymorphism was actually exploited or not.
Polymorphism not requiredIf the declaration is local, and the ungeneralized type variables do not escape (as in the second example above), then nothing need be done. In fact, one may not even realize that there was a value restriction failure in such a case unless one has turned on the message flag
Compiler.Control.valueRestrictionLocalWarn, which causes a warning message to be printed for value restriction failures in embedded declarations (i.e. declarations local to expressions or declarations).
If the declaration was at top-level and only a monotype was required, a type constraint can be added to eliminate the ungeneralized (escaping) type variable:val x : int list ref = ref nil;
Polymorphism requiredIf polymorphism was required, and is no longer achieved because of the value restriction, then we need to modify the code. A very common case involves an expansive expression that computes a function, such as the partial application of a curried function to an incomplete list of arguments.fun f x y = y; val f = fn : 'a -> 'b -> 'b val g = f 3; stdIn:7.1-7.12 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...) val g = fn : ?.X1 -> ?.X1In SML '90,
gwould have the type
'a -> 'abecause there are no imperative type variables involved, but now in SML '97 the value restriction comes into play, forcing
gto have a monomorphic type (containing a dummy type
X1to boot!). We can fix the definition of
gto restore the polymorphic type by converting the expression
f 3into a value expression using eta-expansion. Eta-expansion transforms an function valued expression
(fn x => (e) x)(choosing the new variable
xto make sure that it is not free in
e). In this case, we eta-expand the definition of
gto:val g = (fn y => (f 3) y); val g = fn : 'a -> 'aor, equivalently:fun g y = f 3 y; val g = fn : 'a -> 'aIn this case, the eta-expanded definition of
gis entirely equivalent to the original one. But in general we have to be careful, because the eta-expansion of an expression is not always semantically equivalent to that expression.
One problem is termination. It may be that the expression defining
gloops or raises an exception:fun f (x::l) = (fn y => y); stdIn:14.1-14.27 Warning: match nonexhaustive x :: l => ... val f = fn : 'a list -> 'b -> 'b val g = f nil; stdIn:15.1-15.14 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...) uncaught exception nonexhaustive match failure raised at: stdIn:14.16 fun g y = f nil y; val g = fn : 'a -> 'aWith this eta-expanded definition of
g,the raising of the nonexhaustive match exception is delayed from when
gis defined to when
gis applied to an argument.
Another situation where eta-expansion does not preserve semantics is where partial application causes side-effects, as the following example (borrowed from Andrew Wright) shows.val counter = ref(ref 0); val counter = ref (ref 0) : int ref ref fun mkCountF f = let val x = ref 0 val f2 = fn z => (x := !x + 1; f z) in counter := x; (* call counter returned via global counter *) f2 end; val mkCountF = fn : ('a -> 'b) -> 'a -> 'bWhen applied to a function,
mkCountFreturns a new function that behaves like the argument except that a counter records the number of times it is called. The counter is returned via the global reference
counter. If we define a function by partially applying
mkCountF:fun pair x = (x,x); val pair = fn : 'a -> 'a * 'a val cpair = mkCountF pair; val cpair = fn : ?.X1 -> ?.X1 * ?.X1 val cpairCalls = !counter (* obtain the counter for cpair *) val cpairCalls = ref 0 : int refthe function
cpairis not polymorphic. Recovering polymorphism by eta-expansion changes the behavior:val cpair' = (fn y => mkCountF pair y) val cpair' = fn : 'a -> 'a * 'aNow
cpair'is polymorphic, but no counter is produced when cpair' is defined. Instead, a new counter is produced every time cpair' is applied, which is not what we want.
Another problem with eta-expansion is the potential for severe performance penalties.fun id x = x val id = fn : 'a -> 'a fun g f = (heavy(); f) val g = fn : 'a -> 'a val id1 = g id val id1 = fn : ?.X1 -> ?.X1 val id2 = (fn x => g id x) val id2 = fn : 'a -> 'aHere the definition of
heavyto be called once, but
id1is not polymorphic, while the function
id2defined by eta-expansion is polymorphic, but
heavyis called every time
id2is applied. If
heavyis a very expensive function, this will probably not be acceptable.
In the examples given here, the algorithmic differences between the original function and its eta-expanded version are obvious, but in real programs, the difference may be rather subtle and easily overlooked (as I found on at least one occasion when converting the SML/NJ compiler to obey the value restriction).
So eta-expansion is not a universal solution for restoring polymorphism to computed functions. When eta-expansion is not appropriate, one can sometimes rewrite the code more globally to work around the problem; e.g. by lifting a side-effect or expensive computation out of the function definition. In the very rare cases where this fails, and one can't achieve a polymorphic definition, the last resort is to duplicate code and have multiple definitions. At least one is no worse off than one would be with a language like Pascal that does not support polymorphism!
Computed Polymorphic dataThere are cases where in SML '90 one might use a non-value expression to compute a polymorphic data structure. For instance, in SML '90 the declaration
val x = rev yields the polymorphic typing
x : 'a list. Since
rev is not a value expression,
xwill not get a polymorphic type in SML '97. If the polymorphic type of such a data structure declaration was needed, we cannot restore it by the technique of eta expansion that usually works for function declarations. As for computed function declarations where eta expansion is not appropriate, some global rewriting of the code may be required, including possibly duplicating the declaration. Fortunately this situation seems to rarely arise in practice.
Treatment of local vs top level declarationsWhen the type of a variable fails to generalize in a local declaration because of the value restriction, the ungeneralized free type variables in its type may be instantiated later on when the variable is used in the body. In the following expression,let val r = ref(fn x => x) in !r 5 end;the local variable
rinitially gets the type:r : ('X -> 'X) refThen when the body expression
!r 5is type checked
'Xis unified with
intand thereby eliminated. Thus
rends up with the type
int -> intand the whole let expression has type
rhas only one type in its scope, it cannot be used inconsistently, solet val r = ref(fn x => x) in !r 5; !r true end;will cause a type error in the expression
!r true(assuming left to right expression traversal by the type checker).
It is also possible that a free type variable in a local variable's type will be generalized in a more global scope, as illustrated by the declarationfun f () = let val r = ref(fn x => x) in !r end;where the free type variable
'Xin the type
r : ('X -> 'X) refgets eliminated by polymorphic generalization in the typing of the declaration of
f, with the resulting typing:val f : unit -> 'a -> 'a
But how do we deal with type variables that remain free in the type of top-level declarations because of the value restriction? We could just leave the free type variables in the type as we do for local declarations, but this would lead to inscrutable behavior at best or type unsoundness at worst, depending on how these free type variables are treated. In SML/NJ, our solution is to eliminate residual free type variables in top-level declarations by instantiating them to new dummy types. We also print a warning message indicating that type generalization failed.val x = ref(fn x => x); stdIn:1.1-2.18 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...) val x = ref fn : (?.X1 -> ?.X1) refThe new dummy type
X1introduced for this purpose will not match the type of any value, so there is very little one can do with the value of
x. Nevertheless, we consider it to be more informative to produce a dummy typing rather than a type error. There are also situations where the dummy typing doesn't matter, such asOS.Process.exit(OS.Process.failure); stdIn:1.1-19.12 Warning: type vars not generalized because of value restriction are instantiated to dummy types (X1,X2,...)whereOS.Process.exit: OS.Process.status -> 'aIn this case, having the value restriction generate a static error would prevent the execution of the expression, thus preventing the exit command from taking effect.
The warning message for top-level value restriction violations can be suppressed by setting the appropriate compiler control flag to false (the default value is true):Compiler.Control.valueRestrictionTopWarn := false;On the other hand, if you also want to see warning messages when local declarations fail to generalize because of the value restriction, you can set another flag to true:Compiler.Control.valueRestrictionLocalWarn := true;However, these additional warning messages for local declarations are not very useful in our experience.
Old imperative or weak type variable notation in SML '97How are the old "imperative" (
'_a) or "weak" (
'1a) type variables treated? The special annotation of an underscore or digit following the apostrophe has no significance in SML '97, so these will be treated as ordinary type variables; the value restriction means that all type variables are effectively treated like imperative type variables in SML '90. However, it is a good idea to remove these annotations from your SML '97 code to avoid possible confusion.
Nonrefutable pattern restrictionA declaration such as:val [x] = nil::nilis known as a refutable pattern binding since an exception can potentially be raised depending on the value of the right hand side. In SML/NJ, the value restriction rule has been tightened to disallow generalization for such bindings, even though the defining expression for
xcan be statically determined to be a value.
This is a minor deviation from the SML '97 definition. The main justification is the simplicity it provides in the type-theoretic interpretation. The potential raising of the Bind exception can be regarded as a side-effect, and a strict version of the value restriction will suppress type generalization in these cases too.
Formally, one can associate polymorphic generalization with the introduction of an explicit type abstraction. The type checker would implicitly transform the declarationval I = fn x => xinto the explicitly typed versionval I = (FN 'a => fn (x: 'a) => x) : (All 'a). 'a -> 'awhere "
FN 'a" is an explicit type abstraction. If such an explicit type abstraction is suspending (i.e. suspends evaluation of its body like a normal variable abstraction) then the abstracted expression should have no computational effect, including the potential of raising an exception. Considerval [x] = nilwhich is equivalent toval x = hd(nil) which could translate into the explicit formval x = (FN 'a => hd(nil)) If the type abstraction is suspending, then versions  and  are not equivalent, because  raises the Empty exception while  does not. There are technical advantages for this interpretation, which corresponds closely with the typed internal language used in SML/NJ (FLINT), so we have adopted this stricter version of the value restriction.
It would be possible (that is, we believe it would be sound) to weaken the value restriction by allowing generalization of types that consist of a single free type variable. For instance, in the declarationval x = raise Fail "foo";the type of the right hand side is
'Xagain represents a free type variable). This can safely be generalized to the polymorphic type
(All 'a). 'a, because the expression does not return a value, and hence the variable
xis not bound to a value and will not be used. This would also apply to applications of functions likefun loop () = loop ()or OS.Process.exit of type
(All 'a). t -> 'afor some domain type
We do not admit this weakening of the value restriction in SML '97, partly because it is not consistent with the suspending type abstraction model discussed in the previous point.
SML '97 supports multiple precisions of integers (
Int32.int) and words (
Word32.word). It is convenient to overload integer literals (e.g.
3), and word literals (e.g.
0w3, so SML '97 supports this.3; val it = 3 : int (* int = Int.int = Int31.int *) 3 : Int31.int; val it = 3 : int 3 : Int32.int; val it = 3 : Int32.int 0w3; val it = 0wx3 : word (* word = Word.word = Word31.word *) 0w3 : Word8.word; val it = 0wx3 : Word8.word 0w3 : Word31.word; val it = 0wx3 : word 0w3 : Word32.word; val it = 0wx3 : Word32.word 0w3 :Word.word; val it = 0wx3 : wordAs this illustrates, if the context does not explicitly or implicitly determine the type of a literal, it is assigned a default type:
Int.intfor integer literals and
Word.wordfor word literals.
Although SML/NJ currently only implements one precision of floating point numbers (64 bits), the Basis definition allows for multiple floating point precisions. When multiple floating point precisions are supported in an implementations, floating point literals would also be overloaded.
Like SML '90, SML '97 provides a fixed set of overloaded functions, including arithmetic operators such as
+and relational operators such as
>. In SML '90, the context has to provide enough type information to resolve the overloaded operator (i.e. to assign it exactly one of the possible types that the operator is capable of assuming). If the overloaded operator is not resolved, an SML '90 compiler generates an error message:fun f x = x + x; std_in:10.13 Error: overloaded variable cannot be resolved: +In SML '97, an overloaded function that is not resolved by its context is assigned a default type (normally the integer version of the operator).fun f x = x + x; val f = fn : int -> intIf you don't want the default type, you must override it by adding explicit type constraints, as in:fun f (x: real) = x + x; val f = fn : real -> real
When a programmer uses a type variable in a type constraint in an expression or declaration, there is a natural, but fairly complicated, rule that determines where that type variable is implicitly bound.
For instance, in the codeval x = (let val id : 'a -> 'a = fn z => z in Id Id end, fn y => y);the type variable
'aused in the type of
idis implicitly bound at the declaration
val id, so
idhas a polymorphic type. However, if we add another occurrence of
'a, as inval x = (let val id : 'a -> 'a = fn z => z in id id end, fn (y: 'a) => y);now
'ais implicitly bound at the
val xdeclaration. This means that
idis no longer polymorphic, because
'aacts like a monomorphic type identifier within its binding scope, and so the later declaration will not type check.
Since the application of the scoping rule for such explicit type variables is a bit tricky, it would be preferable to be able to explicitly determine where the type variable is bound. SML '97 provides a notation for this. Here are some examples:fun 'a id(z: 'a) = z val 'a id : 'a -> 'a = fn z => z fun ('a, 'b) pair (x: 'a) (y: 'b) = (x,y)So using this notation, the first example above becomesval x = (let val 'a id : 'a -> 'a = fn z => z in Id Id end, fn y => y);while the second, which still doesn't type check, becomesval 'a x = (let val id : 'a -> 'a = fn z => z in id id end, fn (y: 'a) => y);Note: There is a discrepancy between the syntax supported by SML/NJ and that of the Definition in the case of explicit type variable bindings in
val recdeclarations. SML/NJ supports the notationval rec 'a f = (fn x: 'a => x);while the Definition would haveval 'a rec f = (fn x: 'a => x);
In the SML '90 formal definition, there were deficiencies in the way that new type names (i.e. internal type identifiers) were managed. These deficiencies would allow a generative type name to be reused for another type while values of the original type were still in the environment. Thus according to the Definition, an expression such aslet datatype A = C of bool in fn (C x) = not x (* : A -> bool *) end let datatype B = C of int in C 2 (* : B *) endcould be successfully type checked because the internal type name for
Acould be reused for
B, meaning that types
Bwould agree and the application would be accepted. This is unsound of course, because it is equivalent to
This problem is an artifact of the way the formal definition was designed; compilers (e.g. SML/NJ) typically implement generative type declarations (datatypes and abstypes) by ensuring that unique names are assigned to each declared type, and they never recycle these names. Since the SML '97 Definition is still vulnerable to this problem, the language has been restricted to ensure soundness of the Definition. The language restrictions ensure that a locally declared datatype cannot escape from the static scope of its declaration. Thus in a
letexpressionlet dec in exp endthe type of the let expression, which is the type of
exp, cannot contain any generative types declared in
dec. This means that an expression likelet datatype t = C in C endshould not type check in SML '97.
Another way that a local generative type could escape from its scope is through type unification, as in the following expression.fn x => let datatype t = C val _ = if true then x else C in 5 endBecause of the conditional expression,
xgets the type
t, and the entire function expression gets the type
t -> int, even though it extends outside the scope of the declaration of
t. So this sort of expression is also not allowed in SML '97.
SML/NJ Deviation: Currently, in SML/NJ both the above expressions still type check, so the new restriction is not currently enforced. As we noted before, this relates to an unsoundness in the Definition that does not affect implementations (unless they adhere too closely to the Definition's technique for managing unique type names!).
In SML '90, there was no direct way for one module to inherit a datatype as a datatype from another module. For instance, suppose I have a datatype
tin a structure
A:structure A = struct datatype t = C | D of t endI can define a new structure
Bthat contains an alias of
tas follows:structure B = struct type t = A.t endbut then
Bdoes not provide access to the data constructors of
t. If I define
Basstructure B = struct type t = A.t val C = A.C val D = A.D endI gain access to the constructors
Das values, but I still can't use them in patterns.
In SML '90, you can work around this problem by using the
opendeclaration:structure B = struct open A endand if
Ahas additional components that
Bis not supposed to export you can use a signature constraint on
Bto filter them out. This is not an ideal solution, because it unnecessarily pollutes the name space within the body of
Bwith all the component names in
To support the sharing of datatypes between modules directly, without the use of
open, SML '97 introduces a new form of datatype declaration: the datatype replication declaration. Its syntax isdatatype tycon = datatype longtyconwhere (as the the Definition),
tyconis an identifier naming a type constructor (in the TyCon name space), and
longtyconis a symbolic path (e.g.
A.t) ending in a type constructor name. A datatype replication makes the
tyconbe an alias for the datatype named by
longtycon, and in addition implicitly declares the data constructor names for
longtycon. Thusstructure B = struct datatype t = datatype A.t endis equivalent to the version of the declaration of
B.Dare defined as constructors of
B.tand can be used in patterns.
A common error with datatype replication declarations is to leave out the
datatypekeyword on the right hand side, as instructure B = struct datatype t = A.t endThis causes
A.tto be treated as a constructor declaration, and an error results because a qualified name is illegal in this case. But if I had writtenstructure B = struct open A datatype t = t endThere would be no complaint from the compiler, because the
ton the right hand side would be treated as a valid data constructor name having nothing to do with the datatype
As we will see in the chapter on modules, there is an analogous datatype replication specification for use in signatures.
realnot an equality type
SML '97 uses the IEEE semantics for floating point operations. Equality for floating point numbers in the IEEE standard does not behave the way equality is expected to behave in SML. For instance,
x = xis not always true for floating point numbers, because
nan = nanis false.
Real.==implements IEEE equality for reals, but because of its unusual behavior we no longer treat
Real.realas an equality type. For consistency, we also reject real literals as constants in patterns. This means that an expression likex = 5.6has to be rewritten asReal.==(x,5.6),and a case expression likecase x of 5.6 => 0 | 3.2 => 1 | ~1.7 => 2 | _ => 3has to be rewritten using conditionals:if Real.==(x,5.6) then 0 else if Real.==(x,3.2) then 1 else if Real.==(x,~1.7) then 2 else 3