Remember that eval is a relation between terms. More precisely, eval relates a term to its canonical form (which is also a term). It can be proved that eval is deterministic, namely if M eval N and M eval P hold, then N = P. For this reason, eval is actually a function. And of course, its type is term -> term.
We will see that it is really easy to build an interpreter following the rules of the operational semantics. We will use ML as the interpretation language, because it is particularly suited for symbolic computation. ML indeed stands for "Meta Language", i.e. a language meant to describe and manipulate other languages.
In these notes we will see how to build an ML interpreter for the eager PCF. The lazy version will be the topic of an assignment.
Let us recall the definition of the syntax of PCF. For simplicity, we will not consider pairs.
    Term ::= Var | \Var.Term | Term Term       % lambda terms
           | Num | true | false                % numerical and boolean constants
           | Term Op Term                      % numerical ops and comparison
           | if Term then Term else Term       % conditional
           | let Var = Term in Term            % term with a local declaration
           | fix Term                          % the fixpoint operator
The typical interpretation technique for avoiding substitution is the use of environments. An environment is a function which associates variables to terms (or, alternatively, a list of variable-term associations). Instead of substituting, say, M for x, we enrich the environment with the association between x and M. Then, whenever we need to evaluate a term containing x, we use the environment to obtain the value of x.
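As a minimal sketch of the list-based alternative just mentioned, an environment can be kept as a list of (variable, value) pairs with the most recent association first. The names empty, extend and lookup are illustrative assumptions, not part of these notes, and the sketch is polymorphic in the stored values so that it does not depend on any particular representation of terms.

```sml
(* Hypothetical list-based environment: newest association first. *)
type 'a env = (string * 'a) list

val empty : 'a env = []

(* Enrich the environment r with the association between x and v. *)
fun extend (r : 'a env) (x : string) (v : 'a) : 'a env = (x, v) :: r

(* Look up x; the first match wins, so newer bindings shadow older ones. *)
fun lookup [] x = NONE
  | lookup ((y, v) :: rest) x = if x = y then SOME v else lookup rest x

(* x is bound to 1 and then rebound to 2; y is never bound. *)
val r = extend (extend empty "x" 1) "x" 2
val vx = lookup r "x"   (* SOME 2 *)
val vy = lookup r "y"   (* NONE *)
```

Returning an option makes the "undefined variable" case explicit; it plays the role that an error value would play in a function-based environment.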
The evaluation relation has to be modified so as to take into account the dependency on the environment. We will write:

    r |- M eval N

to mean that M evaluates to N in the environment r.
The rules of the eager semantics are the following:
   ---------------  for any number n
    r |- n eval n

   ---------------------      -----------------------
    r |- true eval true        r |- false eval false

   ---------------------
    r |- \x.M eval \x.M

    r |- (r x) eval M
   -------------------
      r |- x eval M
    r |- M eval P     r |- N eval Q
   --------------------------------  op is +,-,*,/,=.
     r |- (M op N) eval (P sop Q)

In the above, sop stands for the semantic counterpart of op. For instance, if op is the symbol +, then 5 sop 3 is 8.
    r |- M eval true    r |- N eval Q          r |- M eval false    r |- P eval Q
   -----------------------------------        ------------------------------------
    r |- (if M then N else P) eval Q            r |- (if M then N else P) eval Q

    r |- M eval (\x.P)    r |- N eval Q    r[x|->Q] |- P eval R
   ------------------------------------------------------------
                        r |- (M N) eval R

    r |- M eval Q    r[x|->Q] |- N eval P
   --------------------------------------
        r |- (let x = M in N) eval P

    r |- M eval (\x.P)    r[x|->(fix M)] |- P eval Q
   -------------------------------------------------
                 r |- (fix M) eval Q
One problem is related to the issue of lexical scope (or static scope). The above semantics, in fact, reflects dynamic scope, while a correct treatment of the substitution-based semantics requires static scope.
As an example, consider the following term:
    let x = 1 in
      let f = \y. y+x in
        let x = 2 in f 5

With the substitution-based semantics the value of this term is 6: the binding for x, in the definition of f, is the one valid at declaration time (static scope). With the environment-based semantics described above, on the contrary, the result is 7. This is because the binding for x, in f, is the one valid at execution time, i.e. when f is used (dynamic scope).
Another problem is illustrated by the following example:
    (\f. let x = 2 in f 5) (let x = 1 in \y. x + y)

The result of this term clearly should be 6. The above semantics, instead, gives 7.
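The two results can be checked with a small sketch that transcribes the environment-based rules given so far (no closures) into ML, for just the constructs the two examples need. The constructor names follow the datatype used later in these notes, while evalDyn, ex1 and ex2 are hypothetical names introduced here for illustration only.

```sml
(* Fragment of PCF sufficient for the two examples (illustrative subset). *)
datatype term = var of string
              | abs of string * term
              | app of term * term
              | num of int
              | plus of term * term
              | letvar of string * term * term
              | error

type environment = string -> term

val emptyenv : environment = fn _ => error

fun update (r : environment) (x : string) (M : term) : environment =
  fn y => if y = x then M else r y

(* Naive rules: an abstraction evaluates to itself and forgets its environment. *)
fun evalDyn (var x) r = evalDyn (r x) r
  | evalDyn (abs (x, M)) r = abs (x, M)
  | evalDyn (app (M, N)) r =
      (case evalDyn M r of
         abs (x, P) => evalDyn P (update r x (evalDyn N r))
       | _ => error)
  | evalDyn (num n) r = num n
  | evalDyn (plus (M, N)) r =
      (case (evalDyn M r, evalDyn N r) of
         (num m, num n) => num (m + n)
       | _ => error)
  | evalDyn (letvar (x, M, N)) r = evalDyn N (update r x (evalDyn M r))
  | evalDyn error r = error

(* let x = 1 in let f = \y. y+x in let x = 2 in f 5 *)
val ex1 = letvar ("x", num 1,
            letvar ("f", abs ("y", plus (var "y", var "x")),
              letvar ("x", num 2, app (var "f", num 5))))

(* (\f. let x = 2 in f 5) (let x = 1 in \y. x + y) *)
val ex2 = app (abs ("f", letvar ("x", num 2, app (var "f", num 5))),
               letvar ("x", num 1, abs ("y", plus (var "x", var "y"))))

val res1 = evalDyn ex1 emptyenv   (* num 7, not the expected num 6 *)
val res2 = evalDyn ex2 emptyenv   (* num 7, not the expected num 6 *)
```

In both cases the body of the function is evaluated in the environment of the call, where x is bound to 2.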
Note that analogous examples could be given in the pure lambda calculus. In fact, let x = M in N is equivalent (from the operational point of view, in the eager semantics) to (\x.N)M.
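The equivalence can be made concrete with a small translation that rewrites every let into an application. This is only a sketch on an illustrative fragment of the syntax; desugar is a hypothetical helper that does not appear elsewhere in these notes.

```sml
(* Illustrative fragment of the PCF syntax. *)
datatype term = var of string
              | abs of string * term
              | app of term * term
              | num of int
              | plus of term * term
              | letvar of string * term * term

(* Rewrite  let x = M in N  as  (\x.N) M, recursively in all subterms. *)
fun desugar (letvar (x, M, N)) = app (abs (x, desugar N), desugar M)
  | desugar (abs (x, M)) = abs (x, desugar M)
  | desugar (app (M, N)) = app (desugar M, desugar N)
  | desugar (plus (M, N)) = plus (desugar M, desugar N)
  | desugar t = t

(* let x = 1 in x + 2   becomes   (\x. x + 2) 1 *)
val t = desugar (letvar ("x", num 1, plus (var "x", num 2)))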
Finally, there is another reason why the above semantics is incorrect, and it has to do with the fixpoint rule. The problem arises for instance in the evaluation of an expression of the form (fix M) 5, where (fix M) is a term representing, say, the factorial function. When evaluating (fix M), the above fixpoint rule will give as result the abstraction obtained by unfolding the definition of factorial one time, but we lose the link between the fixpoint variable (the first parameter of M) and the term (fix M).
The standard way of solving these problems is by introducing the notion of closure. Basically, a closure is a term together with the environment in which it has to be evaluated. We will represent closures as pairs (term, environment). Closures are used at the moment of application: when a function application needs to be evaluated, its body must be evaluated in the environment of the closure.
The rules that need to be modified are the following:
   --------------------------
    r |- \x.M eval (\x.M, r)

    r |- M eval (\x.P,r')    r |- N eval Q    r'[x|->Q] |- P eval R
   ----------------------------------------------------------------
                          r |- (M N) eval R

    r |- M eval (\x.P,r')    r'[x|->(fix M,r)] |- P eval Q
   --------------------------------------------------------
                     r |- (fix M) eval Q

      r' |- M eval N
   -------------------
    r |- (M,r') eval N
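As a sketch of how these modified rules restore static scope, the following fragment implements only the rules needed for the two scope examples discussed earlier. The constructor names follow the datatype used in these notes; evalClo, ex1 and ex2 are hypothetical names introduced for this illustration.

```sml
(* Fragment of PCF with closures (illustrative subset). *)
datatype term = var of string
              | abs of string * term
              | app of term * term
              | num of int
              | plus of term * term
              | letvar of string * term * term
              | closure of term * (string -> term)
              | error

type environment = string -> term

val emptyenv : environment = fn _ => error

fun update (r : environment) (x : string) (M : term) : environment =
  fn y => if y = x then M else r y

(* An abstraction now evaluates to a closure that saves its environment. *)
fun evalClo (var x) r = evalClo (r x) r
  | evalClo (abs (x, M)) r = closure (abs (x, M), r)
  | evalClo (app (M, N)) r =
      (case evalClo M r of
         closure (abs (x, P), r1) => evalClo P (update r1 x (evalClo N r))
       | _ => error)
  | evalClo (num n) r = num n
  | evalClo (plus (M, N)) r =
      (case (evalClo M r, evalClo N r) of
         (num m, num n) => num (m + n)
       | _ => error)
  | evalClo (letvar (x, M, N)) r = evalClo N (update r x (evalClo M r))
  | evalClo (closure (M, r1)) r = evalClo M r1
  | evalClo error r = error

(* let x = 1 in let f = \y. y+x in let x = 2 in f 5 *)
val ex1 = letvar ("x", num 1,
            letvar ("f", abs ("y", plus (var "y", var "x")),
              letvar ("x", num 2, app (var "f", num 5))))

(* (\f. let x = 2 in f 5) (let x = 1 in \y. x + y) *)
val ex2 = app (abs ("f", letvar ("x", num 2, app (var "f", num 5))),
               letvar ("x", num 1, abs ("y", plus (var "x", var "y"))))

val res1 = evalClo ex1 emptyenv   (* num 6 *)
val res2 = evalClo ex2 emptyenv   (* num 6 *)
```

The body of the function is now evaluated in the environment r1 saved in the closure, where x is bound to 1, rather than in the environment of the call.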
In the definition of the datatype of terms, we have a case for each production of the grammar, plus a case of "error" to represent the result of an evaluation when something goes wrong. We will assume the following PCF operations on numbers: plus, times, minus, division, and test of equality.
    datatype term = var of string                      (* variables *)
                  | abs of string * term               (* abstractions *)
                  | app of term * term                 (* applications *)
                  | num of int                         (* numbers *)
                  | tt | ff                            (* booleans *)
                  | plus of term * term                (* arithmetical operation *)
                  | minus of term * term               (* arithmetical operation *)
                  | times of term * term               (* arithmetical operation *)
                  | divis of term * term               (* arithmetical operation *)
                  | equal of term * term               (* comparison operation *)
                  | ite of term * term * term          (* conditional *)
                  | letvar of string * term * term     (* local declaration *)
                  | fix of term                        (* fixpoint *)
                  | closure of term * (string -> term) (* closure *)
                  | error;                             (* erroneous situation *)
For the environments, we could use lists of pairs (associations) variable-term. However, a more elegant solution is to use functions from variables to terms. This is made possible by the higher-order capabilities of ML.
    type environment = string -> term;

We also need to define the empty environment. This will simply be the function which associates an undefined value (error) to any variable. In fact, the empty environment represents the state of the environment before any declaration or passing of actual parameters. In this situation, all variables are undefined, and the attempt of evaluating a variable should give an error.
    val emptyenv:environment = fn x => error;

Finally, we need to define a function to update an environment with a new association variable-term:
    fun update (r:environment) (x:string) (M:term) =
          (fn y => if y = x then M else r y):environment;

We can now define the interpreter (function eval). The definition follows very closely the rules of the operational semantics. Although very simple, the interpreter is complete. The only thing that is missing is a better (interactive and user-friendly) interface. In particular, the treatment of the error cases could be much more sophisticated. As it is now, certain erroneous situations are not treated by the program: it will give a run-time error and abort. This problem is partly reflected by the message "Warning: match nonexhaustive" that we get at compile-time. The enrichment of the interpreter with error treatment is left as a (useful) exercise.
    fun eval (var x) r = eval (r x) r
      | eval (abs(x,M)) r = closure(abs(x,M),r)
      | eval (app(M,N)) r = let val closure(abs(x,M1),r1) = eval M r
                             in eval M1 (update r1 x (eval N r))
                            end
      | eval (num n) r = (num n)
      | eval tt r = tt
      | eval ff r = ff
      | eval (plus(M,N)) r = let val (num m) = (eval M r)
                                 and (num n) = (eval N r)
                              in num (m + n)
                             end
      | eval (minus(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m - n)
                              end
      | eval (times(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m * n)
                              end
      | eval (divis(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m div n)
                              end
      | eval (equal(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in if n = m then tt else ff
                              end
      | eval (ite(M,N,P)) r = (case (eval M r) of
                                 tt => eval N r
                               | ff => eval P r)
      | eval (letvar(x,M,N)) r = eval N (update r x (eval M r ))
      | eval (fix M) r = let val closure(abs(x,M1),r1) = eval M r
                          in eval M1 (update r1 x (closure(fix M,r)))
                         end
      | eval (closure(M,r1)) r = eval M r1
      | eval error r = error;
    eval (letvar("fact",
                 fix (abs("f",
                          abs("n",
                              ite(equal(var "n",num 0),
                                  num 1,
                                  times(app(var "f",minus(var "n",num 1)),var "n")
                                 )
                             )
                         )
                     ),
                 app(var "fact", num 5)
                )
         )
         emptyenv ;

The code of the interpreter and of the example are here.
The rule for application in the lazy semantics, therefore, is the following:
    r |- M eval (\x.P,r')    r'[x|->(N,r)] |- P eval R
   ---------------------------------------------------
                   r |- (M N) eval R

All the other rules remain the same.
    (* Interpreter of eager PCF in ML *********************************** *)

    (* datatype term (same as before) *********************************** *)

    datatype term = var of string                      (* variables *)
                  | abs of string * term               (* abstractions *)
                  | app of term * term                 (* applications *)
                  | num of int                         (* numbers *)
                  | tt | ff                            (* booleans *)
                  | plus of term * term                (* arithmetical operation *)
                  | minus of term * term               (* arithmetical operation *)
                  | times of term * term               (* arithmetical operation *)
                  | divis of term * term               (* arithmetical operation *)
                  | equal of term * term               (* comparison operation *)
                  | ite of term * term * term          (* conditional *)
                  | letvar of string * term * term     (* local declaration *)
                  | fix of term                        (* fixpoint *)
                  | closure of term * (string -> term) (* closure *)
                  | error;                             (* erroneous situation *)

    (* type environment (same as before) ******************************** *)

    type environment = string -> term;

    val emptyenv:environment = fn x => error;

    fun update (r:environment) (x:string) (M:term) =
          (fn y => if y = x then M else r y):environment;

    (* Function eval **************************************************** *)

    fun eval (var x) r = eval (r x) r
      | eval (abs(x,M)) r = closure(abs(x,M),r)
      | eval (app(M,N)) r = ( case eval M r of
                                closure(abs(x,M1),r1) => eval M1 (update r1 x (eval N r))
                              | x => error
                            ) (* any other case is an error *)
      | eval (num n) r = (num n)
      | eval tt r = tt
      | eval ff r = ff
      | eval (plus(M,N)) r = ( case (eval M r, eval N r) of
                                 (num m, num n) => num(m + n)
                               | x => error
                             )
      | eval (minus(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m - n)
                                | x => error
                              )
      | eval (times(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m * n)
                                | x => error
                              )
      | eval (divis(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m div n)
                                | x => error
                              )
      | eval (equal(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => if m = n then tt else ff
                                | x => error
                              )
      | eval (ite(M,N,P)) r = ( case (eval M r) of
                                  tt => eval N r
                                | ff => eval P r
                                | x => error
                              )
      | eval (letvar(x,M,N)) r = eval N (update r x (eval M r ))
      | eval (fix M) r = ( case eval M r of
                             closure(abs(x,M1),r1) => eval M1 (update r1 x (closure(fix M,r)))
                           | x => error
                         )
      | eval (closure(M,r1)) r = eval M r1
      | eval error r = error;

Note: Instead of using a case statement, we could have used exceptions
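The error case can also be made to carry a description of its cause, by declaring it in the datatype as

    error of string

Then, each time an error is generated, the result of eval should be

    error(< cause of the error >)

For instance:

    val emptyenv:environment = fn x => error("undeclared variable");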
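The other alternative mentioned in the note, exceptions, can be sketched as follows. This is only an illustration on a small fragment of the language, not the interpreter of these notes: the exception EvalError, the helper run and the error messages are all assumptions introduced here.

```sml
(* Hypothetical exception carrying the cause of the error. *)
exception EvalError of string

(* Small fragment of PCF, enough to trigger the two kinds of error. *)
datatype term = var of string
              | num of int
              | tt | ff
              | plus of term * term
              | letvar of string * term * term

type environment = string -> term

(* Looking up any variable in the empty environment raises an exception. *)
val emptyenv : environment =
  fn x => raise EvalError ("undeclared variable " ^ x)

fun update (r : environment) (x : string) (M : term) : environment =
  fn y => if y = x then M else r y

fun eval (var x) r = eval (r x) r
  | eval (num n) r = num n
  | eval tt r = tt
  | eval ff r = ff
  | eval (plus (M, N)) r =
      (case (eval M r, eval N r) of
         (num m, num n) => num (m + n)
       | _ => raise EvalError "plus expects two numbers")
  | eval (letvar (x, M, N)) r = eval N (update r x (eval M r))

(* Evaluate in the empty environment and report the outcome as a string. *)
fun run M =
  (case eval M emptyenv of
     num n => Int.toString n
   | tt => "true"
   | ff => "false"
   | _ => "?")
  handle EvalError msg => "error: " ^ msg

val ok   = run (letvar ("x", num 1, plus (var "x", num 2)))  (* "3" *)
val bad1 = run (var "z")             (* "error: undeclared variable z" *)
val bad2 = run (plus (num 1, tt))    (* "error: plus expects two numbers" *)
```

With exceptions the error value disappears from the datatype of terms, and the handling of the failure is concentrated in one place (here, the handler in run) instead of being repeated in every clause of eval.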