Remember that eval is a relation between terms which, intuitively, represents the evaluation of a term to its value. More precisely, eval relates a term to its canonical form (the "value", which is also a term). It can be proved that eval is deterministic, namely if M eval N and M eval P hold, then N = P. For this reason, eval is actually a (partial) function, and its type is term -> term.

We will see that it is really easy to build an interpreter following the rules of the operational semantics. We will use ML as the interpretation language, because it is particularly suited for symbolic computation. Indeed, ML stands for "Meta Language", i.e. a language meant to describe and manipulate other languages.

In these notes we will see how to build an ML interpreter for the eager PCF. The lazy version is the topic of an assignment.

Let us recall the definition of the syntax of PCF (enriched with lists)

    Term ::= Var | \Var.Term | Term Term            % lambda terms
           | Num | true | false                     % numerical and boolean constants
           | Term Op Term                           % numerical ops and comparison
           | (Term,Term) | fst Term | snd Term      % pairs
           | Term::Term | nil | hd Term | tl Term   % lists
           | if Term then Term else Term            % conditional
           | let Var = Term in Term                 % term with a local declaration
           | fix Term                               % the fixpoint operator

The typical interpretation technique for avoiding substitution is the use of environments. An environment is a function which associates variables to terms (or a list of associations variable-term). Instead of making a substitution of, say, M for x, we enrich the environment with the association between x and M. Then, whenever we need to evaluate a term containing x, we use the environment to obtain the value of x.

The evaluation relation has to be modified so as to take into account the dependency on the environment. We will write:

    r |- M eval N

to mean that M evaluates to N in the environment r.

The rules of the eager semantics are the following:

- Canonical terms

     ---------------    for any number n
      r |- n eval n

     ---------------------      -----------------------
      r |- true eval true        r |- false eval false

     ---------------------
      r |- \x.M eval \x.M

- Variables. Note that the term corresponding to a
variable, in the environment, may not be in canonical form, because
it may be of the form fix M. Therefore we need to evaluate it.
(This is even more necessary in the lazy semantics, since the
lazy application rule establishes links between variables
and unevaluated terms.)

      r |- (r x) eval M
     -------------------
        r |- x eval M

- Numerical and comparison operators

      r |- M eval P    r |- N eval Q
     --------------------------------    op is +,-,*,/,=.
       r |- (M op N) eval (P sop Q)

In the above, sop stands for the semantic counterpart of op. For instance, if op is the symbol +, then 5 sop 3 is 8.

- Pairs (eager)

      r |- M eval P    r |- N eval Q
     ---------------------------------
           r |- (M,N) eval (P,Q)

       r |- M eval (P,Q)           r |- M eval (P,Q)
     ---------------------       ---------------------
      r |- (fst M) eval P         r |- (snd M) eval Q

- Lists (eager)

      r |- M eval P    r |- N eval Q
     ---------------------------------      -------------------
          r |- (M::N) eval (P::Q)            r |- nil eval nil

      r |- M eval (P::Q)          r |- M eval (P::Q)
     --------------------        --------------------
      r |- (hd M) eval P          r |- (tl M) eval Q

- Conditional statement

      r |- M eval true    r |- N eval Q
     -----------------------------------
      r |- (if M then N else P) eval Q

      r |- M eval false    r |- P eval Q
     ------------------------------------
      r |- (if M then N else P) eval Q

- Application (eager). The notation r[x|->Q] stands for
the environment obtained by updating r with the binding between
x and Q

      r |- M eval (\x.P)    r |- N eval Q    r[x|->Q] |- P eval R
     ------------------------------------------------------------
                         r |- (M N) eval R

- The let construct

      r |- M eval Q    r[x|->Q] |- N eval P
     --------------------------------------
          r |- (let x = M in N) eval P

- Fixpoint

      r |- M eval (\x.P)    r[x|->(fix M)] |- P eval Q
     -------------------------------------------------
                   r |- (fix M) eval Q

One problem with the above rules concerns lexical scope (or static scope). The above semantics, in fact, implements dynamic scope, while a faithful rendering of the substitution-based semantics requires static scope.

As an example, consider the following term:

    let x = 1 in let f = \y. y+x in let x = 2 in f 5

With the substitution-based semantics the value of this term is 6: the binding for x, in the definition of f, is the one valid at declaration time (static scope). With the environment-based semantics described above, on the contrary, the result is 7. This is because the binding for x, in f, is the one valid at execution time, i.e. when f is used (dynamic scope).

Another problem is illustrated by the following example:

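    (\f. let x = 2 in f 5) (let x = 1 in \y. x + y)

The result of this term clearly should be 6. The above semantics, instead, gives 7: the binding of x to 1 is discarded as soon as the argument has been evaluated to \y. x + y, and when f is finally applied the environment binds x to 2.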

Note that analogous examples could be given in the pure lambda calculus. In fact, let x = M in N is equivalent (from the operational point of view, in the eager semantics) to (\x.N)M.
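
The equivalence can be phrased as a source-to-source translation. The following is a minimal sketch, assuming a cut-down version of the term datatype (only the constructors needed here); the function name desugar is hypothetical and is not part of the interpreter described in these notes.

```sml
(* Cut-down stand-in for the term datatype (assumption: only these cases) *)
datatype term = var of string
              | abs of string * term
              | app of term * term
              | num of int
              | letvar of string * term * term;

(* desugar (hypothetical helper): rewrite every let x = M in N as (\x.N) M *)
fun desugar (letvar(x,M,N)) = app(abs(x, desugar N), desugar M)
  | desugar (abs(x,M))      = abs(x, desugar M)
  | desugar (app(M,N))      = app(desugar M, desugar N)
  | desugar t               = t;   (* var and num are left unchanged *)

(* let x = 1 in x  becomes  (\x.x) 1 *)
val example = desugar (letvar("x", num 1, var "x"));
```

Since the translated term contains no let, any scoping problem exhibited with let can be reproduced with abstraction and application alone.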

Finally there is another reason why the above semantics is incorrect, and it has to do with the fixpoint rule. The problem arises for instance in the evaluation of an expression of the form (fix M) 5, where (fix M) is a term representing, say, the factorial function. When evaluating (fix M), the above fixpoint rule will give as result the abstraction obtained by unfolding the definition of factorial once, but we lose the link between the fixpoint variable (the first parameter of M) and the term (fix M).

The standard way of solving these problems is by introducing the notion of closure. Basically, a closure is a term together with the environment in which it has to be evaluated. We will represent closures as pairs <term, environment>. Closures are used at the moment of application: when a function application needs to be evaluated, the body of the function must be evaluated in the environment stored in its closure.

The rules that need to be modified are the following:

- Abstraction

     --------------------------
      r |- \x.M eval <\x.M, r>

- Application (eager)

      r |- M eval <\x.P,r'>    r |- N eval Q    r'[x|->Q] |- P eval R
     ----------------------------------------------------------------
                           r |- (M N) eval R

- Fixpoint

      r |- M eval <\x.P,r'>    r'[x|-> <fix M,r>] |- P eval Q
     --------------------------------------------------------
                      r |- (fix M) eval Q

- Closure

        r' |- M eval N
     ----------------------
      r |- <M, r'> eval N
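
To see the effect of closures on the two problematic terms given earlier, here is a minimal sketch of two cut-down evaluators: dyn follows the rules without closures, sta the closure-based ones. The datatype tm, its constructors and all helper names are assumptions made for this illustration only; they are not the datatype of the interpreter defined in these notes.

```sml
(* Cut-down language for illustration only (all names are hypothetical) *)
datatype tm = V of string | Lam of string * tm | App of tm * tm
            | N of int | Add of tm * tm | Let of string * tm * tm
            | Clo of tm * (string * tm) list;

fun look [] x = raise Fail ("unbound variable " ^ x)
  | look ((y,M)::r) x = if x = y then M else look r x;

fun add (N a, N b) = N (a + b)
  | add _ = raise Fail "addition of non-numbers";

(* dyn: rules without closures (dynamic scope) *)
fun dyn (V x) r = dyn (look r x) r
  | dyn (Lam f) r = Lam f
  | dyn (N n) r = N n
  | dyn (Add(M,P)) r = add (dyn M r, dyn P r)
  | dyn (Let(x,M,P)) r = dyn P ((x, dyn M r)::r)
  | dyn (App(M,P)) r =
      (case dyn M r of
           Lam(x,B) => dyn B ((x, dyn P r)::r)
         | _ => raise Fail "application of a non-function")
  | dyn (Clo _) r = raise Fail "closures do not occur in dyn";

(* sta: rules with closures (static scope) *)
fun sta (V x) r = sta (look r x) r
  | sta (Lam f) r = Clo(Lam f, r)
  | sta (N n) r = N n
  | sta (Add(M,P)) r = add (sta M r, sta P r)
  | sta (Let(x,M,P)) r = sta P ((x, sta M r)::r)
  | sta (App(M,P)) r =
      (case sta M r of
           Clo(Lam(x,B), r1) => sta B ((x, sta P r)::r1)
         | _ => raise Fail "application of a non-function")
  | sta (Clo(M,r1)) r = sta M r1;

(* let x = 1 in let f = \y. y+x in let x = 2 in f 5 *)
val ex1 = Let("x", N 1, Let("f", Lam("y", Add(V "y", V "x")),
                            Let("x", N 2, App(V "f", N 5))));

(* (\f. let x = 2 in f 5) (let x = 1 in \y. x + y) *)
val ex2 = App(Lam("f", Let("x", N 2, App(V "f", N 5))),
              Let("x", N 1, Lam("y", Add(V "x", V "y"))));

val d1 = dyn ex1 [];   (* N 7 *)
val s1 = sta ex1 [];   (* N 6 *)
val d2 = dyn ex2 [];   (* N 7 *)
val s2 = sta ex2 [];   (* N 6 *)
```

The only differences between the two evaluators are the cases for abstraction, application and closure, mirroring the modified rules.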

In the definition of the datatype of terms, we have a case for each production of the grammar, plus a case of "error" to represent the result of an evaluation when something goes wrong. We will assume the following PCF operations on numbers: plus, times, minus, division, and test of equality.

Note: several PCF names need to be represented by a different name, because their name is a keyword in ML. For instance we cannot use nil, let, if, etc.

    datatype term = var of string                     (* variables *)
                  | abs of string * term              (* abstractions *)
                  | app of term * term                (* applications *)
                  | num of int                        (* numbers *)
                  | tt | ff                           (* booleans *)
                  | plus of term * term               (* arithmetical operation *)
                  | minus of term * term              (* arithmetical operation *)
                  | times of term * term              (* arithmetical operation *)
                  | divis of term * term              (* arithmetical operation *)
                  | equal of term * term              (* comparison operation *)
                  | pair of term * term               (* pair *)
                  | fst of term | snd of term         (* projections *)
                  | cons of term * term               (* list constructor *)
                  | empty                             (* empty list *)
                  | head of term | tail of term       (* head and tail of lists *)
                  | ite of term * term * term         (* conditional *)
                  | letvar of string * term * term    (* local declaration *)
                  | fix of term                       (* fixpoint *)
                  | closure of term * (string * term) list   (* closure *)
                    (* (string * term) list is an environment (see below) *)
                  | error;                            (* erroneous situation *)

For the environments, we can use lists of pairs (associations) variable-term. Another solution, mathematically more elegant, would be to use functions from variables to terms. This is made possible by the higher-order capabilities of ML. However, the second solution would make the tracing of the program more complicated (in case of error), because functions are not displayed by ML. Thus we adopt here the first solution.
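
For comparison, the second solution (environments as functions) could look like the following minimal sketch. The two-case datatype is only a stand-in for the term datatype of these notes, and the names fenv, emptyfenv and fupdate are hypothetical.

```sml
(* Stand-in for the term datatype (assumption: two cases suffice here) *)
datatype term = num of int | error;

(* Environments as functions from variables to terms *)
type fenv = string -> term;

(* The empty environment maps every variable to error *)
val emptyfenv : fenv = fn _ => error;

(* Updating returns a new function that answers M for x and defers to r otherwise *)
fun fupdate (r : fenv) (x : string) (M : term) : fenv =
  fn y => if y = x then M else r y;

(* Lookup is just function application *)
val r1 = fupdate (fupdate emptyfenv "x" (num 1)) "x" (num 2);
val vx = r1 "x";   (* the most recent binding for x *)
val vy = r1 "y";   (* y was never bound *)
```

With this representation a closure would have to contain a function, so evaluated terms could no longer be displayed in full or compared with =.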

    type environment = (string * term) list;

We also need to define the empty environment. With the list representation, this is simply the empty list: looking up any variable in it yields an undefined value (error). In fact, the empty environment represents the state of the environment before any declaration or passing of actual parameters. In this situation, all variables are undefined, and the attempt of evaluating a variable should give an error.

    val emptyenv: environment = [];

Finally, we need to define a function to update an environment with a new association variable-term:

    fun update (r:environment) (x:string) (M:term) = (x,M)::r ;

and a function that, given an environment r and a variable x, gives back the term associated to x in r, or error if there are no associations for x in r:

    fun lookup ([]:environment) (x:string) = error
      | lookup ((y,M)::r) x = if x = y then M else lookup r x;

We can now define the interpreter (function eval). The definition follows very closely the rules for the operational semantics. Although very simple, the interpreter is complete. The only thing that is missing is a better (interactive and user-friendly) interface. In particular the treatment of the error cases could be much more sophisticated. As it is now, certain erroneous situations are not treated by the program: it will give a run-time error and abort. This problem is partly reflected by the message "Warning: match nonexhaustive" that we get at compile-time. The enrichment of the interpreter with error treatment is left as an exercise.

    fun eval (var x) r = eval (lookup r x) r
      | eval (abs(x,M)) r = closure(abs(x,M),r)
      | eval (app(M,N)) r = let val closure(abs(x,M1),r1) = eval M r
                             in eval M1 (update r1 x (eval N r))
                            end
      | eval (num n) r = (num n)
      | eval tt r = tt
      | eval ff r = ff
      | eval (plus(M,N)) r = let val (num m) = (eval M r)
                                 and (num n) = (eval N r)
                              in num (m + n)
                             end
      | eval (minus(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m - n)
                              end
      | eval (times(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m * n)
                              end
      | eval (divis(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in num (m div n)
                              end
      | eval (equal(M,N)) r = let val (num m) = (eval M r)
                                  and (num n) = (eval N r)
                               in if n = m then tt else ff
                              end
      | eval (pair(M,N)) r = pair(eval M r,eval N r)
      | eval (fst M) r = let val pair(P,Q) = eval M r in P end
      | eval (snd M) r = let val pair(P,Q) = eval M r in Q end
      | eval (cons(M,N)) r = cons(eval M r,eval N r)
      | eval empty r = empty
      | eval (head M) r = let val cons(P,Q) = eval M r in P end
      | eval (tail M) r = let val cons(P,Q) = eval M r in Q end
      | eval (ite(M,N,P)) r = (case (eval M r) of
                                    tt => eval N r
                                  | ff => eval P r)
      | eval (letvar(x,M,N)) r = eval N (update r x (eval M r ))
      | eval (fix M) r = let val closure(abs(x,M1),r1) = eval M r
                          in eval M1 (update r1 x (closure(fix M,r)))
                         end
      | eval (closure(M,r1)) r = eval M r1
      | eval error r = error;

As an example, consider the term M that computes the factorial of 5. The definition of the term M in PCF is the following:

    let fact = fix \f. \n. if n = 0 then 1 else f(n-1) * n in fact 5

The representation of M in the syntax of the interpreter will be the following ML term N:

    letvar("fact",
           fix (abs("f",
                    abs("n",
                        ite(equal(var "n",num 0),
                            num 1,
                            times(app(var "f",minus(var "n",num 1)),var "n")
                           )
                       )
                   )
               ),
           app(var "fact", num 5)
          )

The evaluation of such term, in the interpreter, will be an ML term of the form:

eval (N) emptyenv

The rule for application in the lazy semantics, therefore, is the following:

      r |- M eval <\x.P,r'>    r'[x|-> <N,r>] |- P eval R
     ----------------------------------------------------
                     r |- (M N) eval R

The other rules to be changed are those for pairs and streams, since in the lazy semantics their evaluation is also suspended:

Pairs (lazy)

     -------------------------------
      r |- (M,N) eval (<M,r>,<N,r>)

      r |- M eval (P,Q)    r |- P eval R
     ------------------------------------
             r |- (fst M) eval R

      r |- M eval (P,Q)    r |- Q eval R
     ------------------------------------
             r |- (snd M) eval R

Streams (lazy)

              r |- M eval P
     --------------------------------
      r |- (M::N) eval (P::<N,r>)

      r |- M eval (P::Q)
     --------------------
      r |- (hd M) eval P

      r |- M eval (P::Q)    r |- Q eval R
     -------------------------------------
             r |- (tl M) eval R

    M ::= let fun Var Var = Term in Term

For instance, one can have

    let fun f x = M in N

where f is the name of the function, x is the parameter, M is the body of the function, and N is the term in the scope of the declaration, where f can be used. Note that f can occur also in M (recursive call).

    let fun fact n = if n = 0 then 1 else n * fact(n-1) in fact 4

The intended result of the above term is 24.

        r[f|-> <f,\x.M,r>] |- N eval P
     ------------------------------------
      r |- (let fun f x = M in N) eval P

The notation <f,\x.M,r> is like a closure, but it must be treated recursively at the moment of the application of f. We will call it a recursive closure. The difference with a closure is that its evaluation is done recursively, in both the eager and the lazy semantics:

      r'[f|-> <f,M,r'>] |- M eval N
     -------------------------------
          r |- <f,M,r'> eval N

The rules for the application remain the same, for both the eager and the lazy semantics. We show now how to modify the eager interpreter. First of all, we need to enrich the datatype with the new let fun construct and the new kind of closure:

    datatype term = ...
                  | letfun of string * string * term * term          (* recursive definition *)
                  | recclosure of string * term * (string * term) list
                    (* recursive closure; (string * term) list is an environment,
                       written out because the type environment is declared after term *)

Next, we need to modify the eval function in the following way (we write only the modifications):

    fun eval ...
      | eval (letfun(f,x,M,N)) r = eval N (update r f (recclosure(f,abs(x,M),r)))
      | eval (recclosure(f,M,r1)) r = eval M (update r1 f (recclosure(f,M,r1)))
      ...
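
The two new cases can be tried on the factorial example with the following minimal sketch. It assumes a cut-down version of the datatype (only the constructors needed for the example) and a hypothetical helper arith for the arithmetic cases; the cases for letfun and recclosure are the ones given above.

```sml
(* Cut-down datatype (assumption: only the cases used by the example) *)
datatype term = var of string | abs of string * term | app of term * term
              | num of int | tt | ff
              | minus of term * term | times of term * term
              | equal of term * term
              | ite of term * term * term
              | letfun of string * string * term * term
              | closure of term * (string * term) list
              | recclosure of string * term * (string * term) list
              | error;

type environment = (string * term) list;
val emptyenv: environment = [];
fun update (r:environment) (x:string) (M:term) = (x,M)::r;
fun lookup ([]:environment) (x:string) = error
  | lookup ((y,M)::r) x = if x = y then M else lookup r x;

(* arith (hypothetical helper): apply f to two numbers, error otherwise *)
fun arith f (num m, num n) = num (f (m, n))
  | arith _ _ = error;

fun eval (var x) r = eval (lookup r x) r
  | eval (abs(x,M)) r = closure(abs(x,M),r)
  | eval (app(M,N)) r =
      (case eval M r of
           closure(abs(x,M1),r1) => eval M1 (update r1 x (eval N r))
         | _ => error)
  | eval (num n) r = num n
  | eval tt r = tt
  | eval ff r = ff
  | eval (minus(M,N)) r = arith (op -) (eval M r, eval N r)
  | eval (times(M,N)) r = arith (op * ) (eval M r, eval N r)
  | eval (equal(M,N)) r =
      (case (eval M r, eval N r) of
           (num m, num n) => if m = n then tt else ff
         | _ => error)
  | eval (ite(M,N,P)) r =
      (case eval M r of tt => eval N r | ff => eval P r | _ => error)
  | eval (letfun(f,x,M,N)) r = eval N (update r f (recclosure(f,abs(x,M),r)))
  | eval (closure(M,r1)) r = eval M r1
  | eval (recclosure(f,M,r1)) r = eval M (update r1 f (recclosure(f,M,r1)))
  | eval error r = error;

(* let fun fact n = if n = 0 then 1 else n * fact(n-1) in fact 4 *)
val fact4 =
  letfun("fact", "n",
         ite(equal(var "n", num 0),
             num 1,
             times(var "n", app(var "fact", minus(var "n", num 1)))),
         app(var "fact", num 4));

val result = eval fact4 emptyenv;   (* num 24 *)
```

Each lookup of fact re-installs the binding of fact to its own recursive closure, which is what makes the recursive call in the body find its definition.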

    (* Interpreter of eager PCF in ML *********************************** *)

    (* datatype term (same as before) *********************************** *)

    datatype term = var of string                     (* variables *)
                  | abs of string * term              (* abstractions *)
                  | app of term * term                (* applications *)
                  | num of int                        (* numbers *)
                  | tt | ff                           (* booleans *)
                  | plus of term * term               (* arithmetical operation *)
                  | minus of term * term              (* arithmetical operation *)
                  | times of term * term              (* arithmetical operation *)
                  | divis of term * term              (* arithmetical operation *)
                  | equal of term * term              (* comparison operation *)
                  | pair of term * term               (* pair *)
                  | fst of term | snd of term         (* projections *)
                  | cons of term * term               (* list constructor *)
                  | empty                             (* empty list *)
                  | head of term | tail of term       (* head and tail of lists *)
                  | ite of term * term * term         (* conditional *)
                  | letvar of string * term * term    (* local declaration *)
                  | fix of term                       (* fixpoint *)
                  | closure of term * (string * term) list   (* closure *)
                    (* (string * term) list is an environment (see below) *)
                  | error;                            (* erroneous situation *)

    (* type environment (same as before) ******************************* *)

    type environment = (string * term) list;

    val emptyenv: environment = [];

    fun update (r:environment) (x:string) (M:term) = (x,M)::r ;

    fun lookup ([]:environment) (x:string) = error
      | lookup ((y,M)::r) x = if x = y then M else lookup r x;

    (* Function eval **************************************************** *)

    fun eval (var x) r = eval (lookup r x) r
      | eval (abs(x,M)) r = closure(abs(x,M),r)
      | eval (app(M,N)) r = ( case eval M r of
                                closure(abs(x,M1),r1) => eval M1 (update r1 x (eval N r))
                              | x => error )    (* any other case is an error *)
      | eval (num n) r = (num n)
      | eval tt r = tt
      | eval ff r = ff
      | eval (plus(M,N)) r = ( case (eval M r, eval N r) of
                                 (num m, num n) => num(m + n)
                               | x => error )
      | eval (minus(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m - n)
                                | x => error )
      | eval (times(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m * n)
                                | x => error )
      | eval (divis(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => num(m div n)
                                | x => error )
      | eval (equal(M,N)) r = ( case (eval M r, eval N r) of
                                  (num m, num n) => if m = n then tt else ff
                                | x => error )
      | eval (pair(M,N)) r = pair(eval M r,eval N r)
                             (* we could give error if one component is error *)
      | eval (fst M) r = ( case eval M r of
                             pair(P,Q) => P
                           | x => error )
      | eval (snd M) r = ( case eval M r of
                             pair(P,Q) => Q
                           | x => error )
      | eval (cons(M,N)) r = cons(eval M r,eval N r)
                             (* we could give error if one component is error *)
      | eval empty r = empty
      | eval (head M) r = ( case eval M r of
                              cons(P,Q) => P
                            | x => error )
      | eval (tail M) r = ( case eval M r of
                              cons(P,Q) => Q
                            | x => error )
      | eval (ite(M,N,P)) r = ( case (eval M r) of
                                  tt => eval N r
                                | ff => eval P r
                                | x => error )
      | eval (letvar(x,M,N)) r = eval N (update r x (eval M r ))
      | eval (fix M) r = ( case eval M r of
                             closure(abs(x,M1),r1) => eval M1 (update r1 x (closure(fix M,r)))
                           | x => error )
      | eval (closure(M,r1)) r = eval M r1
      | eval error r = error;

Note: Instead of using a case statement, we could have used exceptions.

The error case could also carry a message describing its cause, if it is declared in the datatype as

    error of string

Then, each time an error is generated, the result of eval should be

    error(< cause of the error >)

as, for instance, in

    fun lookup ([]:environment) (x:string) = error("undeclared variable")
      | lookup ((y,M)::r) x = if x = y then M else lookup r x;
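
The exception-based alternative mentioned in the note could be sketched as follows. This is only an illustration on a cut-down term language; the exception name EvalError and the helper run are hypothetical, and the full interpreter would need the same treatment in every case that currently returns error.

```sml
(* Cut-down term language: a stand-in for the full datatype of these notes *)
datatype term = var of string | num of int
              | plus of term * term | divis of term * term;

type environment = (string * term) list;

(* EvalError is a hypothetical name; it carries the cause of the error *)
exception EvalError of string;

fun lookup ([]:environment) (x:string) =
      raise EvalError ("undeclared variable " ^ x)
  | lookup ((y,M)::r) x = if x = y then M else lookup r x;

fun eval (var x) r = eval (lookup r x) r
  | eval (num n) r = num n
  | eval (plus(M,N)) r =
      (case (eval M r, eval N r) of
           (num m, num n) => num (m + n)
         | _ => raise EvalError "plus applied to a non-number")
  | eval (divis(M,N)) r =
      (case (eval M r, eval N r) of
           (num _, num 0) => raise EvalError "division by zero"
         | (num m, num n) => num (m div n)
         | _ => raise EvalError "divis applied to a non-number");

(* run (hypothetical helper): evaluate in the empty environment and
   report the cause of a failure instead of aborting *)
fun run M = SOME (eval M [])
            handle EvalError msg => (print ("error: " ^ msg ^ "\n"); NONE);

val ok   = run (plus(num 1, num 2));    (* SOME (num 3) *)
val bad1 = run (var "x");               (* NONE, after printing the cause *)
val bad2 = run (divis(num 1, num 0));   (* NONE, after printing the cause *)
```

Compared with the error constructor, an exception propagates by itself to the nearest handler, so the intermediate cases of eval do not need to test for it explicitly.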