Metaprogramming:
An Evaluator for a Simple Language
To build an evaluator, first we should understand the concrete syntax (grammar) of a language. We can use it to write an ML datatype to represent the abstract syntax of a program. Then we can write an evaluation function which takes an object of that datatype and returns the result.
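Suppose we have a simple expression language with integer and boolean constants, variables, binary operators (addition, subtraction, and the comparisons less-than, greater-than, and equality), conditional expressions, and let bindings.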
We can generate a concrete syntax for our language:
Op ::= PlusOp | MinusOp | LTOp | GTOp | EqualOp
Exp ::= N | true | false | Id | ( Exp ) | Exp Op Exp
      | if Exp then Exp else Exp
      | let Id = Exp in Exp end
The grammar above allows us to form our abstract syntax in ML:
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp
datatype Exp = IntConst of int
             | BoolConst of bool
             | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)
Program Representation
Now we can represent programs using the expression constructors, which take "arguments" representing constants and subexpressions. Take the following simple expressions:
26
4 + x
if x - 7 < y then x else y - x
let x = 4 in if x > y then x - y else x + 1 end
NOTE: Some of these expressions would produce an error in a typechecker.
Ordinarily, a parser would convert these
into abstract syntax for us. However,
rather than writing a parser (UGH!), we can write small programs directly in
abstract syntax.
The programs above would be translated as follows:
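Expression: | SML Expression Structure:
26 | IntConst 26
4 + x | Op( IntConst 4, PlusOp, Var "x" )
if x - 7 < y then x else y - x | If( Op( Op( Var "x", MinusOp, IntConst 7 ), LTOp, Var "y" ), Var "x", Op( Var "y", MinusOp, Var "x" ) )
let x = 4 in if x > y then x - y else x + 1 end | Let( "x", IntConst 4, If( Op( Var "x", GTOp, Var "y" ), Op( Var "x", MinusOp, Var "y" ), Op( Var "x", PlusOp, IntConst 1 ) ) )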
These structures are simply a different way of representing the syntax trees (abstract syntax) of each program. Each one has ML type Exp.
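To see that these structures really are trees, it can help to walk one and print it back out as concrete syntax. The following is only a sketch: the helper names operToString and expToString are made up for this illustration, and the operator symbols (in particular "=" for EqualOp) are assumed rather than fixed by the grammar above.

```sml
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp

datatype Exp = IntConst of int
             | BoolConst of bool
             | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)

(* Hypothetical helper: an assumed concrete symbol for each operator. *)
fun operToString PlusOp  = "+"
  | operToString MinusOp = "-"
  | operToString LTOp    = "<"
  | operToString GTOp    = ">"
  | operToString EqualOp = "="

(* Hypothetical helper: print a syntax tree back as concrete syntax.
   Binary operations are fully parenthesized, so no precedence rules are needed. *)
fun expToString (IntConst n)  = Int.toString n
  | expToString (BoolConst b) = Bool.toString b
  | expToString (Var x)       = x
  | expToString (Op (e1, oper, e2)) =
      "(" ^ expToString e1 ^ " " ^ operToString oper ^ " " ^ expToString e2 ^ ")"
  | expToString (If (e1, e2, e3)) =
      "if " ^ expToString e1 ^ " then " ^ expToString e2 ^ " else " ^ expToString e3
  | expToString (Let (x, e1, e2)) =
      "let " ^ x ^ " = " ^ expToString e1 ^ " in " ^ expToString e2 ^ " end"

val sample = Op (IntConst 4, PlusOp, Var "x")
val text = expToString sample   (* "(4 + x)" *)
```

Each clause of expToString handles exactly one constructor of Exp, which is the same shape the evaluator will have.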
Our task is to write an evaluator to calculate the result of an expression in our language. Since the value can be an integer or a boolean, we need to be able to return either type of value. However, we don't want to be able to return any arbitrary expression, so we don't reuse the IntConst and BoolConst expressions.
Instead, we must create a new datatype:
datatype Val = IntVal of int | BoolVal of bool
In addition, we need a type for the value environment, a list of pairs of type (string * Val) mapping variable names to their values.
type ValEnv = (string * Val) list
We'll call this value environment venv when we refer to it in our evaluator.
Environment Operations: Lookup and Bind
We need to implement three things to build a working evaluator: a lookup function, a bind function, and the eval function itself.
We will ignore what happens if our expression is not typed properly (this is really the domain of a typechecker).
Let's think about the lookup and bind functions...
lookup is similar to member: we need to determine whether an item occurs in a list, but rather than just returning a boolean indicating whether it is in the list, we need to return the value bound to the variable. Therefore lookup will have type (string * ValEnv) -> Val, and it will need to tell us when a variable is not in the environment. To do that, we will create an exception called Undeclared and raise it if a variable is not found in the environment.
Take a look at the following code:
exception Undeclared

fun lookup( x, nil ) = raise Undeclared (* variable undeclared! throw a fit *)
  | lookup( x, ( v, vVal )::venv ) = if x=v then vVal else lookup(x,venv)
When our evaluator calls lookup, it should ideally be prepared to handle the Undeclared exception, or execution could halt with a message like "Uncaught Exception: Undeclared". However, we will ignore this case here.
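For readers who do want to handle the exception, one possible approach is sketched below. The wrapper tryLookup and the sample environment are hypothetical names invented for this example; the wrapper simply converts the exception into an option value.

```sml
datatype Val = IntVal of int | BoolVal of bool
type ValEnv = (string * Val) list

exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal)::venv) = if x = v then vVal else lookup (x, venv)

(* Hypothetical wrapper: catch Undeclared and return NONE instead. *)
fun tryLookup (x, venv) = SOME (lookup (x, venv)) handle Undeclared => NONE

(* A sample environment, made up for this example. *)
val env : ValEnv = [("x", IntVal 4), ("flag", BoolVal true)]

val found   = tryLookup ("flag", env)   (* SOME (BoolVal true) *)
val missing = tryLookup ("y", env)      (* NONE *)
```

Whether to use an option or let the exception propagate is a design choice; the rest of these notes keep the plain lookup.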
We also need a function to add variable bindings to the environment. We will need to give the function a variable and its bound value, and it will return the environment containing the newly-bound variable.
Our bind function is fairly simple; it should have type (string * Val * ValEnv) -> ValEnv, and will essentially cons the new binding onto the front of the list. The code looks a little something like this:
fun bind( s, v, venv ) = (s,v)::venv
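Because bind conses onto the front and lookup searches from the front, a newer binding shadows an older one with the same name, while the older environment is left untouched. A small sketch of that behavior follows; the names env0, env1, and env2 are made up for illustration.

```sml
datatype Val = IntVal of int | BoolVal of bool
type ValEnv = (string * Val) list

exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal)::venv) = if x = v then vVal else lookup (x, venv)

fun bind (s, v, venv) = (s, v)::venv

val env0 : ValEnv = nil
val env1 = bind ("x", IntVal 4, env0)
val env2 = bind ("x", IntVal 10, env1)   (* shadows the earlier x *)

val inner = lookup ("x", env2)   (* IntVal 10: the newest binding wins *)
val outer = lookup ("x", env1)   (* IntVal 4: env1 itself is unchanged *)
```

This is what will give let its scoping behavior later: the body sees the extended environment, and everything outside the let still sees the original one.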
Now that we can add variables and their bindings, we have everything we need to evaluate expressions.
Note that lookup and bind actually have the following generic types (the ''a in lookup is an equality type variable, because lookup compares names with =):
lookup : (''a * (''a * 'b) list) -> 'b
bind : ('a * 'b * ('a * 'b) list) -> ('a * 'b) list
Our eval function should take an expression, evaluate it starting from an initial value environment (empty at the start), and return the value of the expression. We can work some magic here to make our life just a bit easier: a local helper ev carries the environment, so callers of eval never have to supply one.
fun eval e =
  let
    fun ev(expr, venv) = ... (* write our cases here *)
  in
    ev(e, nil)
  end
Note: A full evaluator would raise exceptions here for undeclared identifiers and mistyped expressions. Apart from the Undeclared exception raised by lookup, we don't, for brevity's sake.
Now we can look at the rules for operational semantics (evaluation).
Getting Value Out of Expressions: the eval Function
We'll consider the rules for each kind of expression one at a time, allowing us to write the eval function case-by-case. (We'll use the symbol => to mean "evaluates to".)
The first three rules are:
N => N
true => true
false => false
This provides us with the following two cases:
fun ev( IntConst n, venv ) = IntVal n
  | ev( BoolConst b, venv ) = BoolVal b
The other cases are less trivial (for example, binary operations). The rule is as follows:
e1 Op e2 => v3
if e1 => v1, e2 => v2, and v3 = v1 Op v2
We will use a general function of type (Val * Oper * Val) -> Val to compute the value of e1 Op e2:
fun Operate(IntVal n1, PlusOp, IntVal n2) = IntVal(n1 + n2)
  | Operate(IntVal n1, MinusOp, IntVal n2) = IntVal(n1 - n2)
  | Operate(IntVal n1, LTOp, IntVal n2) = BoolVal(n1 < n2)
  | Operate(IntVal n1, GTOp, IntVal n2) = BoolVal(n1 > n2)
  | Operate(v1, EqualOp, v2) = BoolVal(v1 = v2)
This allows us to write:
  | ev( Op(e1, oper, e2), venv ) =
      let
        val v1 = ev( e1, venv );
        val v2 = ev( e2, venv )
      in
        Operate(v1, oper, v2)
      end
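A quick way to get a feel for Operate is to call it directly on a few values. The sketch below does that and also shows what happens with a mistyped operation: since no clause matches, SML raises the built-in Match exception (the compiler will also warn that the clauses are not exhaustive). The value names sum, less, same, and bad are made up for this example.

```sml
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp
datatype Val = IntVal of int | BoolVal of bool

fun Operate (IntVal n1, PlusOp, IntVal n2)  = IntVal (n1 + n2)
  | Operate (IntVal n1, MinusOp, IntVal n2) = IntVal (n1 - n2)
  | Operate (IntVal n1, LTOp, IntVal n2)    = BoolVal (n1 < n2)
  | Operate (IntVal n1, GTOp, IntVal n2)    = BoolVal (n1 > n2)
  | Operate (v1, EqualOp, v2)               = BoolVal (v1 = v2)

val sum  = Operate (IntVal 4, PlusOp, IntVal 3)             (* IntVal 7 *)
val less = Operate (IntVal 4, LTOp, IntVal 3)               (* BoolVal false *)
val same = Operate (BoolVal true, EqualOp, BoolVal true)    (* BoolVal true *)

(* true + 1 is mistyped: every clause fails to match, so Match is raised. *)
val bad = (Operate (BoolVal true, PlusOp, IntVal 1); "no error")
          handle Match => "Match raised"
```

Note that EqualOp accepts any pair of values, so comparing an integer with a boolean simply yields BoolVal false rather than an error.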
let and if expressions
The rule for variable evaluation is:
x => n (x is a variable name)
if venv(x) = n
This means we have to look up the variable in the environment and get its value (as bound in the environment). Now we can write the simple pattern:
| ev( Var x, venv ) = lookup(x,venv)
The rules for if expressions are:
if e1 then e2 else e3 => v2
if e1 => true and e2 => v2

if e1 then e2 else e3 => v3
if e1 => false and e3 => v3
The code to implement this is:
  | ev( If(e1, e2, e3), venv ) =
      let
        val BoolVal v1 = ev( e1, venv )
      in
        if v1 then ev( e2, venv ) else ev( e3, venv )
      end
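The binding val BoolVal v1 = ... only succeeds when the condition evaluates to a boolean; if it evaluates to an IntVal, SML raises the built-in Bind exception. As a sketch of one alternative, the condition can be inspected with a case expression and a more descriptive exception raised instead. The example below uses a deliberately trimmed-down language (constants and if only, no environment) to stay short, and the exception name TypeError is an assumption made for this illustration, not part of the evaluator in these notes.

```sml
(* Trimmed-down datatypes for this example only. *)
datatype Exp = IntConst of int
             | BoolConst of bool
             | If of (Exp * Exp * Exp)

datatype Val = IntVal of int | BoolVal of bool

(* Hypothetical exception for mistyped programs. *)
exception TypeError of string

fun ev (IntConst n)  = IntVal n
  | ev (BoolConst b) = BoolVal b
  | ev (If (e1, e2, e3)) =
      (case ev e1 of
           BoolVal true  => ev e2
         | BoolVal false => ev e3
         | IntVal _      => raise TypeError "if condition must be a boolean")

val ok  = ev (If (BoolConst false, IntConst 1, IntConst 2))   (* IntVal 2 *)

(* A non-boolean condition now produces a readable message. *)
val msg = (ev (If (IntConst 0, IntConst 1, IntConst 2)); "no error")
          handle TypeError m => m
```

Only the branch that is selected gets evaluated, exactly as in the two rules above.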
And, lastly, the evaluation rule for a let is:
let x = e1 in e2 end => v2 (x is a variable name)
if e1 => v1 and e2 => v2 (where e2 is evaluated with x bound to v1 in the environment)
Here we have to call the bind function after evaluating e1 so that we can assign x the value of e1 in the environment.
The function pattern looks like:
  | ev( Let(x, e1, e2), venv ) =
      let
        val v1 = ev( e1, venv );
        val v2 = ev( e2, bind(x, v1, venv) )
      in
        v2
      end
The Big Picture:
Here is the code all put together:
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp

datatype Exp = IntConst of int
             | BoolConst of bool
             | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)

datatype Val = IntVal of int | BoolVal of bool

type ValEnv = (string * Val) list

exception Undeclared

fun lookup( x, nil ) = raise Undeclared (* variable undeclared! throw a fit *)
  | lookup( x, ( v, vVal )::venv ) = if x=v then vVal else lookup(x,venv)

fun bind( s, v, venv ) = (s,v)::venv

fun eval e =
  let
    (* apply a binary operator to two values *)
    fun Operate(IntVal n1, PlusOp, IntVal n2) = IntVal(n1 + n2)
      | Operate(IntVal n1, MinusOp, IntVal n2) = IntVal(n1 - n2)
      | Operate(IntVal n1, LTOp, IntVal n2) = BoolVal(n1 < n2)
      | Operate(IntVal n1, GTOp, IntVal n2) = BoolVal(n1 > n2)
      | Operate(v1, EqualOp, v2) = BoolVal(v1 = v2)

    (* evaluate an expression in a value environment *)
    fun ev( IntConst n, venv ) = IntVal n
      | ev( BoolConst b, venv ) = BoolVal b
      | ev( Op(e1, oper, e2), venv ) =
          let
            val v1 = ev( e1, venv );
            val v2 = ev( e2, venv )
          in
            Operate(v1, oper, v2)
          end
      | ev( Var x, venv ) = lookup(x,venv)
      | ev( If(e1, e2, e3), venv ) =
          let
            val BoolVal v1 = ev( e1, venv )
          in
            if v1 then ev( e2, venv ) else ev( e3, venv )
          end
      | ev( Let(x, e1, e2), venv ) =
          let
            val v1 = ev( e1, venv );
            val v2 = ev( e2, bind(x, v1, venv) )
          in
            v2
          end
  in
    ev(e, nil) (* call the evaluator with an empty environment *)
  end
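To try the evaluator out, we can feed it a few small programs written directly in abstract syntax. The sketch below repeats the definitions in compact form so that it can be loaded on its own; the sample programs prog1 and prog2 and the result names r1, r2, and r3 are made up for this example.

```sml
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp
datatype Exp = IntConst of int | BoolConst of bool | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)
datatype Val = IntVal of int | BoolVal of bool
type ValEnv = (string * Val) list
exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal)::venv) = if x = v then vVal else lookup (x, venv)

fun bind (s, v, venv) = (s, v)::venv

fun eval e =
  let
    fun Operate (IntVal n1, PlusOp, IntVal n2)  = IntVal (n1 + n2)
      | Operate (IntVal n1, MinusOp, IntVal n2) = IntVal (n1 - n2)
      | Operate (IntVal n1, LTOp, IntVal n2)    = BoolVal (n1 < n2)
      | Operate (IntVal n1, GTOp, IntVal n2)    = BoolVal (n1 > n2)
      | Operate (v1, EqualOp, v2)               = BoolVal (v1 = v2)
    fun ev (IntConst n, venv)  = IntVal n
      | ev (BoolConst b, venv) = BoolVal b
      | ev (Op (e1, oper, e2), venv) = Operate (ev (e1, venv), oper, ev (e2, venv))
      | ev (Var x, venv) = lookup (x, venv)
      | ev (If (e1, e2, e3), venv) =
          let val BoolVal v1 = ev (e1, venv)
          in if v1 then ev (e2, venv) else ev (e3, venv) end
      | ev (Let (x, e1, e2), venv) = ev (e2, bind (x, ev (e1, venv), venv))
  in
    ev (e, nil)
  end

(* let x = 4 in if x > 2 then x - 1 else x + 1 end *)
val prog1 = Let ("x", IntConst 4,
                 If (Op (Var "x", GTOp, IntConst 2),
                     Op (Var "x", MinusOp, IntConst 1),
                     Op (Var "x", PlusOp, IntConst 1)))
val r1 = eval prog1   (* IntVal 3 *)

(* let x = 1 in let x = x + 1 in x end end: the inner x shadows the outer one *)
val prog2 = Let ("x", IntConst 1,
                 Let ("x", Op (Var "x", PlusOp, IntConst 1), Var "x"))
val r2 = eval prog2   (* IntVal 2 *)

(* 4 + x with no binding for x: lookup raises Undeclared *)
val r3 = (eval (Op (IntConst 4, PlusOp, Var "x")); "no error")
         handle Undeclared => "Undeclared"
```

The last case is the situation the notes chose to ignore earlier: without the handler, execution would stop with an uncaught Undeclared exception.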
Now that we've learned how to build an expression evaluator in ML, you can go out and write your own typechecker to go with it!
Please email me, Mike Nidel (nidel@cse.psu.edu) or Dr. Palamidessi if you have any questions regarding today's material.