CSE 428: Evaluator Lecture

Metaprogramming:

An Evaluator for a Simple Language

To build an evaluator, first we should understand the concrete syntax (grammar) of a language. We can use it to write an ML datatype to represent the abstract syntax of a program. Then we can write an evaluation function which takes an object of that datatype and returns the result.

Suppose we have a simple expression language with the following functionalities:

Integer and boolean constants
Variable references
Arithmetic Expressions (plus and minus)
Relational (boolean) expressions: Less than, Greater than, Equal
Conditional (if-then-else) expressions
Let expressions with a single declaration
Parentheses

We can generate a concrete syntax for our language:

 Op ::= PlusOp | MinusOp | LTOp | GTOp | EqualOp

 Exp ::= N | true | false | Id | ( Exp )
  |      Exp Op Exp
  |      if Exp then Exp else Exp
  |      let Id = Exp in Exp end

The grammar above allows us to form our abstract syntax in ML:

 datatype Oper = PlusOp  |  MinusOp  |  LTOp
              |  GTOp  |  EqualOp

 datatype Exp = IntConst of int
            |   BoolConst of bool
            |   Var of string
            |   Op of (Exp * Oper * Exp)
            |   If of (Exp * Exp * Exp)
            |   Let of (string * Exp * Exp)

Program Representation

Now we can represent programs using the expression constructors, which take "arguments" representing constants and subexpressions. Take the following simple expressions:

 26

 4 + x

 if x - 7 < y
 then x
 else y - x
 
 let x = 4 in
   if x > y then
     x - y
   else
     x + 1
 end

NOTE: Some of these expressions would produce an error in a typechecker.

Ordinarily, a parser would convert these into abstract syntax for us. However, rather than writing a parser (UGH!), we can write small programs directly in abstract syntax.

The programs above would be translated as follows:

Expression:	SML Expression Structure:
`26`	`IntConst 26`
`4 + x`	`Op( IntConst 4, PlusOp, Var "x" )`
`if x - 7 < y then x else y - x`	`If( Op( Op( Var "x", MinusOp, IntConst 7 ), LTOp, Var "y" ), Var "x", Op( Var "y", MinusOp, Var "x" ) )`
`let x = 4 in if x > y then x - y else x + 1 end`	`Let( "x", IntConst 4, If( Op( Var "x", GTOp, Var "y" ), Op( Var "x", MinusOp, Var "y" ), Op( Var "x", PlusOp, IntConst 1 ) ) )`

These structures are simply a different way of representing the syntax trees (abstract syntax) of each program. Each one would have ML type Exp.
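To see that such a structure really is just the syntax tree of the program, here is a minimal sketch of a function that walks an Exp and prints it back in concrete syntax. The names operToString and unparse are hypothetical helpers, not part of the lecture code, and the sketch parenthesizes every binary operation rather than tracking precedence.

```sml
(* Abstract syntax repeated from the lecture so the sketch runs on its own. *)
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp

datatype Exp = IntConst of int
             | BoolConst of bool
             | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)

(* Hypothetical helper: the concrete symbol for each operator. *)
fun operToString PlusOp  = "+"
  | operToString MinusOp = "-"
  | operToString LTOp    = "<"
  | operToString GTOp    = ">"
  | operToString EqualOp = "="

(* Hypothetical helper: one clause per constructor, recursing on subtrees. *)
fun unparse (IntConst n) = Int.toString n
  | unparse (BoolConst b) = Bool.toString b
  | unparse (Var x) = x
  | unparse (Op (e1, oper, e2)) =
      "(" ^ unparse e1 ^ " " ^ operToString oper ^ " " ^ unparse e2 ^ ")"
  | unparse (If (e1, e2, e3)) =
      "if " ^ unparse e1 ^ " then " ^ unparse e2 ^ " else " ^ unparse e3
  | unparse (Let (x, e1, e2)) =
      "let " ^ x ^ " = " ^ unparse e1 ^ " in " ^ unparse e2 ^ " end"

(* The third example program from the list above. *)
val prog =
  If (Op (Op (Var "x", MinusOp, IntConst 7), LTOp, Var "y"),
      Var "x",
      Op (Var "y", MinusOp, Var "x"))

val shown = unparse prog   (* "if ((x - 7) < y) then x else (y - x)" *)
```

The evaluator developed below has the same shape: one clause per constructor, with recursive calls on the subexpressions.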

Preliminaries of Evaluating Expressions

Our task is to write an evaluator to calculate the result of an expression in our language. Since the value can be an integer or a boolean, we need to be able to return either type of value. However, we don't want to be able to return any arbitrary expression, so we don't reuse the IntConst and BoolConst expressions. Instead, we must create a new datatype:

datatype Val = IntVal of int  |  BoolVal of bool

In addition, we need to define a type for the value environment, a list of pairs of type (string * Val) that maps variable names to their values.

type ValEnv = (string * Val) list

We'll call this value environment venv when we refer to it in our evaluator.

Environment Operations:

Lookup and Bind

We need to implement the following things to build a working evaluator:

Functions to add a binding to the environment and look up a variable in it
A function to traverse the structure of an expression and determine its value

We will ignore what happens if our expression is not typed properly (this is really the domain of a typechecker).

Let's think about the lookup and bind functions...

lookup is similar to member: we need to determine whether an item occurs in a list. But rather than just returning a boolean indicating whether the item is in the list, we need to return the value bound to the variable. lookup will therefore have type (string * ValEnv) -> Val, and it must also tell us when a variable is not in the environment. To do so, we create an exception called Undeclared and raise it if a variable is not found.

Take a look at the following code:

exception Undeclared

 fun lookup( x, nil ) = raise Undeclared   
      (* variable undeclared! throw a fit *)

  |  lookup( x, ( v, vVal )::venv ) =
       if x=v then vVal else lookup(x,venv)

When our evaluator calls lookup, it should ideally be prepared to handle the Undeclared exception, or execution could halt with a message like "Uncaught Exception: Undeclared". However, we will ignore this case here.
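As a rough illustration of what "being prepared" could look like, the sketch below wraps lookup in a handler. The helper lookupOpt and the sample environment env are assumptions made for this example only; the evaluator in this lecture does not use them.

```sml
(* Definitions repeated from the lecture so the sketch runs on its own. *)
datatype Val = IntVal of int | BoolVal of bool

type ValEnv = (string * Val) list

exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal) :: venv) =
      if x = v then vVal else lookup (x, venv)

(* Hypothetical helper: returns NONE instead of letting Undeclared escape. *)
fun lookupOpt (x, venv : ValEnv) : Val option =
  SOME (lookup (x, venv)) handle Undeclared => NONE

(* Sample environment, made up for this example. *)
val env : ValEnv = [("x", IntVal 4), ("flag", BoolVal true)]

val found   = lookupOpt ("x", env)   (* SOME (IntVal 4) *)
val missing = lookupOpt ("y", env)   (* NONE *)
```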

We also need a function to add variable bindings to the environment. We will need to give the function a variable and its bound value, and it will return the environment containing the newly-bound variable.

Binding Variables and Evaluation

Our bind function is fairly simple; it should have type (string * Val * ValEnv) -> ValEnv, and will essentially cons the new binding onto the front of the list. The code looks a little something like this:

fun bind( s, v, venv ) = (s,v)::venv

Now that we can add variables and their bindings, we have everything we need to evaluate expressions.

Note that lookup and bind actually have the following generic types (the ''a in lookup is an equality type variable, because lookup compares keys with =):

lookup : (''a * (''a * 'b) list) -> 'b
bind : ('a * 'b * ('a * 'b) list) -> ('a * 'b) list
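The short sketch below tries both functions out. It is only an illustration: the environments env1, env2, and names are made up here, and plain ints stand in for Val to keep it small. It shows that a newer binding shadows an older one, because bind conses onto the front and lookup searches from the front, and that the same two functions work with other key and value types.

```sml
exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal) :: venv) =
      if x = v then vVal else lookup (x, venv)

fun bind (s, v, venv) = (s, v) :: venv

(* Shadowing: the newest binding for "x" is at the front, so lookup finds it first. *)
val env1 = bind ("x", 1, nil)
val env2 = bind ("x", 2, env1)

val inner = lookup ("x", env2)   (* 2 *)
val outer = lookup ("x", env1)   (* 1: env1 itself is unchanged *)

(* The same functions with int keys and string values. *)
val names = bind (1, "one", bind (2, "two", nil))
val two = lookup (2, names)      (* "two" *)
```

Because bind returns a new list and leaves the old one untouched, an inner binding disappears automatically once we go back to using the outer environment. This is what makes let scoping work in the evaluator.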

Our eval function should take an expression and return its value, evaluating it in an initial value environment (empty at the start). We can work some magic here to make our life just a bit easier: eval wraps a helper function ev that carries the environment along.

fun eval e =
  let
    fun ev(expr, venv) = ... (* write our cases here *)
  in
    ev(e, nil)
  end

Note: A complete evaluator would report undeclared identifiers and mistyped expressions with proper error handling here. We don't, for brevity's sake.

Now we can look at the rules for operational semantics (evaluation).

Getting Value Out of Expressions:

the eval Function

We'll consider the rules for each kind of expression one at a time, allowing us to write the eval function case-by-case. (We'll use the symbol => to mean "evaluates to".)

The first three rules are:

N => N
true => true
false => false

This provides us with the following two cases:

fun ev( IntConst n, venv ) = IntVal n
 |  ev( BoolConst b, venv ) = BoolVal b

The other cases are less trivial. For binary operations, the rule is as follows:

e1 Op e2 => v3

if
e1 => v1, e2 => v2, and
v3 = v1 Op v2

We will use a generalized function of type (Val * Oper * Val) -> Val to compute the value of e1 Op e2:

fun Operate(IntVal n1, PlusOp, IntVal n2) = IntVal(n1 + n2)
 |  Operate(IntVal n1, MinusOp, IntVal n2) = IntVal(n1 - n2)
 |  Operate(IntVal n1, LTOp, IntVal n2) = BoolVal(n1 < n2)
 |  Operate(IntVal n1, GTOp, IntVal n2) = BoolVal(n1 > n2)
 |  Operate(v1, EqualOp, v2) = BoolVal(v1 = v2)

This allows us to write:

 |  ev( Op(e1, oper, e2), venv ) =
      let val v1 = ev( e1, venv );
          val v2 = ev( e2, venv )
      in
        Operate(v1, oper, v2)
      end
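To get a feel for Operate on its own, here is a small sketch that applies it to a few values directly. The sample calls are made up for illustration. The last one shows what "ignoring" mistyped expressions means in practice: no clause matches, so ML raises its built-in Match exception (and the compiler warns that the match is not exhaustive).

```sml
(* Definitions repeated from the lecture so the sketch runs on its own. *)
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp

datatype Val = IntVal of int | BoolVal of bool

fun Operate (IntVal n1, PlusOp, IntVal n2) = IntVal (n1 + n2)
  | Operate (IntVal n1, MinusOp, IntVal n2) = IntVal (n1 - n2)
  | Operate (IntVal n1, LTOp, IntVal n2) = BoolVal (n1 < n2)
  | Operate (IntVal n1, GTOp, IntVal n2) = BoolVal (n1 > n2)
  | Operate (v1, EqualOp, v2) = BoolVal (v1 = v2)

val sum   = Operate (IntVal 4, PlusOp, IntVal 3)            (* IntVal 7 *)
val less  = Operate (IntVal 4, LTOp, IntVal 3)              (* BoolVal false *)
val same  = Operate (BoolVal true, EqualOp, BoolVal true)   (* BoolVal true *)

(* EqualOp accepts any two values, so comparing an int with a bool is just false. *)
val mixed = Operate (IntVal 1, EqualOp, BoolVal true)       (* BoolVal false *)

(* true + 1 matches no clause, so the Match exception is raised. *)
val illTyped =
  (Operate (BoolVal true, PlusOp, IntVal 1); "no error")
  handle Match => "Match raised"
```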

Evaluating variables and let and if expressions

The rule for variable evaluation is:

x => n (x is a variable name)

if
venv(x) = n

This means we have to look up the variable in the environment and get its value (as bound in the environment). Now we can write the simple pattern:

 |  ev( Var x, venv ) = lookup(x,venv)

The rules for if expressions are:

if e1 then e2 else e3 => v2

if
e1 => true and
e2 => v2

if e1 then e2 else e3 => v3

if
e1 => false and
e3 => v3

The code to implement this is:

 |  ev( If(e1, e2, e3), venv ) =
      let val BoolVal v1 = ev( e1, venv )
      in
        if v1 then ev( e2, venv ) 
              else ev( e3, venv )
      end

And, lastly, the evaluation rule for a let is:

let x = e1 in e2 end => v2 (x is a variable name)

if
e1 => v1
and
e2 => v2, where e2 is evaluated in the environment extended with x bound to v1

Here we have to call the bind function after evaluating e1 so that we can assign x the value of e1 in the environment.

The function pattern looks like:

 |  ev( Let(x, e1, e2), venv ) =
      let val v1 = ev( e1, venv );
          val v2 = ev( e2, bind(x, v1, venv) )
      in
        v2
      end

The Big Picture:

Here is the code all put together:

 datatype Oper = PlusOp  |  MinusOp  |  LTOp
              |  GTOp  |  EqualOp

 datatype Exp = IntConst of int
            |   BoolConst of bool
            |   Var of string
            |   Op of (Exp * Oper * Exp)
            |   If of (Exp * Exp * Exp)
            |   Let of (string * Exp * Exp)

datatype Val = IntVal of int  |  BoolVal of bool

type ValEnv = (string * Val) list

exception Undeclared

fun lookup( x, nil ) = raise Undeclared   
      (* variable undeclared! throw a fit *)
 |  lookup( x, ( v, vVal )::venv ) =
       if x=v then vVal else lookup(x,venv)

fun bind( s, v, venv ) = (s,v)::venv

fun eval e =
  let
    fun Operate(IntVal n1, PlusOp, IntVal n2) = IntVal(n1 + n2)
     |  Operate(IntVal n1, MinusOp, IntVal n2) = IntVal(n1 - n2)
     |  Operate(IntVal n1, LTOp, IntVal n2) = BoolVal(n1 < n2)
     |  Operate(IntVal n1, GTOp, IntVal n2) = BoolVal(n1 > n2)
     |  Operate(v1, EqualOp, v2) = BoolVal(v1 = v2)

    fun ev( IntConst n, venv ) = IntVal n
     |  ev( BoolConst b, venv ) = BoolVal b
     |  ev( Op(e1, oper, e2), venv ) =
          let val v1 = ev( e1, venv );
              val v2 = ev( e2, venv )
          in
            Operate(v1, oper, v2)
          end
     |  ev( Var x, venv ) = lookup(x,venv)
     |  ev( If(e1, e2, e3), venv ) =
          let val BoolVal v1 = ev( e1, venv )
          in
            if v1 then ev( e2, venv ) 
            else ev( e3, venv )
          end
     |  ev( Let(x, e1, e2), venv ) =
          let val v1 = ev( e1, venv );
              val v2 = ev( e2, bind(x, v1, venv) )
          in
            v2
          end
  in
    ev(e, nil) (* call the evaluator with an empty environment *)
  end
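To try the evaluator out, the sketch below repeats the definitions in a slightly condensed form (so that it can be loaded on its own) and runs a few small programs. The test programs are assumptions made for this example. In particular, the first one is a variant of the earlier let example with the free variable y replaced by the constant 3, since a free variable cannot be evaluated in an empty environment.

```sml
datatype Oper = PlusOp | MinusOp | LTOp | GTOp | EqualOp
datatype Exp = IntConst of int | BoolConst of bool | Var of string
             | Op of (Exp * Oper * Exp)
             | If of (Exp * Exp * Exp)
             | Let of (string * Exp * Exp)
datatype Val = IntVal of int | BoolVal of bool
type ValEnv = (string * Val) list
exception Undeclared

fun lookup (x, nil) = raise Undeclared
  | lookup (x, (v, vVal) :: venv) = if x = v then vVal else lookup (x, venv)

fun bind (s, v, venv) = (s, v) :: venv

fun eval e =
  let
    fun Operate (IntVal n1, PlusOp, IntVal n2) = IntVal (n1 + n2)
      | Operate (IntVal n1, MinusOp, IntVal n2) = IntVal (n1 - n2)
      | Operate (IntVal n1, LTOp, IntVal n2) = BoolVal (n1 < n2)
      | Operate (IntVal n1, GTOp, IntVal n2) = BoolVal (n1 > n2)
      | Operate (v1, EqualOp, v2) = BoolVal (v1 = v2)

    fun ev (IntConst n, venv) = IntVal n
      | ev (BoolConst b, venv) = BoolVal b
      | ev (Var x, venv) = lookup (x, venv)
      | ev (Op (e1, oper, e2), venv) =
          Operate (ev (e1, venv), oper, ev (e2, venv))
      | ev (If (e1, e2, e3), venv) =
          let val BoolVal v1 = ev (e1, venv)
          in if v1 then ev (e2, venv) else ev (e3, venv) end
      | ev (Let (x, e1, e2), venv) =
          ev (e2, bind (x, ev (e1, venv), venv))
  in
    ev (e, nil)
  end

(* let x = 4 in if x > 3 then x - 3 else x + 1 end *)
val r1 = eval (Let ("x", IntConst 4,
                 If (Op (Var "x", GTOp, IntConst 3),
                     Op (Var "x", MinusOp, IntConst 3),
                     Op (Var "x", PlusOp, IntConst 1))))      (* IntVal 1 *)

(* let x = 1 in let x = x + 1 in x end end: the inner x shadows the outer one *)
val r2 = eval (Let ("x", IntConst 1,
                 Let ("x", Op (Var "x", PlusOp, IntConst 1), Var "x")))  (* IntVal 2 *)

(* 4 + x: x is free, so lookup raises Undeclared *)
val r3 = (eval (Op (IntConst 4, PlusOp, Var "x")); "no error")
         handle Undeclared => "Undeclared raised"

(* if 1 then 2 else 3: the pattern BoolVal v1 fails to match, raising Bind *)
val r4 = (eval (If (IntConst 1, IntConst 2, IntConst 3)); "no error")
         handle Bind => "Bind raised"
```

The last two results show the cases we chose to ignore: undeclared variables surface as Undeclared, and mistyped expressions surface as ML's built-in Bind or Match exceptions rather than as friendly error messages.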

Now that we've learned how to build an expression evaluator in ML, you can go out and write your own typechecker to go with it!

Please email me, Mike Nidel (nidel@cse.psu.edu) or Dr. Palamidessi if you have any questions regarding today's material.