CSE 428 - Lecture notes

Interpretation of a small imperative language

We present an interpreter for a simple imperative language.

The language

The abstract syntax of the language is specified by the following grammar:

   Com ::= Ide := NExp                      assignment
         | Com ; Com                        concatenation
         | if BExp then Com else Com        conditional
         | while BExp do Com                iteration
         | begin Ide in Com end             block with variable declaration
         | begin Ide alias Ide in Com end   block with alias declaration
The intended meaning of a command like
   begin 
      x 
   in c
   end
is that a new variable x is introduced, and it is local to the block. The intended meaning of a command like
   begin 
      x alias y
   in c
   end
is that a new name x is introduced, and it is associated with the same location as the variable y. In other words, inside c we can access the same variable by two names: x and y.
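As a rough analogy in C++ itself (not part of the language being defined), an alias declaration behaves much like a C++ reference. The sketch below mirrors a small toy program; the function name alias_demo is hypothetical and serves only this illustration.

```cpp
#include <cassert>

// Rough C++ analogue of the toy program
//    begin y in  y := 1 ; begin x alias y in x := 5 end  end
// Returns the final value of y.
int alias_demo() {
    int y = 1;         // y is declared and then assigned 1
    int& x = y;        // x becomes a second name for the location of y
    assert(&x == &y);  // both names denote one and the same location
    x = 5;             // an assignment through the alias ...
    return y;          // ... is visible through the original name
}
```

Calling alias_demo() yields 5, because the assignment to x changes the only location involved.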

NExp represents numerical expressions:

   NExp ::=  Num | Ide | NExp NOp NExp 
   NOp  ::=  + | * | - | /
BExp represents boolean expressions:
   BExp ::= true | false | NExp COp NExp | not BExp | BExp BOp BExp
   COp  ::= < | =
   BOp  ::= and | or
Num generates the natural numbers, represented either as the single digit 0 or as a sequence of digits that starts with a digit different from 0:
   Num ::= 0 | Non_Zero_Digit Seq_Digit
   Non_Zero_Digit ::= 1 | 2 | 3 | ... | 9
   Digit ::= 0 | Non_Zero_Digit
   Seq_Digit ::= lambda | Digit Seq_Digit
Ide generates the identifiers, which can be chosen to be simply sequences of letters:
   Ide    ::=  Letter | Letter Ide
   Letter ::=  a | b | c | ... | z
Note that the grammar is ambiguous, but we will not worry about that because we assume that it represents the abstract syntax.
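
The lexical rules for Num and Ide given above can be checked mechanically. The following is a minimal sketch in real C++, not part of the interpreter itself; the function names is_num, is_ide and check_lexical_rules are hypothetical.

```cpp
#include <cassert>
#include <string>

// Num ::= 0 | Non_Zero_Digit Seq_Digit
bool is_num(const std::string& s) {
    if (s.empty()) return false;            // Num never derives the empty string
    if (s[0] == '0') return s.size() == 1;  // a leading 0 is allowed only for "0" itself
    for (char ch : s)
        if (ch < '0' || ch > '9') return false;
    return true;
}

// Ide ::= Letter | Letter Ide      Letter ::= a | b | c | ... | z
bool is_ide(const std::string& s) {
    if (s.empty()) return false;            // at least one letter is required
    for (char ch : s)
        if (ch < 'a' || ch > 'z') return false;
    return true;
}

// A few spot checks of the two recognizers.
void check_lexical_rules() {
    assert(is_num("0") && is_num("428"));
    assert(!is_num("007") && !is_num(""));
    assert(is_ide("x") && is_ide("count"));
    assert(!is_ide("x1") && !is_ide(""));
}
```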

Structures necessary for the interpreter

We will specify our interpreter in a C++like language.

Parse trees

We need trees with a different number of subtrees, up to three, depending on the command. For the assignment we need only one subtree; for the concatenation we need two subtrees, for the conditional we need three subtrees, etc. Hence we will declare a structure of the following kind:

   class tree{
      node* root;
      tree* first;
      tree* second;
      tree* third;
   public:
      tree* get_subtree(int n){
            switch (n) {
               case 1: return first;
               case 2: return second;
               case 3: return third;
               default: return nullptr; // no such subtree
            }
      }
      ...
   }
The class node will be analogous to the class of the same name seen in the notes about the evaluation of expressions.
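
To make the shape of these trees concrete, here is a minimal sketch in real C++. The fields of node, the type tag "ide" for identifier leaves, and the helper make_alias are assumptions made for this illustration; the actual node class is the one from the notes on expressions.

```cpp
#include <cassert>
#include <string>

// Hypothetical node: a type tag plus an optional identifier.
class node {
    std::string type;  // e.g. "assign", "conc", "cond", "while", "dec", "alias"
    std::string ide;   // identifier stored at this node, if any
public:
    node(std::string t, std::string i = "") : type(t), ide(i) {}
    std::string get_type() const { return type; }
    std::string get_ide() const { return ide; }
};

class tree {
    node* root;
    tree* first;
    tree* second;
    tree* third;
public:
    // Unused subtrees default to nullptr; memory ownership is ignored for brevity.
    tree(node* r, tree* a = nullptr, tree* b = nullptr, tree* c = nullptr)
        : root(r), first(a), second(b), third(c) {
        assert(r != nullptr);  // every tree has a root node
    }
    node* get_root() const { return root; }
    tree* get_subtree(int n) const {
        switch (n) {
            case 1: return first;
            case 2: return second;
            case 3: return third;
            default: return nullptr;  // no such subtree
        }
    }
};

// Builds the tree for "begin x alias y in c end": x is stored in the root,
// y in the first subtree, and the body c is taken to be the second subtree.
tree* make_alias(const std::string& x, const std::string& y, tree* body) {
    tree* y_leaf = new tree(new node("ide", y));
    return new tree(new node("alias", x), y_leaf, body);
}
```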

Environments

The environment will be a list of associations between names and locations:
   class environment{
      class assoc_env{
          string   ide;
          location loc;
      }
      assoc_env    assoc;
      environment* next;
      ...
   }
We can assume that locations are numbers representing memory addresses.
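
One possible realization of such an environment is sketched below in real C++. It is only an illustration under stated assumptions: locations are plain integers, the empty environment is represented by nullptr, and the operations are free functions (add, lookup, new_location) rather than methods, so that they also work on the empty environment. An environment is never modified; add returns a new list that shares the old one.

```cpp
#include <cassert>
#include <string>

typedef int location;  // assumption: a location is just a number

// Immutable linked list of (name, location) pairs; nullptr is the empty environment.
struct environment {
    std::string ide;
    location loc;
    const environment* next;
};

// Returns a new environment with x bound to l in front of r; r itself is unchanged.
const environment* add(const environment* r, const std::string& x, location l) {
    return new environment{x, l, r};  // never freed: acceptable in a short sketch
}

// Returns the location of the innermost declaration of x.
location lookup(const environment* r, const std::string& x) {
    assert(r != nullptr && "identifier not declared");
    return r->ide == x ? r->loc : lookup(r->next, x);
}

// Returns a location greater than every location in r, hence not used in r.
location new_location(const environment* r) {
    if (r == nullptr) return 0;
    location rest = new_location(r->next);
    return r->loc >= rest ? r->loc + 1 : rest;
}
```

Since lookup scans from the front, an inner declaration of a name hides an outer one, which is the usual scoping rule for nested blocks.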

State

The state will be a list of associations between locations and values:
   class state{
      class assoc_state{
          location loc;
          int      content;
      }
      assoc_state  assoc;
      state*       next;
      ...
   }
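
A state along these lines could be sketched as follows in real C++. The free functions add_loc, content_of and update are hypothetical counterparts of the methods used later in eval, nullptr stands for the empty state, and the initial content 0 of a new location is an assumption. As with environments, no state is ever modified: update builds a new list and leaves the old one intact.

```cpp
#include <cassert>

typedef int location;  // assumption: a location is just a number

// Immutable linked list of (location, content) pairs; nullptr is the empty state.
struct state {
    location loc;
    int content;
    const state* next;
};

// Returns a new state in which l is allocated (assumption: initial content 0).
const state* add_loc(const state* s, location l) {
    return new state{l, 0, s};  // never freed: acceptable in a short sketch
}

// Returns the content of location l in s.
int content_of(const state* s, location l) {
    assert(s != nullptr && "location not allocated");
    return s->loc == l ? s->content : content_of(s->next, l);
}

// Returns a new state equal to s except that l contains k.
// The cells in front of l are copied; the rest of the list is shared.
const state* update(const state* s, location l, int k) {
    assert(s != nullptr && "location not allocated");
    if (s->loc == l) return new state{l, k, s->next};
    return new state{s->loc, s->content, update(s->next, l, k)};
}
```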

The interpreter

We are now ready to outline the function representing the interpreter. We will call such a function "eval" (for "evaluation").

The main construct of the imperative language is the command, hence one argument of eval will be the parse tree representing the command to be evaluated. Additionally, eval will need an argument representing the environment in which the command is evaluated. Finally, the effect of a command is a change of state. We could realize this change as a side effect of eval but, since the purpose of this definition is to formalize the semantics of imperative languages, we prefer to express the modification of the state explicitly. Thus our eval function will have one more argument, representing the state before the evaluation, and its result will be the state resulting from the evaluation.

Since our main purpose is to define the meaning of the imperative constructs, we will use the imperative features of C++ as little as possible. Thus we will not use the while of C++ to treat the while of our language but, rather, recursion. Recursion is the more portable choice, because it is available in essentially every high-level language, including those that have no while construct. In this way, our interpreter can easily be rewritten in another language, for instance a functional or a logic language. For the same reason, as already explained, we will try to use the notion of state as little as possible.
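
The idea of replacing a loop by recursion with an explicit state can be seen in isolation in the following small sketch in real C++. The names loop_state, run_while and check_run_while are hypothetical; the function plays the role of the loop "while 0 < i do (acc := acc + i ; i := i - 1)".

```cpp
#include <cassert>

// Explicit state for the loop: the two variables it reads and writes.
struct loop_state {
    int i;
    int acc;
};

// while 0 < i do (acc := acc + i ; i := i - 1), written without a C++ loop.
loop_state run_while(loop_state s) {
    if (0 < s.i) {                                // evaluate the guard in the current state
        loop_state s1 = {s.i - 1, s.acc + s.i};   // execute the body once, giving a new state
        return run_while(s1);                     // then execute the whole loop again
    } else {
        return s;                                 // guard false: the state is returned unchanged
    }
}

// Spot check: 4 + 3 + 2 + 1 = 10.
void check_run_while() {
    loop_state s = run_while(loop_state{4, 0});
    assert(s.i == 0 && s.acc == 10);
}
```

No variable is ever assigned after its initialization here; each iteration receives a state and returns a new one, which is the style adopted for eval.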

In order to evaluate commands, we will also need to evaluate expressions, of both numerical and boolean type. We will assume two more eval functions for this purpose, which we will call eval_num and eval_bool respectively. The first one has been explained in previous classes. The second one is analogous.

In the following, we use various methods to access the information in the parse tree and in the environment. We use meaningful names in the hope that their purpose will be clear.

Note: as usual, the program is written in C++like, meaning that we use features that we find convenient, even if they are not allowed in real C++ programs (for instance, the type string in the switch statement).

  state* eval(tree* t, environment* r, state* s){
         node* n = t->get_root();
         string ty = n->get_type();
         switch (ty)  {
            case "assign": { // assignment x := e
               tree*  e = t->get_subtree(1);
               int k = eval_num(e,r,s);
               string x = n->get_ide();
               location l = r->lookup(x);
               return s->update(l,k); // change the content of location l to k
            }
            case "conc": { // concatenation c1 ; c2
               tree* c1 = t->get_subtree(1);
               tree* c2 = t->get_subtree(2);
               state* s1 = eval(c1,r,s); // execute c1 in the current environment r
               return eval(c2,r,s1);     // then execute c2 in the state produced by c1
            }
            case "cond" : { // if b then c1 else c2
               tree* b = t->get_subtree(1);
               bool bs = eval_bool(b,r,s);
               if (bs) {
                  tree* c1 = t->get_subtree(2);
                  return eval(c1,r,s);
               } else {
                  tree* c2 = t->get_subtree(3);
                  return eval(c2,r,s);
               }
            }
            case "while": { // while b do c    
               tree* b = t->get_subtree(1);
               bool bs = eval_bool(b,r,s);
               if (bs) {
                  tree* c = t->get_subtree(2);
                  state* s1 = eval(c,r,s); // execute c
                  return eval(t,r,s1);     // then execute again while b do c
               } else {
                  return s;
               }
            }
            case "dec":  { // begin x in c end
               string x = n->get_ide();
               location l = new_location(r); // allocate a new location (not used in r)
               environment* r1 = r->add(x,l); // associate the new location to x
               state* s1 = s->add_loc(l);    // add the new location to the state
               tree* c = t->get_subtree(1);
               return eval(c,r1,s1);  
            }  
            case "alias": { // begin x alias y in c end  
               string x = n->get_ide();
               string y = t->get_subtree(1)->get_root()->get_ide();
               location l = r->lookup(y);     // l = location of y
               environment* r1 = r->add(x,l); // add the association between x and l
               tree* c = t->get_subtree(2);   // body of the block (assumed to be the second subtree)
               return eval(c,r1,s);
            }
         } 
   }
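
To see the interpreter at work, the following is a compact, runnable sketch in real C++ of the same ideas. It is not the program above: to stay short it takes several liberties, all of which are assumptions of this sketch. Environments and states are std::map values passed by value instead of linked lists, a single struct tree replaces the tree and node classes, the helper mk and the function run_demo are hypothetical, and a newly declared variable is initialized to 0.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef int location;
typedef std::map<std::string, location> environment;  // name -> location
typedef std::map<location, int> state;                // location -> content

// One node shape for commands and expressions (a simplification of tree/node).
struct tree {
    std::string type;  // "assign", "conc", "cond", "while", "dec", "alias", "num", "ide", "+", "<", ...
    std::string ide;   // identifier carried by the node, if any
    int num;           // value of a numeric literal
    const tree* sub1;
    const tree* sub2;
    const tree* sub3;
};

// Hypothetical helper that builds a node; nodes are never freed in this short demo.
const tree* mk(std::string type, std::string ide = "", int num = 0, const tree* a = nullptr,
               const tree* b = nullptr, const tree* c = nullptr) {
    return new tree{type, ide, num, a, b, c};
}

int eval_num(const tree* t, const environment& r, const state& s) {
    if (t->type == "num") return t->num;
    if (t->type == "ide") return s.at(r.at(t->ide));
    int a = eval_num(t->sub1, r, s);
    int b = eval_num(t->sub2, r, s);
    if (t->type == "+") return a + b;
    if (t->type == "-") return a - b;
    if (t->type == "*") return a * b;
    return a / b;  // "/", with no check for division by zero
}

bool eval_bool(const tree* t, const environment& r, const state& s) {
    if (t->type == "true") return true;
    if (t->type == "false") return false;
    if (t->type == "not") return !eval_bool(t->sub1, r, s);
    if (t->type == "and") return eval_bool(t->sub1, r, s) && eval_bool(t->sub2, r, s);
    if (t->type == "or") return eval_bool(t->sub1, r, s) || eval_bool(t->sub2, r, s);
    int a = eval_num(t->sub1, r, s);
    int b = eval_num(t->sub2, r, s);
    return t->type == "<" ? a < b : a == b;  // "<" or "="
}

// A location larger than every location in r, hence not used in r.
location new_location(const environment& r) {
    location l = 0;
    for (const auto& p : r)
        if (p.second >= l) l = p.second + 1;
    return l;
}

// r and s are passed by value, so the caller's environment and state are never modified.
state eval(const tree* t, environment r, state s) {
    if (t->type == "assign") {
        int k = eval_num(t->sub1, r, s);
        s[r.at(t->ide)] = k;
        return s;
    }
    if (t->type == "conc") return eval(t->sub2, r, eval(t->sub1, r, s));
    if (t->type == "cond") return eval(eval_bool(t->sub1, r, s) ? t->sub2 : t->sub3, r, s);
    if (t->type == "while") {
        if (!eval_bool(t->sub1, r, s)) return s;
        return eval(t, r, eval(t->sub2, r, s));  // run the body, then the whole loop again
    }
    if (t->type == "dec") {
        location l = new_location(r);
        r[t->ide] = l;
        s[l] = 0;  // assumption: a fresh variable starts at 0
        return eval(t->sub1, r, s);
    }
    assert(t->type == "alias");
    r[t->ide] = r.at(t->sub1->ide);  // x shares the location of y
    return eval(t->sub2, r, s);
}

// begin n in begin acc in begin a alias acc in
//    n := 3 ; while 0 < n do (a := a + n ; n := n - 1)
// end end end
state run_demo() {
    const tree* n = mk("ide", "n");
    const tree* a = mk("ide", "a");
    const tree* body = mk("conc", "", 0,
                          mk("assign", "a", 0, mk("+", "", 0, a, n)),
                          mk("assign", "n", 0, mk("-", "", 0, n, mk("num", "", 1))));
    const tree* loop = mk("while", "", 0, mk("<", "", 0, mk("num", "", 0), n), body);
    const tree* init = mk("assign", "n", 0, mk("num", "", 3));
    const tree* block = mk("alias", "a", 0, mk("ide", "acc"), mk("conc", "", 0, init, loop));
    return eval(mk("dec", "n", 0, mk("dec", "acc", 0, block)), environment(), state());
}
```

Under these assumptions, n is given location 0 and acc location 1, and a is a second name for location 1. The state returned by run_demo() maps location 0 to 0 and location 1 to 6 (that is, 3 + 2 + 1), which shows that the assignments made through the alias a are visible in acc.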