The abstract syntax of the language is specified by the following grammar:

    Com ::= Ide := NExp                        assignment
          | Com ; Com                          concatenation
          | if BExp then Com else Com          conditional
          | while BExp do Com                  iteration
          | begin Ide in Com end               block with variable declaration
          | begin Ide alias Ide in Com end     block with alias declaration

The intended meaning of a command like

    begin x in c end

is that a new variable x is introduced, and it is local to the block. The intended meaning of a command like

    begin x alias y in c end

is that a new name x is introduced, and it is associated with the same location as the variable y. In other words, in c we can access the same variable by two names: x and y.
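The difference between the two kinds of block can be illustrated with a small sketch in which an environment maps names to locations and a state maps locations to contents. The use of std::map, the helper names (declare, alias, assign, value) and the initial content 0 of a new variable are assumptions made for this illustration only; they are not the representation adopted in these notes.

```cpp
#include <cassert>
#include <map>
#include <string>

using Location = int;                            // assumption: locations are integers
using Env   = std::map<std::string, Location>;   // name -> location
using State = std::map<Location, int>;           // location -> content

// begin x in ... end: x is bound to a fresh location, distinct from all others.
Env declare(const Env& r, State& s, const std::string& x) {
    Location fresh = s.empty() ? 0 : s.rbegin()->first + 1;
    s[fresh] = 0;                                // assumption: new variables start at 0
    Env r1 = r;
    r1[x] = fresh;
    return r1;
}

// begin x alias y in ... end: x is bound to the location already bound to y.
Env alias(const Env& r, const std::string& x, const std::string& y) {
    assert(r.count(y) == 1);                     // y must already be declared
    Env r1 = r;
    r1[x] = r.at(y);
    return r1;
}

// x := k, performed in environment r.
void assign(const Env& r, State& s, const std::string& x, int k) {
    s[r.at(x)] = k;
}

// Content of the variable named x in environment r and state s.
int value(const Env& r, const State& s, const std::string& x) {
    return s.at(r.at(x));
}
```

With an alias, an assignment to x is visible through y, because both names denote one location. With a declaration, x denotes a new location and y is unaffected.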
NExp represents numerical expressions:

    NExp ::= Num | Ide | NExp NOp NExp
    NOp  ::= + | * | - | /

BExp represents boolean expressions:

    BExp ::= true | false | NExp COp NExp | not BExp | BExp BOp BExp
    COp  ::= < | =
    BOp  ::= and | or

Num generates the natural numbers, which can be represented as sequences of digits that do not start with 0 (except for the number 0 itself):

    Num            ::= 0 | Non_Zero_Digit Seq_Digit
    Non_Zero_Digit ::= 1 | 2 | 3 | ... | 9
    Digit          ::= 0 | Non_Zero_Digit
    Seq_Digit      ::= lambda | Digit Seq_Digit

Here lambda denotes the empty sequence. Ide generates the identifiers, which can be chosen to simply be sequences of letters:

    Ide    ::= Letter | Letter Ide
    Letter ::= a | b | c | ... | z

Note that the grammar is ambiguous, but we will not worry about that because we assume that it represents the abstract syntax.
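As an illustration of the rules for Num and Ide, the following sketch checks whether a string can be generated by each of them. The function names is_num and is_ide are hypothetical; the notes do not define a lexical analyzer.

```cpp
#include <string>

// Num ::= 0 | Non_Zero_Digit Seq_Digit
// Accepts "0", or a non-empty digit sequence whose first digit is not 0.
bool is_num(const std::string& w) {
    if (w.empty()) return false;
    if (w == "0") return true;
    if (w[0] < '1' || w[0] > '9') return false;   // must start with 1..9
    for (char ch : w)
        if (ch < '0' || ch > '9') return false;   // the rest are arbitrary digits
    return true;
}

// Ide ::= Letter | Letter Ide, with Letter ::= a | b | ... | z
// Accepts any non-empty sequence of lowercase letters.
bool is_ide(const std::string& w) {
    if (w.empty()) return false;
    for (char ch : w)
        if (ch < 'a' || ch > 'z') return false;
    return true;
}
```

For instance, 907 is generated by Num, while 007 is not, because Seq_Digit can only follow a non-zero digit.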
We need trees with a varying number of subtrees, up to three, depending on the command: one subtree for the assignment, two for the concatenation, three for the conditional, and so on. Hence we will declare a structure of the following kind:

    class tree{
       node* root;
       tree* first;
       tree* second;
       tree* third;
    public:
       tree* get_subtree(int n){
          switch (n) {
             case 1: return first;
             case 2: return second;
             case 3: return third;
             default: return nullptr;   // no such subtree
          }
       }
       ...
    };

The class node will be analogous to the class of the same name seen in the notes about the evaluation of expressions.
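To make the varying number of subtrees concrete, here is a minimal sketch that builds parse trees for an assignment, a concatenation and a conditional. It is a simplification under stated assumptions: the root information is stored directly in the tree instead of in a separate node class, std::unique_ptr replaces raw pointers, and the names Tree, leaf, assign, conc, cond and arity are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Tree {
    std::string type;    // "assign", "conc", "cond", "num", "ide", ...
    std::string text;    // identifier or literal, when relevant
    std::unique_ptr<Tree> first, second, third;

    // Number of subtrees actually present (0 to 3).
    int arity() const {
        return (first ? 1 : 0) + (second ? 1 : 0) + (third ? 1 : 0);
    }
};

using TreePtr = std::unique_ptr<Tree>;

// A tree with no subtrees.
TreePtr leaf(std::string type, std::string text) {
    auto t = std::make_unique<Tree>();
    t->type = std::move(type);
    t->text = std::move(text);
    return t;
}

// x := e   (one subtree: the expression; x is kept in the root)
TreePtr assign(std::string x, TreePtr e) {
    auto t = leaf("assign", std::move(x));
    t->first = std::move(e);
    return t;
}

// c1 ; c2   (two subtrees)
TreePtr conc(TreePtr c1, TreePtr c2) {
    auto t = leaf("conc", "");
    t->first = std::move(c1);
    t->second = std::move(c2);
    return t;
}

// if b then c1 else c2   (three subtrees)
TreePtr cond(TreePtr b, TreePtr c1, TreePtr c2) {
    auto t = leaf("cond", "");
    t->first = std::move(b);
    t->second = std::move(c1);
    t->third = std::move(c2);
    return t;
}
```

For example, conc(assign("x", leaf("num", "1")), assign("y", leaf("ide", "x"))) would represent the command x := 1 ; y := x.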

    class environment{
       class assoc_env{
          string ide;
          location loc;
       };
       assoc_env assoc;
       environment* next;
       ...
    };

We can assume that locations are numbers representing memory addresses.

    class state{
       class assoc_state{
          location loc;
          int content;
       };
       assoc_state assoc;
       state* next;
       ...
    };
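Both structures are lists of associations, so the operations on them can be written recursively and without modifying existing cells. The sketch below is one possible reading, not the definition used in these notes: it flattens the nested association classes, uses free functions instead of methods, takes location to be int, returns -1 for an undeclared identifier, reads 0 from a location that was never written, and never frees memory. All of these choices are assumptions made for brevity.

```cpp
#include <string>

using location = int;   // assumption: locations are plain integers

struct environment {
    std::string ide;
    location loc;
    const environment* next;
};

// Put a new association in front of r. The old environment is untouched,
// so an inner declaration of x hides an outer one.
const environment* add(const environment* r, const std::string& x, location l) {
    return new environment{x, l, r};
}

// Location of the most recent association for x, or -1 if there is none
// (a convention of this sketch).
location lookup(const environment* r, const std::string& x) {
    if (r == nullptr) return -1;
    if (r->ide == x) return r->loc;
    return lookup(r->next, x);
}

struct state {
    location loc;
    int content;
    const state* next;
};

// State in which location l contains k. Nothing is overwritten:
// the new cell hides any older cell for l.
const state* update(const state* s, location l, int k) {
    return new state{l, k, s};
}

// Content of location l in s (0 if l was never written, by assumption).
int content_of(const state* s, location l) {
    if (s == nullptr) return 0;
    if (s->loc == l) return s->content;
    return content_of(s->next, l);
}
```

Because update builds a new state and leaves the old one intact, the state before and the state after a command are both available as values, which matches the idea of treating state changes explicitly.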
The main construct of the imperative language is the command, hence one argument of eval will be the parse tree representing the command to be evaluated. Additionally, eval will need an argument representing the environment in which the command is evaluated. Finally, the effect of a command is a change of state. We could obtain this as a side effect of eval but, since the purpose of this definition is to formalize the semantics of imperative languages, we prefer to have the modification of the state expressed explicitly. Thus our eval function will have one more argument representing the state before the evaluation, and its result will be the state resulting from the evaluation.
Since our main purpose is to define the meaning of the imperative constructs, we will use the imperative features of C++ as little as possible. Thus we will not use the while of C++ to treat the while of our language, but rather recursion. Recursion is the more widely available mechanism: it is supported by essentially all high-level languages, including functional and logic languages that have no while construct. In this way, our interpreter can easily be rewritten in another language, for instance a functional or a logic language. For the same reason, as already explained, we will try to use the notion of state as little as possible.
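The idea of treating iteration by recursion on an explicit state can be sketched in isolation. In the fragment below the guard and the body are passed as functions on states. The name run_while and the use of std::map and std::function are assumptions of this sketch and do not appear in the interpreter of these notes.

```cpp
#include <functional>
#include <map>

using State = std::map<int, int>;   // location -> content

// Meaning of "while b do c" started in state s, written without any C++ loop:
// if b is false in s, the result is s itself;
// otherwise run c once, then evaluate the same while in the resulting state.
State run_while(const std::function<bool(const State&)>& b,
                const std::function<State(const State&)>& c,
                const State& s) {
    if (!b(s)) return s;
    return run_while(b, c, c(s));
}
```

Note that no state is modified in place: each call receives a state and returns a state.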
In order to evaluate commands, we will also need to evaluate expressions, of both numerical and boolean type. We will assume two more eval functions for this purpose, which we will call eval_num and eval_bool respectively. The first one has been explained in previous classes. The second one is analogous.
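Since eval_bool is only said to be analogous to eval_num, the following sketch shows one way the two functions could look. It is not the definition given in previous classes. The expression type Exp, the constructor helper mk, the tags "num", "ide", "nop", "cop", "bop", and the std::map representation of environment and state are all assumptions of this sketch. Division by zero is not handled.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

using Env   = std::map<std::string, int>;   // identifier -> location
using State = std::map<int, int>;           // location -> content

struct Exp {
    std::string type;   // "num", "ide", "nop", "true", "false", "cop", "not", "bop"
    std::string text;   // digits, identifier, or operator symbol
    std::shared_ptr<Exp> first, second;
};
using ExpPtr = std::shared_ptr<Exp>;

ExpPtr mk(std::string type, std::string text, ExpPtr a = nullptr, ExpPtr b = nullptr) {
    return std::make_shared<Exp>(
        Exp{std::move(type), std::move(text), std::move(a), std::move(b)});
}

int eval_num(const ExpPtr& e, const Env& r, const State& s) {
    if (e->type == "num") return std::stoi(e->text);
    if (e->type == "ide") return s.at(r.at(e->text));   // content of the location of the identifier
    int a = eval_num(e->first, r, s);
    int b = eval_num(e->second, r, s);
    if (e->text == "+") return a + b;
    if (e->text == "*") return a * b;
    if (e->text == "-") return a - b;
    return a / b;                                       // "/": integer division
}

bool eval_bool(const ExpPtr& e, const Env& r, const State& s) {
    if (e->type == "true") return true;
    if (e->type == "false") return false;
    if (e->type == "not") return !eval_bool(e->first, r, s);
    if (e->type == "cop") {                             // NExp COp NExp
        int a = eval_num(e->first, r, s);
        int b = eval_num(e->second, r, s);
        return (e->text == "<") ? (a < b) : (a == b);
    }
    bool a = eval_bool(e->first, r, s);                 // BExp BOp BExp: both sides evaluated
    bool b = eval_bool(e->second, r, s);
    return (e->text == "and") ? (a && b) : (a || b);
}
```

Like eval_num, eval_bool takes the environment and the state as arguments but returns only a value, since expressions in this language cannot change the state.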
In the following, we use various methods to access the information in the parse tree and in the environment. We use meaningful names in the hope that their purpose will be clear.
Note: as usual, the program is written in a C++-like notation, meaning that we use features that we find convenient even if they are not allowed in real C++ programs (for instance, a switch statement on values of type string).

    state* eval(tree* t, environment* r, state* s){
       node* n = t->get_root();
       string ty = n->get_type();
       switch (ty) {
          case "assign": { // assignment x := e
             tree* e = t->get_subtree(1);
             int k = eval_num(e,r,s);
             string x = n->get_ide();
             location l = r->lookup(x);
             return s->update(l,k); // state in which location l contains k
          }
          case "conc": { // concatenation c1 ; c2
             tree* c1 = t->get_subtree(1);
             tree* c2 = t->get_subtree(2);
             state* s1 = eval(c1,r,s);
             return eval(c2,r,s1);
          }
          case "cond": { // if b then c1 else c2
             tree* b = t->get_subtree(1);
             bool bs = eval_bool(b,r,s);
             if (bs) {
                tree* c1 = t->get_subtree(2);
                return eval(c1,r,s);
             } else {
                tree* c2 = t->get_subtree(3);
                return eval(c2,r,s);
             }
          }
          case "while": { // while b do c
             tree* b = t->get_subtree(1);
             bool bs = eval_bool(b,r,s);
             if (bs) {
                tree* c = t->get_subtree(2);
                state* s1 = eval(c,r,s); // execute c
                return eval(t,r,s1);     // then execute again while b do c
             } else {
                return s;
             }
          }
          case "dec": { // begin x in c end
             string x = n->get_ide();
             location l = new_location(r);  // allocate a new location (not used in r)
             environment* r1 = r->add(x,l); // associate the new location to x
             state* s1 = s->add_loc(l);     // add the new location to the state
             tree* c = t->get_subtree(1);
             return eval(c,r1,s1);
          }
          case "alias": { // begin x alias y in c end
             string x = n->get_ide();
             string y = t->get_subtree(1)->get_root()->get_ide();
             location l = r->lookup(y);     // l = location of y
             environment* r1 = r->add(x,l); // add the association between x and l
             tree* c = t->get_subtree(2);   // body of the block (assumed to be the second subtree, since y is the first)
             return eval(c,r1,s);
          }
       }
    }
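Since the program above is only C++-like, it may help to see a compact variant that compiles as standard C++. The sketch below follows the same case analysis but makes several simplifying assumptions that are not part of these notes: environment and state are std::map values, expressions are represented directly as functions instead of parse trees, a chain of if statements replaces the switch on strings, a new location is chosen as one more than the largest location in the state, and a newly declared variable initially contains 0.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

using Env   = std::map<std::string, int>;   // identifier -> location
using State = std::map<int, int>;           // location -> content
using NExp  = std::function<int(const Env&, const State&)>;
using BExp  = std::function<bool(const Env&, const State&)>;

struct Com;
using ComPtr = std::shared_ptr<Com>;
struct Com {
    std::string type;   // "assign", "conc", "cond", "while", "dec", "alias"
    std::string x, y;   // identifiers, when relevant
    NExp e;             // right-hand side of an assignment
    BExp b;             // guard of a conditional or of a while
    ComPtr c1, c2;      // sub-commands
};

State eval(const ComPtr& t, const Env& r, const State& s) {
    if (t->type == "assign") {                 // x := e
        State s1 = s;
        s1[r.at(t->x)] = t->e(r, s);
        return s1;
    }
    if (t->type == "conc")                     // c1 ; c2
        return eval(t->c2, r, eval(t->c1, r, s));
    if (t->type == "cond")                     // if b then c1 else c2
        return eval(t->b(r, s) ? t->c1 : t->c2, r, s);
    if (t->type == "while") {                  // while b do c1
        if (!t->b(r, s)) return s;
        return eval(t, r, eval(t->c1, r, s));  // run c1, then the same while again
    }
    if (t->type == "dec") {                    // begin x in c1 end
        int l = s.empty() ? 0 : s.rbegin()->first + 1;   // assumed fresh location
        Env r1 = r;
        r1[t->x] = l;
        State s1 = s;
        s1[l] = 0;                             // assumption: initial content 0
        return eval(t->c1, r1, s1);
    }
    // "alias": begin x alias y in c1 end
    Env r1 = r;
    r1[t->x] = r.at(t->y);                     // x shares the location of y
    return eval(t->c1, r1, s);
}
```

As in the C++-like version, the environment is only extended on the way into a block, and every case returns a state without modifying the state it received.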