CSE 428: Lecture Notes 5


Imperative Programming

We will study the basic concepts of imperative programming by introducing a mini-language of commands.

Syntax

This language is generated by the following grammar, on top of the mini-language of expressions:

Commands

Cmd ::=   Ide := Exp                                           (assignment)
            |  read(Ide)                                             (input)
            |  print(Exp)                                           (output)
            |  if  BExp  then  Cmd  else Cmd             (conditional)
            |  begin  Seqc  end                                  (block without declarations)
            |  begin  Seqd  in   Seqc   end                  (block with declarations)
            |  while  BExp  do  Cmd                          (indefinite iteration)
            |  for Ide :=  NExp  to  NExp  do Cmd      (definite iteration)

Sequence of commands

Seqc  ::=  Cmd  |  Seqc  ;  Cmd

Sequence of declarations of variables

Seqd  ::=  Decvar  |  Seqd  ; Decvar
Decvar  ::=   Ide  : Type
Type  ::=  bool  |  int

Semantics

As usual, we will use an abstract interpreter to describe the meaning of the constructs in this language. In order to give a more direct intuition, this time we will use a less formal description, i.e. we will give the inductive definition of the interpreter in English rather than in formulas.

It should be remarked that the description below corresponds to the structure of an interpreter-based implementation. This does not mean that every implementation has to be interpretative: it could equally be compilative (or mixed), and the model would then be quite different. The important thing is that, whatever the implementation is, it respects the semantics. In other words, from the point of view of an external observer (user), it should not matter whether the language is implemented via an interpreter or via a compiler: in both cases, the result (output) obtained by running a program on a given input should be the same.

The reason why we choose to describe an interpreter rather than a compiler is simplicity: an interpreter can be described in a much more abstract and simple way than a compiler. The interpreter-based description does not need to worry about low-level, machine-dependent details such as memory management, so we can concentrate on the features of the language (independently of the machine).

In the interpretative model, one of the basic features of imperative programming is the notion of variable. Intuitively, a variable is a memory location with which we can associate a name (identifier) and in which we can store a value. The association between names and locations is established by declarations; the association between locations and values is established by commands.

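The interpreter maintains two mappings at run time:

  1. The environment, which maps identifiers to locations.
  2. The (internal) state, which maps locations to values.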

The precise meaning of a common sentence like "the  variable  x  has  value  v" is:
The identifier x is associated to a location which contains the value v.
Besides the internal state, we also consider an external state, namely the state of the input and output devices. The internal and external state together will be called the state.

The execution of a command depends on the current environment and state, and it has the effect of changing the state (never the environment!). More precisely, the evaluation of a command is a function that, given an initial environment env and state s, produces a final state s'.

Note that in an implementation based on compilation there is no environment, or, to be more precise, the environment is "encoded" in the object code: every occurrence of a variable is replaced with the relative address of a memory location.

It should also be remarked that, in an imperative language without blocks and procedures, there would be no need to distinguish between environment and state: the interpreter could maintain a mapping directly from variables to values. If we want to describe blocks and procedures, however, the simplest way to do it is to keep these two separate notions of environment and state.
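
The two mappings can be pictured with a minimal Python sketch. The representation is an assumption made only for illustration: locations are integers, both mappings are dictionaries, and the names lookup and assign are hypothetical helpers, not part of the mini-language.

```python
# Assumed representation: locations are integers, both mappings are dicts.
env = {"x": 0}       # environment: identifier -> location
store = {0: 42}      # internal state: location -> value


def lookup(env, store, name):
    """Value of a variable: follow the name to its location, then read it."""
    return store[env[name]]


def assign(env, store, name, value):
    """Return a new internal state; the environment is only read."""
    new_store = dict(store)
    new_store[env[name]] = value
    return new_store
```

With these definitions, "the variable x has value 42" means lookup(env, store, "x") == 42. Calling assign produces a new state and leaves env untouched, which mirrors the remark that commands change the state but never the environment.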

We now define the semantics of each construct in our mini-language of commands. In the following, when describing the semantics of a command c, we will call "initial" the state and the environment immediately before the execution of c, and "final" the state immediately after the execution of c.

Assignment

Syntax:  x := e
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by evaluating e and storing the result in the location associated to x in env.

Input

Syntax: read(x)
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by taking the next data value available in the input stream and storing it in the location associated to x in env. The value is "consumed" from the input stream, in the sense that the next time we execute a read action, the data considered will be the next one in the stream. Note that this command modifies both the internal and the external state.

Output

Syntax: print(e)
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by evaluating e and writing the result in the output stream.
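
The three basic commands can be sketched in Python as functions from an initial state to a final state. Everything here is an illustrative assumption: the State class, its field names (store, inp, out), the function names, and the choice of representing an expression e as a Python function of (env, s).

```python
from dataclasses import dataclass, field


@dataclass
class State:
    store: dict                                # internal state: location -> value
    inp: list = field(default_factory=list)    # external state: pending input values
    out: list = field(default_factory=list)    # external state: values printed so far


def exec_assign(env, s, x, e):
    """x := e  changes only the internal state."""
    v = e(env, s)
    return State({**s.store, env[x]: v}, s.inp, s.out)


def exec_read(env, s, x):
    """read(x)  consumes one input value and stores it (raises IndexError on empty input)."""
    v, rest = s.inp[0], s.inp[1:]
    return State({**s.store, env[x]: v}, rest, s.out)


def exec_print(env, s, e):
    """print(e)  changes only the external state."""
    return State(s.store, s.inp, s.out + [e(env, s)])
```

Each function returns a fresh State and never modifies env, so the initial state remains available for comparison with the final one.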

Conditional

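Syntax:  if  e  then  c1  else  c2
Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

  1. Evaluate e in s, env.
  2. If the result is true, the final state is the final state of the execution of c1 in s, env.
  3. If the result is false, the final state is the final state of the execution of c2 in s, env.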

Block without declarations

Syntax:  begin  c1; c2 ; ... ; cn end
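Semantics: Let s0, env be the initial state and environment. The final state is sn, obtained as follows:

  1. s1 is the final state of the execution of c1 in s0, env.
  2. s2 is the final state of the execution of c2 in s1, env.
  ...
  n. sn is the final state of the execution of cn in s(n-1), env.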

Note that the environment never changes between commands at the same level in a  sequence.

Example: Consider the block  begin  x := x + 1 ; y := x +1  end
If the initial values of x and y are both 0, the final values are 1 and 2 respectively.

Indefinite iteration

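Syntax:  while  e  do  c
Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

  1. Evaluate e in s, env.
  2. If the result is false, the final state is s.
  3. If the result is true, execute c in s, env, obtaining a state s'. The final state is then the final state of the execution of  while e do c  in s', env.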

Note that in general the command c (body) should contain some instruction which modifies the value of some variable occurring in e, otherwise   while e do c   will loop forever.
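
The semantics of the conditional, of the block without declarations and of the while command can be put together in a small Python sketch. The encoding is an assumption chosen for brevity: commands are tuples tagged with a string, expressions are Python functions of (env, store), and input/output is left out.

```python
def execute(cmd, env, store):
    """Return the final internal state; env is only read, never changed."""
    tag = cmd[0]
    if tag == "assign":                      # ("assign", x, e)
        _, x, e = cmd
        return {**store, env[x]: e(env, store)}
    if tag == "if":                          # ("if", e, c1, c2)
        _, e, c1, c2 = cmd
        return execute(c1 if e(env, store) else c2, env, store)
    if tag == "seq":                         # ("seq", [c1, ..., cn])
        for c in cmd[1]:
            store = execute(c, env, store)   # each command starts from the previous state
        return store
    if tag == "while":                       # ("while", e, c)
        _, e, c = cmd
        while e(env, store):                 # e is re-evaluated before every iteration
            store = execute(c, env, store)
        return store
    raise ValueError(f"unknown command: {tag}")


# The block  begin x := x + 1 ; y := x + 1 end  with x and y initially 0.
env = {"x": 0, "y": 1}
x_plus_1 = lambda env, store: store[env["x"]] + 1
block = ("seq", [("assign", "x", x_plus_1), ("assign", "y", x_plus_1)])
final = execute(block, env, {0: 0, 1: 0})    # final == {0: 1, 1: 2}
```

The Python while loop plays the role of the recursive step in the definition: after each execution of the body, the whole command is considered again in the new state.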

Example: The following command prints the squares of all numbers from  1 to 100.

begin 
x := 0 ;
while  x < 100  do 
       begin
       x := x +1;
       print(x*x)
       end
end

Example: The following command repeatedly reads a number from the input and prints its square, until a 0 is encountered.

begin 
read(x);
while  not  x=0  do 
       begin
       print(x*x);
       read(x)
       end
end
 

Definite iteration

This command is present in several imperative languages like Pascal, Algol, Modula etc., but not in C.

Syntax:  for   i := e1  to  e2  do  c
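Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

  1. Evaluate e1 and e2 in s, env, obtaining the values v1 and v2.
  2. If v1 > v2, the body is never executed and the final state is s.
  3. Otherwise, execute c once for each value v = v1, v1+1, ..., v2, in this order. Before each execution, v is stored in the location associated to i in env, and each execution starts from the state produced by the previous one. The final state is the one produced by the last execution.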

Example: The following command prints the squares of all numbers from  1 to 100.

               for   i := 1  to  100  do  print(i*i)

Note that e1 and e2 are evaluated once and for all before starting the iteration; thus, even if the body c changes the variables occurring in e2, this will not influence the number of times that c is executed. Furthermore, in Pascal the value of i cannot be modified inside c (it is a static error). Therefore, the number of times that c will be executed is known (at run time) before starting the iteration. Hence the name "definite iteration".

Example: Consider the following pieces of code:

  1. n := 100 ; for  i := 1  to  n  do  begin print(i*i) ; n := n+1 end
  2. n := 100 ; for  i := 1  to  n  do  begin print(i*i) ; n := n-1 end
  3. n := 100 ; for  i := 1  to  n  do  begin print(i*i) ; i := i -1 end
The first command prints  the squares of all numbers from  1 to 100. The final value of n is 200. The second command prints the squares of all numbers from  1 to 100, and the final value of n  is 0. The third command is considered an error in Pascal. Usually it is detected at compile time.
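
The behaviour of the first two commands can be reproduced with a short Python sketch. The function run and its parameter update are hypothetical names; the list printed stands in for the output stream. The upper bound is copied into a separate variable before the loop to make explicit that it is evaluated only once.

```python
def run(update):
    """Simulate  n := 100 ; for i := 1 to n do begin print(i*i) ; n := update(n) end."""
    n = 100
    printed = []
    upper = n                          # the upper bound is evaluated once, before the loop
    for i in range(1, upper + 1):
        printed.append(i * i)          # stands in for print(i*i)
        n = update(n)                  # changing n does not affect the iteration count
    return printed, n


squares_up, n_up = run(lambda n: n + 1)      # first command
squares_down, n_down = run(lambda n: n - 1)  # second command
```

In both runs the body is executed exactly 100 times; only the final value of n differs (200 in the first case, 0 in the second).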

Definite iteration vs indefinite iteration

Declarations

Single declaration

Syntax:  x : t
Semantics: Let s, env be the initial state and environment. The final environment env' is obtained by associating to x a new location L (allocation of  L), suitable to contain data of type t. More formally, env' = env[L/x].

Sequence of declarations

Syntax:  d1; d2 ; ... ; dn
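Semantics:  Let s, env0 be the initial state and environment. The final environment is envn, obtained as follows:

  1. env1 is the final environment of the declaration d1 in env0.
  2. env2 is the final environment of the declaration d2 in env1.
  ...
  n. envn is the final environment of the declaration dn in env(n-1).

Block with declarations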

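Syntax:  begin  d  in  c1; c2 ; ... ; cn end
Semantics: Let s0, env be the initial state and environment. The final state is sn, obtained as follows:

  1. Evaluate the declarations d in env, obtaining a new environment env'.
  2. s1 is the final state of the execution of c1 in s0, env'.
  ...
  n+1. sn is the final state of the execution of cn in s(n-1), env'.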

Note that at the end of the block the environment is env again. The interpreter treats the environment according to a LIFO discipline: each time a block is entered, its declaration part is evaluated and new associations are added to the environment. (A new association for x hides (shadows) any association for x already present in the environment.) At the end of the block, the corresponding associations are eliminated.
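
A minimal Python sketch of this treatment of the environment follows. The function run_block, the dictionary called store (the internal state) and the allocation strategy (the next unused integer) are assumptions made for illustration; deallocation at block exit is not modelled.

```python
def run_block(env, store, decls, body):
    """Execute body in env extended with decls; the caller's env is not modified."""
    local_env = dict(env)                  # copy, so the outer env survives the block
    for name in decls:
        loc = len(store)                   # assumed allocator: next unused integer
        store = {**store, loc: None}       # a new, not yet initialised location
        local_env[name] = loc              # shadows any outer association for name
    return body(local_env, store)


# Outer x lives at location 0; the block redeclares x and assigns 2 to it.
env = {"x": 0}
final = run_block(env, {0: 1}, ["x"], lambda e, s: {**s, e["x"]: 2})
```

After the call, env is unchanged, so the outer x still refers to location 0, whose content is still 1. The LIFO discipline comes for free here: the extended environment exists only during the call to run_block.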

Local and non-local variables  The variables which are declared in  d  are called "local" to the block. All the others are non-local.

Scope  The way the environment is built is related to the notion of scope: The scope of a declaration is the part of the program in which this declaration is effective, and it is defined to be the body of the block in which the declaration occurs, with the exclusion of those sub-blocks in which the same identifier is re-declared. We can rephrase this by saying that the declaration which counts, for an occurrence of an identifier in a command, is the one in the innermost block that contains that occurrence.

Note that the notion of scope and the way the environment is treated is exactly the same as in the mini-language of expressions. And it is the same, as we will see, in functional programming.

The possibility of making names local to a block, together with the structured notion of scope, is one of the fundamental principles of structured programming. It is very convenient, especially for the modular development of large programs: two people developing different blocks (within the same program) can use their favorite names and, as long as they declare them local, they don't have to check that their names do not interfere.

Example: Consider the following command:

begin
      x : int
      in
      x := 1;
      begin
            x : int ; y : int
            in
            x := 2;
            begin
                  z : int
                  in
                  z := 3;
                  y :=  z
             end;
             print(x + y)
       end;
       print(x)
end

In this example, there are three nested blocks. Let us call them A (the outermost), B and C (the innermost). In A there is one local variable x. In B there are two local variables: x (distinct from the previous one) and y. In C there is a local variable z and two non-local variables x and y.

Let env be the initial environment, and s be the initial (internal) state.

  1. Inside block A, after the declaration and the execution of   x := 1  the environment is   env1 =  env[L/x] (where L is a new location), and the state is   s1  =  s[1/L].  (From now on, we will not use the tilde character anymore to distinguish values from their representation; it should be clear from the context).
  2. Inside block B, after the declarations and the execution of  x := 2  the environment is   env2 =  env1[L1/x][L2/y] (where L1, L2 are two new locations, and in particular they are different from L and different from each other), and the state is   s2  =  s1[2/L1].
  3. Inside block C, after the declaration and the execution of  z := 3 ; y := z  the environment is   env3 =  env2[L3/z] (where L3 is a new location, in particular it is different from L, L1 and L2), and the state is   s3  =  s2[3/L3][3/L2].
  4. Inside block B, just before the execution of  print(x + y), the environment is env2 and the internal state is still s3. However, as a consequence of closing block C, the location L3 is now deallocated: it is free again and no longer accessible from the program. So the part of the state which is actually "visible" is  s2[3/L2]. The effect of the command  print(x + y) is to modify the external state by printing 5.
  5. Inside block A, just before the execution of  print(x), the environment is env1 and the internal state remains the same, except that now the locations L1 and L2 are deallocated as well, i.e. they are free again and no longer accessible from the program. The part of the state which is "visible" thus coincides with  s[1/L]. The effect of the  command  print(x)  is to modify the external state by printing 1.
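
The five steps above can be replayed in Python. This is only a sketch under stated assumptions: locations are integers handed out by a counter (the hypothetical allocator fresh), the internal state is a dictionary updated in place, the list out stands in for the output stream, and deallocation is not modelled.

```python
import itertools

fresh = itertools.count()      # assumed allocator: each call to next() gives a new location
out = []                       # external state: printed values
store = {}                     # internal state: location -> value


def declare(env, names):
    """Return env extended with one new location per name; env itself is not modified."""
    new_env = dict(env)
    for name in names:
        new_env[name] = next(fresh)
    return new_env


env = {}
env1 = declare(env, ["x"])                 # block A: x -> L
store[env1["x"]] = 1                       # x := 1
env2 = declare(env1, ["x", "y"])           # block B: x -> L1 (shadows L), y -> L2
store[env2["x"]] = 2                       # x := 2
env3 = declare(env2, ["z"])                # block C: z -> L3
store[env3["z"]] = 3                       # z := 3
store[env3["y"]] = store[env3["z"]]        # y := z  (y is non-local in C)
out.append(store[env2["x"]] + store[env2["y"]])   # back in B: print(x + y)
out.append(store[env1["x"]])                      # back in A: print(x)
```

At the end, out contains 5 followed by 1, in agreement with steps 4 and 5. Going back to env2 and then env1 is what makes the inner x and the variable z unreachable again.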

Note:  The definition of the interpreter given in these pages is actually an informal version of a natural semantic definition.  The interpretation of a command, in fact, can be seen as a relation exec that, given a command c and an initial environment env and state s, gives as result another state s'. Using a notation analogous to the one used for expressions, this relation can be denoted by:
env, s  |-  c  exec s'
The semantic definition of the commands given above can be reformulated as inference rules defining this relation. For instance, the rule for assignment would be:

                env, s  |-  e   eval  v         env(x) =  L
       ______________________________________________

                  env, s  |-  x := e    exec   s[v/L]
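
In the same style, the rules for the conditional and for the while command could be written as follows. This is a sketch that follows the informal descriptions of these two commands; it assumes that boolean expressions evaluate to the values true and false, and it writes the rules in LaTeX notation instead of the plain-text layout used above.

```latex
% Conditional: one rule for each value of the test
\[
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{true} \qquad env, s \vdash c_1 \ \mathit{exec}\ s'}
     {env, s \vdash \mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2 \ \mathit{exec}\ s'}
\qquad
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{false} \qquad env, s \vdash c_2 \ \mathit{exec}\ s'}
     {env, s \vdash \mathbf{if}\ e\ \mathbf{then}\ c_1\ \mathbf{else}\ c_2 \ \mathit{exec}\ s'}
\]

% While: the loop stops when the test is false ...
\[
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{false}}
     {env, s \vdash \mathbf{while}\ e\ \mathbf{do}\ c \ \mathit{exec}\ s}
\]

% ... otherwise the body is executed and the loop is considered again in the new state
\[
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{true} \qquad env, s \vdash c \ \mathit{exec}\ s'
      \qquad env, s' \vdash \mathbf{while}\ e\ \mathbf{do}\ c \ \mathit{exec}\ s''}
     {env, s \vdash \mathbf{while}\ e\ \mathbf{do}\ c \ \mathit{exec}\ s''}
\]
```

Note that the environment env is the same in every premise and conclusion, reflecting the fact that the execution of a command never changes it.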