CSE 428: Lecture Notes 5

Imperative Programming

We will study the basic concepts of imperative programming by introducing a mini-language of commands.

Syntax

This language is generated by the following grammar, on top of the mini-language of expressions:

Commands

Cmd ::= Ide := Exp                        (assignment)
      | read(Ide)                         (input)
      | print(Exp)                        (output)
      | if BExp then Cmd else Cmd         (conditional)
      | begin Seqc end                    (block without declarations)
      | begin Seqd in Seqc end            (block with declarations)
      | while BExp do Cmd                 (indefinite iteration)
      | for Ide := NExp to NExp do Cmd    (definite iteration)

Sequence of commands

Seqc ::= Cmd | Seqc ; Cmd

Sequence of declarations of variables

Seqd ::= Decvar | Seqd ; Decvar
Decvar ::= Ide : Type
Type ::= bool | int

Semantics

As usual, we will use an abstract interpreter to describe the meaning of the constructs in this language. In order to give a more direct intuition, this time we will use a less formal description, i.e. we will give the inductive definition of the interpreter in English rather than in formulas.

It should be remarked that what we describe in the following corresponds to the structure of an interpreter-based implementation. This does not mean that every implementation has to be interpreted: it could just as well be compiled (or mixed), and the model would then be quite different. The important thing is that, whatever the implementation is, it respects the semantics. In other words, from the point of view of an external observer (user), it should not matter whether the language is implemented via an interpreter or via a compiler: in both cases, the result (output) obtained by running a program on a given input should be the same.

The reason why we choose to describe an interpreter rather than a compiler is simplicity: an interpreter can be described in a much more abstract and simple way than a compiler. The interpreter-based description does not need to worry about low-level, machine-dependent details such as memory management, so we can concentrate on the features of the language (independently of the machine).

In the interpretative model, one of the basic features of imperative programming is the notion of variable. Intuitively, a variable is a location of memory to which we can associate a name (identifier) and in which we can store a value. The association between names and locations is established by declarations; the association between locations and values is established by commands.

The interpreter maintains at run time two mappings:

the environment : the set of associations between identifiers of the program and locations
the internal state : the set of associations between locations and values

The precise meaning of a common sentence like "the variable x has value v" is:

The identifier x is associated to a location which contains the value v.
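The two mappings can be pictured with a pair of dictionaries. The following is only a sketch: representing locations as integers, and the names env, store and value_of, are our own assumptions and not part of the language definition.

```python
# Hypothetical representation: locations are plain integers.
env = {"x": 0, "y": 1}       # environment: identifier -> location
store = {0: 42, 1: True}     # internal state: location -> value


def value_of(name, env, store):
    """Look up a variable in two steps: identifier -> location -> value."""
    location = env[name]
    return store[location]


print(value_of("x", env, store))  # 42
```

Saying "x has value 42" thus means that env maps x to location 0, and that the state maps location 0 to 42.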

Besides the internal state, we consider also an external state, namely the state of the input and output devices. The internal and external state together will be called state.

The execution of a command depends on the current environment and state, and it has the effect of changing the state (never the environment!). More precisely, the evaluation of a command is a function that, given an initial environment env and state s, produces a final state s'.

Note that in an implementation based on compilation there is no environment, or, to be more precise, the environment is "encoded" in the object code: every occurrence of a variable is replaced with the relative address of a memory location.

It should also be remarked that, in an imperative language without blocks and procedures, there would be no need to distinguish between environment and state. The interpreter could maintain a mapping directly from variables to values. If we want to describe blocks and procedures, however, the simplest way to do it is to keep these two separate notions of environment and state.

We now define the semantics of each construct in our mini-language of commands. In the following, when describing the semantics of a command c, we will call "initial" the state and the environment immediately before the execution of c, and "final" the state immediately after the execution of c.

Assignment

Syntax: x := e
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by evaluating e and storing the result in the location associated to x in env.
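As a rough illustration of this rule, the sketch below treats an expression as a Python function of the environment and the state. That encoding, and the name exec_assign, are assumptions made only for this example.

```python
def exec_assign(x, e, env, store):
    """Execute x := e. Returns the final internal state; env is only read."""
    v = e(env, store)            # evaluate e in the initial state
    new_store = dict(store)      # copy, so the initial state stays inspectable
    new_store[env[x]] = v        # store v in the location associated to x
    return new_store


env = {"x": 0}
s = {0: 5}
x_plus_1 = lambda env, store: store[env["x"]] + 1   # the expression x + 1

s1 = exec_assign("x", x_plus_1, env, s)
print(s1)  # {0: 6}
```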

Input

Syntax: read(x)
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by taking the next data value available in the input stream and by storing it in the location associated to x in env. The value is "consumed" from the input stream in the sense that the next time we execute a read action, the data considered will be the next one in the stream. Note that this command modifies both the internal and the external state.
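A possible sketch of read, assuming the input stream is modelled as a Python list whose first element is the next available value. The behaviour on an empty stream is not specified in these notes; raising an error is our own choice.

```python
def exec_read(x, env, store, inp):
    """Execute read(x). Returns the new internal state and the remaining input."""
    if not inp:
        # Assumption: the notes do not say what happens on an empty stream.
        raise EOFError("input stream is empty")
    value, rest = inp[0], inp[1:]    # consume the first value
    new_store = dict(store)
    new_store[env[x]] = value
    return new_store, rest


env = {"x": 0}
store_after, input_after = exec_read("x", env, {0: None}, [7, 8])
print(store_after, input_after)  # {0: 7} [8]
```

Both components of the result differ from the initial ones: the internal state has a new value for x, and the external state has a shorter input stream.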

Output

Syntax: print(e)
Semantics: Let s, env be the initial state and environment. The final state is obtained from s by evaluating e and writing the result in the output stream.
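In the same spirit, print can be sketched by modelling the output stream as a list to which values are appended. The list encoding and the name exec_print are assumptions of this example.

```python
def exec_print(e, env, store, out):
    """Execute print(e). Only the output stream (external state) changes."""
    return out + [e(env, store)]


env = {"x": 0}
store = {0: 3}
x_squared = lambda env, store: store[env["x"]] ** 2   # the expression x*x

out_after = exec_print(x_squared, env, store, [])
print(out_after)  # [9]
```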

Conditional

Syntax: if e then c1 else c2
Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

evaluate e in s, env
if the result is true, then execute c1 in s, env; the resulting state is the final state
if the result is false, then execute c2 in s, env; the resulting state is the final state
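The steps above can be sketched as follows, assuming that both expressions and commands are encoded as Python functions of the environment and the state (an encoding chosen only for illustration).

```python
def exec_if(e, c1, c2, env, store):
    """Execute if e then c1 else c2: exactly one branch runs, in the initial state."""
    if e(env, store):
        return c1(env, store)
    return c2(env, store)


env = {"x": 0}
is_negative = lambda env, s: s[env["x"]] < 0              # x < 0
negate = lambda env, s: {**s, env["x"]: 0 - s[env["x"]]}  # x := 0 - x
keep = lambda env, s: s                                   # x := x

r1 = exec_if(is_negative, negate, keep, env, {0: -4})
r2 = exec_if(is_negative, negate, keep, env, {0: 3})
print(r1, r2)  # {0: 4} {0: 3}
```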

Block without declarations

Syntax: begin c₁; c₂; ... ; c_n end
Semantics: Let s₀, env be the initial state and environment. The final state is s_n, obtained as follows:

execute c₁ in s₀, env. Let s₁ be the resulting state.
execute c₂ in s₁, env. Let s₂ be the resulting state.
...
execute c_n in s_{n-1}, env. Let s_n be the resulting state.

Note that the environment never changes between commands at the same level in a sequence.

Example: Consider the block begin x := x + 1 ; y := x + 1 end
If x and y are both initially 0, their final values are 1 and 2 respectively.
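This example can be replayed with a small sketch in which the state is threaded from one command to the next while the environment stays fixed. The helper names exec_seq and assign are hypothetical.

```python
def exec_seq(commands, env, store):
    """Execute c1; ...; cn: each command starts in the state left by the previous one."""
    for c in commands:
        store = c(env, store)    # s_{k-1} -> s_k, always with the same env
    return store


def assign(x, e):
    """Build the command x := e as a function from (env, state) to state."""
    def run(env, store):
        new_store = dict(store)
        new_store[env[x]] = e(env, store)
        return new_store
    return run


env = {"x": 0, "y": 1}
s0 = {0: 0, 1: 0}                                   # x = 0, y = 0
x_plus_1 = lambda env, s: s[env["x"]] + 1

s2 = exec_seq([assign("x", x_plus_1), assign("y", x_plus_1)], env, s0)
print(s2)  # {0: 1, 1: 2}
```

The second assignment sees the value of x produced by the first one, which is why y ends up being 2 rather than 1.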

Indefinite iteration

Syntax: while e do c
Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

evaluate e in s, env
if the result is false, then exit: the final state is s
if the result is true, then execute c in s, env. Let s' be the resulting state. Execute again while e do c in s', env; the state it produces is the final state.

Note that in general the command c (body) should contain some instruction which modifies the value of some variable occurring in e, otherwise while e do c will loop forever (if it is entered at all).

Example: The following command prints the squares of all numbers from 1 to 100.

begin
x := 0 ;
while x < 100 do
       begin
       x := x +1;
       print(x*x)
       end
end
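The example above can be run against a sketch of the while rule. Here the recursive definition is unfolded into a Python loop, and the state is a dictionary holding both the internal state ("store") and the output stream ("out"); this encoding is an assumption made for the example.

```python
def exec_while(e, c, env, state):
    """Execute while e do c: test e, run c, and repeat in the resulting state."""
    while e(env, state):
        state = c(env, state)
    return state


env = {"x": 0}
initial = {"store": {0: 0}, "out": []}     # state after x := 0


def cond(env, st):
    return st["store"][env["x"]] < 100     # x < 100


def body(env, st):
    x = st["store"][env["x"]] + 1          # x := x + 1
    return {"store": {**st["store"], env["x"]: x},
            "out": st["out"] + [x * x]}    # print(x*x)


final = exec_while(cond, body, env, initial)
print(final["out"][:3], final["out"][-1], len(final["out"]))  # [1, 4, 9] 10000 100
```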

Example: The following command repeatedly reads a number from the input and prints its square, until a 0 is encountered.

begin
read(x);
while not x=0 do
       begin
       print(x*x);
       read(x)
       end
end

Definite iteration

This command is present in several imperative languages like Pascal, Algol, Modula etc., but not in C.

Syntax: for i := e1 to e2 do c
Semantics: Let s, env be the initial state and environment. The final state is obtained as follows:

evaluate e1, e2 in s, env, and let v1, v2 be their values
assign v1 to i (the type must be compatible)
as long as the value of i is smaller than or equal to v2, repeatedly execute c ; i := i+1. The first iteration is executed in env and in the state resulting from the assignment of v1 to i. Each subsequent iteration is executed in the state determined by the previous one. The environment, as usual, does not change between iterations.

Example: The following command prints the squares of all numbers from 1 to 100.

for i := 1 to 100 do print(i*i)

Note that e1 and e2 are evaluated once and for all before starting the iteration; thus even if the body c changes the variables occurring in e2, this will not influence the number of times that c will be executed. Furthermore, in Pascal the value of i cannot be modified inside c (it is a static error). Therefore, the number of times that c will be executed is known (at run time) before starting the iteration. Hence the name "definite iteration".

Example: Consider the following pieces of code:

n := 100 ; for i := 1 to n do begin print(i*i) ; n := n+1 end
n := 100 ; for i := 1 to n do begin print(i*i) ; n := n-1 end
n := 100 ; for i := 1 to n do begin print(i*i) ; i := i -1 end

The first command prints the squares of all numbers from 1 to 100. The final value of n is 200. The second command prints the squares of all numbers from 1 to 100, and the final value of n is 0. The third command is considered an error in Pascal. Usually it is detected at compile time.
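The first two cases can be checked with a sketch of the for rule in which the bounds are evaluated exactly once. The function names, the encoding of commands as Python functions, and the use of a Python list for the output are all assumptions of this example.

```python
def exec_for(i, e1, e2, body, env, store):
    """Execute for i := e1 to e2 do body, evaluating both bounds only once."""
    v1, v2 = e1(env, store), e2(env, store)      # fixed before the first iteration
    store = {**store, env[i]: v1}                # i := v1
    while store[env[i]] <= v2:
        store = body(env, store)                 # c
        store = {**store, env[i]: store[env[i]] + 1}   # i := i + 1
    return store


env = {"i": 0, "n": 1}


def run(delta):
    """Run: n := 100 ; for i := 1 to n do begin print(i*i) ; n := n + delta end."""
    out = []

    def body(env, store):
        i = store[env["i"]]
        out.append(i * i)                                     # print(i*i)
        return {**store, env["n"]: store[env["n"]] + delta}   # n := n + delta

    start = {0: None, 1: 100}                                 # n := 100
    final = exec_for("i", lambda env, s: 1, lambda env, s: s[env["n"]],
                     body, env, start)
    return len(out), final[env["n"]]


print(run(+1))  # (100, 200)
print(run(-1))  # (100, 0)
```

In both runs the body is executed 100 times, because the upper bound was fixed at 100 before the first iteration. The third case is not modelled here, since it is rejected statically in Pascal.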

Definite iteration vs indefinite iteration

Definite iteration always terminates (provided that each execution of the body terminates)
Definite iteration is more readable and gives information which can be useful for program optimization
Definite iteration is strictly less expressive: there are problems which can be solved by using the while command, but not by using the for command.

Declarations

Single declaration

Syntax: x : t
Semantics: Let s, env be the initial state and environment. The final environment env' is obtained by associating to x a new location L (allocation of L), suitable to contain data of type t. More formally, env' = env[L/x].
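A minimal sketch of a declaration, assuming that fresh locations are drawn from a global counter and that a newly allocated location holds None until it is assigned. Neither choice is fixed by these notes, and the type t is ignored here.

```python
import itertools

_fresh = itertools.count()    # hypothetical allocator: 0, 1, 2, ...


def declare(x, t, env, store):
    """Evaluate the declaration x : t. Returns env' = env[L/x] and the extended state."""
    loc = next(_fresh)                   # new location L
    new_env = {**env, x: loc}            # a previous binding of x, if any, is hidden
    new_store = {**store, loc: None}     # assumption: no default value is given
    return new_env, new_store


env0, s0 = {}, {}
env1, s1 = declare("x", "int", env0, s0)
env2, s2 = declare("x", "int", env1, s1)
print(env1, env2)  # {'x': 0} {'x': 1}
```

Declaring x twice yields two different locations. The old environment env1 is left untouched, so it can be used again once the inner declaration goes out of scope.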

Sequence of declarations

Syntax: d₁; d₂; ... ; d_n
Semantics: Let s, env₀ be the initial state and environment. The final environment is env_n, obtained as follows:

evaluate d₁ in s, env₀. Let env₁ be the resulting environment.
evaluate d₂ in s, env₁. Let env₂ be the resulting environment.
...
evaluate d_n in s, env_{n-1}. Let env_n be the resulting environment.

Block with declarations

Syntax: begin d in c₁; c₂; ... ; c_n end
Semantics: Let s₀, env be the initial state and environment. The final state is s_n, obtained as follows:

evaluate the sequence of declarations d in s₀, env. Let env' be the resulting environment.
execute c₁ in s₀, env'. Let s₁ be the resulting state.
execute c₂ in s₁, env'. Let s₂ be the resulting state.
...
execute c_n in s_{n-1}, env'. Let s_n be the resulting state.

Note that at the end of the block the environment reverts to env. The interpreter treats the environment according to a LIFO discipline: each time a block is entered, its declaration part is evaluated and new associations are added to the environment. (A new association for x hides (shadows) any association for x that was already present in the environment.) At the end of the block, the corresponding associations are eliminated.
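The LIFO treatment of the environment can be sketched as follows. The block works on a local copy of the environment and returns only a state, so the caller's environment is automatically the one in force again afterwards. The allocator (the next unused integer) and the fact that locations are never reclaimed are simplifying assumptions.

```python
def exec_block(decls, commands, env, store):
    """Execute begin decls in commands end. Returns the final state only."""
    local_env = dict(env)                 # env' starts as a copy of env
    for x in decls:
        loc = len(store)                  # fresh location (assumes keys 0..n-1)
        local_env[x] = loc                # shadows any outer binding of x
        store = {**store, loc: None}
    for c in commands:
        store = c(local_env, store)       # every command runs in env'
    return store                          # env itself was never modified


def assign_const(x, v):
    """Build the command x := v for a constant v."""
    return lambda env, store: {**store, env[x]: v}


env = {"x": 0}
s = {0: 1}                                # outer x = 1
s_after = exec_block(["x"], [assign_const("x", 2)], env, s)
print(s_after[env["x"]], env)             # 1 {'x': 0}
```

The inner x := 2 writes to the new location, so the outer x still has value 1 after the block.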

Local and non-local variables: The variables which are declared in d are called "local" to the block. All the others are non-local.

Scope: The way the environment is built is related to the notion of scope. The scope of a declaration is the part of the program in which this declaration is effective, and it is defined to be the body of the block in which the declaration occurs, with the exclusion of those sub-blocks in which the same identifier is re-declared. We can rephrase this by saying that the declaration which counts, for an occurrence of an identifier in a command, is the one in the innermost block which contains that occurrence.

Note that the notion of scope and the way the environment is treated is exactly the same as in the mini-language of expressions. And it is the same, as we will see, in functional programming.

The possibility of making names local to a block, together with the structured notion of scope, is one of the fundamental principles of structured programming. It is very convenient, especially for the modular development of large programs: two people developing different blocks (within the same program) can use their favorite names and, as long as they declare them local, they don't have to check that their names do not interfere.

Example: Consider the following command:

begin
      x : int
      in
      x := 1;
      begin
            x : int ; y : int
            in
            x := 2;
            begin
                  z : int
                  in
                  z := 3;
                  y := z
             end;
             print(x + y)
       end;
       print(x)
end

In this example, there are three nested blocks. Let us call them A (the outermost), B and C (the innermost). In A there is one local variable x. In B there are two local variables: x (distinct from the x of A) and y. In C there is a local variable z and two non-local variables x and y.

Let env be the initial environment, and s be the initial (internal) state.

Inside block A, after the declaration and the execution of x := 1 the environment is env1 = env[L/x] (where L is a new location), and the state is s1 = s[1/L]. (From now on, we will not use the tilde character anymore to distinguish values from their representation; it should be clear from the context).
Inside block B, after the declarations and the execution of x := 2 the environment is env2 = env1[L1/x][L2/y] (where L1, L2 are two new locations, and in particular they are different from L and different from each other), and the state is s2 = s1[2/L1].
Inside block C, after the declaration and the execution of z := 3 ; y := z the environment is env3 = env2[L3/z] (where L3 is a new location, in particular it is different from L, L1 and L2), and the state is s3 = s2[3/L3][3/L2].
Inside block B, just before the execution of print(x + y), the environment is env2 and the internal state is still s3 (except that, as a consequence of closing block C, the location L3 is now deallocated, i.e. it is free again, and it is not accessible anymore from the program. So the part of the state which is actually "visible" is s2[3/L2].) The effect of the command print(x + y) is to modify the external state by printing 5.
Inside block A, just before the execution of print(x), the environment is env1 and the internal state remains the same, except that now also the locations L1 and L2 are deallocated, i.e. set free again and not accessible anymore from the program. The part of the state which is "visible" thus coincides with s[1/L]. The effect of the command print(x) is to modify the external state by printing 1.
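The trace above can be reproduced with a small interpreter sketch. Commands are encoded as Python tuples, locations are integers (so L, L1, L2, L3 become 0, 1, 2, 3), and locations are never deallocated. All of these choices are assumptions of the sketch, not part of the definition given in these notes.

```python
def eval_exp(e, env, store):
    """Evaluate an integer constant, an identifier, or ("+", e1, e2)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return store[env[e]]              # identifier -> location -> value
    op, a, b = e
    assert op == "+"
    return eval_exp(a, env, store) + eval_exp(b, env, store)


def exec_cmd(c, env, store, out):
    """Execute a command, updating store and out in place; env is never modified."""
    tag = c[0]
    if tag == "assign":
        _, x, e = c
        store[env[x]] = eval_exp(e, env, store)
    elif tag == "print":
        out.append(eval_exp(c[1], env, store))
    elif tag == "block":
        _, decls, body = c
        local_env = dict(env)             # env' extends a copy of env
        for x in decls:
            local_env[x] = len(store)     # fresh location
            store[local_env[x]] = None
        for sub in body:
            exec_cmd(sub, local_env, store, out)
    else:
        raise ValueError(f"unknown command: {tag}")


block_c = ("block", ["z"], [("assign", "z", 3), ("assign", "y", "z")])
block_b = ("block", ["x", "y"],
           [("assign", "x", 2), block_c, ("print", ("+", "x", "y"))])
block_a = ("block", ["x"], [("assign", "x", 1), block_b, ("print", "x")])

store, out = {}, []
exec_cmd(block_a, {}, store, out)
print(out)    # [5, 1]
print(store)  # {0: 1, 1: 2, 2: 3, 3: 3}
```

The final store still contains the locations of B and C because this sketch does not reclaim them; they are simply unreachable from the environment of A, which matches the "visible part of the state" described above.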

Note: The definition of the interpreter given in these pages is actually an informal version of a natural semantics definition. The interpretation of a command, in fact, can be seen as a relation exec that, given a command c and an initial environment env and state s, gives as result another state s'. Using a notation analogous to the one used for expressions, this relation can be denoted by:

env, s |- c exec s'

The semantic definition of the commands given above can be reformulated as inference rules defining this relation. For instance, the rule for assignment would be:

env, s |- e eval v          env(x) = L
______________________________________________

env, s |- x := e exec s[v/L]
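In the same style, the informal descriptions of the conditional and of the sequential composition of two commands could be rendered as the rules below. This is our own transcription of the English definitions given earlier, written in LaTeX, and assumes that boolean expressions evaluate to the values true and false.

```latex
\[
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{true} \qquad env, s \vdash c_1 \ \mathit{exec}\ s'}
     {env, s \vdash \mathtt{if}\ e\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2 \ \ \mathit{exec}\ s'}
\qquad
\frac{env, s \vdash e \ \mathit{eval}\ \mathit{false} \qquad env, s \vdash c_2 \ \mathit{exec}\ s'}
     {env, s \vdash \mathtt{if}\ e\ \mathtt{then}\ c_1\ \mathtt{else}\ c_2 \ \ \mathit{exec}\ s'}
\]

\[
\frac{env, s \vdash c_1 \ \mathit{exec}\ s' \qquad env, s' \vdash c_2 \ \mathit{exec}\ s''}
     {env, s \vdash c_1 \,;\, c_2 \ \ \mathit{exec}\ s''}
\]
```

Note that the environment env is the same in the premises and in the conclusion of every rule, which reflects the fact that commands never change the environment.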