CSE 428: Lecture Notes 9


Memory management and activation records

In the compilation-based implementation of a language, the source code is translated into a program in the language of the machine. Usually one instruction in the source corresponds to several instructions at the machine level.

The variables of the source language are mapped into memory addresses.

Example

Source instruction
   x = x + 1;
Translated code (in some assembly language). Assume that x is mapped into the memory location 1010 and that operations can only be performed on registers. Let R1 be a register.
   LOAD 1010,R1  //copy content of location 1010 in R1
   ADD  R1,#1    //add constant 1 to R1
   STO  R1,1010  //copy content of R1 in location 1010
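
The effect of the three instructions can be checked with a small simulation. This is only a sketch: the interpreter, the tuple encoding of the instructions and the initial value 41 are assumptions made for illustration; only the mnemonics, the register R1 and the location 1010 come from the example.

```python
# Hypothetical interpreter for the three mnemonics used in the example.
def run(program, memory):
    """Execute a list of (opcode, operand, operand) tuples and return the memory."""
    registers = {}
    for op, a, b in program:
        if op == "LOAD":      # LOAD addr,reg : copy memory[addr] into reg
            registers[b] = memory[a]
        elif op == "ADD":     # ADD reg,#const : add a constant to reg
            registers[a] += b
        elif op == "STO":     # STO reg,addr : copy reg into memory[addr]
            memory[b] = registers[a]
        else:
            raise ValueError(f"unknown opcode {op}")
    return memory

X_ADDR = 1010                 # x is mapped to location 1010, as in the example
program = [
    ("LOAD", X_ADDR, "R1"),
    ("ADD", "R1", 1),
    ("STO", "R1", X_ADDR),
]
memory = run(program, {X_ADDR: 41})   # 41 is an arbitrary initial value of x
print(memory[X_ADDR])                 # prints 42
```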
One of the main issues is the association between variables and locations: in general it is impossible to know at compile time how many variables will be created at run time (because of recursive procedures containing local variables, and because of dynamic variables). Hence we cannot resolve all allocations at compile time. Additionally, we want to manage memory efficiently: we want to allocate storage for a variable only when it is needed, and to deallocate it as soon as possible.

The lifetime of a variable is the period of time during execution in which the variable has storage allocated for it.

We want the lifetime to match the period in which the variable is needed; it must be at least that long. Typically, we compromise and choose standard times for allocation and deallocation:

  1. Global variables: Lifetime is entire runtime of program
  2. Local variables (variables declared in a block/procedure): Lifetime is during activation of procedure
  3. User-allocated variables (aka dynamic variables - variables created with new and destroyed with delete): Lifetime is from user allocation to user deallocation
Correspondingly, we have the following storage allocation policies:
  1. Static Allocation (for globals only)
  2. Dynamic Allocation of local variables
  3. Dynamic Allocation of user-allocated variables

Stack-based allocation

This is the most widely used technique for imperative languages (although some imperative languages, like Pascal, also have implementations based on heap allocation).

In this kind of implementation, the memory of the machine is divided into three parts:

  1. A space for the globals and the code of the program
  2. The stack, for the locals
  3. The heap, for the dynamic variables
The division between 1 and 2 is only logical: physically the globals and the code are (usually) placed at the base of the stack.

We will discuss here how the stack is handled. Let us consider the various operations that take place when a procedure is called:

  1. Caller processes actual parameters (evaluation, address calculation) and stores them
  2. Caller stores some control information (e.g., return address, dynamic link)
  3. Control transferred from Caller to Callee
  4. Callee allocates storage for locals
  5. Callee executes
  6. Callee deallocates storage for locals
  7. Callee stores return value (if function)
  8. Control transferred back to Caller
  9. Caller deallocates storage used for control information and actual parameters
The memory space in which the parameters, control information, locals, etc. are stored is called an activation record (AR) or stack frame. At run time, a procedure call causes the procedure object to be bound to an AR, which is then stored on the stack.
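
The call sequence above can be sketched with an explicit stack of activation records. This is a simplified illustration, not the layout of any real compiler: the class ActivationRecord, its field names and the recursive procedure fact are assumptions, the caller and callee steps are folded into a single Python function, and the whole AR is popped in one step rather than in two.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ActivationRecord:              # hypothetical layout, for illustration only
    procedure: str
    params: dict                     # actual parameters (step 1)
    return_address: str              # control information (step 2), a label here
    dynamic_link: Optional[int]      # index of the caller's AR on the stack
    local_vars: dict = field(default_factory=dict)
    return_value: Optional[int] = None

stack = []      # the run-time stack of activation records
deepest = 0     # largest number of ARs present at the same time

def fact(n, return_address="main"):
    global deepest
    # Steps 1-3: store parameters and control information, transfer control.
    caller_index = len(stack) - 1 if stack else None
    stack.append(ActivationRecord("fact", {"n": n}, return_address, caller_index))
    deepest = max(deepest, len(stack))
    ar = stack[-1]
    # Steps 4-5: the callee allocates its locals and executes.
    if n == 0:
        ar.return_value = 1
    else:
        ar.local_vars["rest"] = fact(n - 1, return_address="fact")
        ar.return_value = n * ar.local_vars["rest"]   # step 7
    # Steps 6, 8, 9 (simplified): the AR is removed and control returns.
    stack.pop()
    return ar.return_value

result = fact(3)
print(result, deepest)   # prints 6 4
```

Each recursive call gets its own AR, so the four activations of fact that are alive at the same time each have their own copy of n. This is why the number of variables cannot be fixed at compile time.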

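Let us summarize the information that must be present in an AR, as it emerges from the steps above:

  1. The actual parameters
  2. The control information: the return address and the dynamic link (the address of the caller's AR)
  3. The local variables
  4. The return value (if the procedure is a function)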

For languages like C and C++, which have no nested procedure declarations, the information above is all we need in the AR.

In languages with nested procedures, however, the situation is more complicated. The next section is dedicated to this issue.

Nested procedures, static and dynamic scope

In a language that allows nested declarations of procedures, like Pascal, a procedure p might contain occurrences of variables which are neither local to p, nor global: they are local to some other procedure. Such occurrences are called non-local (in p).

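There are two possible scoping rules that determine which declaration is associated with an occurrence of a non-local variable x in p:

  1. Static scope: the declaration is the one in the closest procedure that textually encloses p and declares x
  2. Dynamic scope: the declaration is the one in the most recently activated procedure, still active, that declares x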

Almost all languages choose static scope, because it makes programs clearer and easier to understand. It is, however, more complicated to implement. In particular, we need an additional piece of information in the activation record: the so-called static link (aka access link).

Static link

The static link of an activation record of a procedure p contains the address of the most recent activation record on the stack of the procedure in which p is declared.

In this way, we can always find the address of a non-local variable x: we just follow the chain of static links until we "find" a declaration for x. (Actually, in real implementations the number of static links we need to traverse is determined statically, and the address of x within the AR is determined by a fixed offset.)
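
The access by a statically known number of links plus a fixed offset can be sketched as follows. The class Frame, the function load_nonlocal, the procedure names and the stored values are all hypothetical; an AR is reduced here to a list of slots and a static link.

```python
class Frame:
    """A simplified activation record: the locals and a static link."""
    def __init__(self, name, slots, static_link=None):
        self.name = name
        self.slots = slots              # locals stored at fixed offsets
        self.static_link = static_link  # AR of the enclosing procedure

def load_nonlocal(frame, hops, offset):
    """Follow `hops` static links from `frame`, then read the slot at `offset`."""
    for _ in range(hops):
        frame = frame.static_link
    return frame.slots[offset]

# Hypothetical nesting: p declares x (offset 0) and y (offset 1),
# q is declared in p, and t is declared in q.
p_frame = Frame("p", [10, 20])
q_frame = Frame("q", [30], static_link=p_frame)
t_frame = Frame("t", [], static_link=q_frame)

# Inside t, the variable y of p is two levels up, at offset 1.
print(load_nonlocal(t_frame, 2, 1))   # prints 20
```

Both arguments hops and offset would be constants emitted by the compiler, so no search by name is needed at run time.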

In dynamically-scoped languages we don't need a static link: the declaration valid for a variable x can be found by following the chain of the dynamic links.

Determining the static link

In languages with nested procedure declarations, there is usually the following restriction on a procedure call: in the tree representing the hierarchy of procedure declarations, the callee cannot be at a lower level than the caller, unless it is a child of the caller:

Example

procedure p                         p
   procedure q                     / \
      <body of q>;                /   \
   procedure r                   q     r
       procedure s                     |
          <body of s>;                 |
       <body of r>;                    s
   <body of p>;   
To determine the static link at run time, it is sufficient to associate with each procedure call, at compile time, the difference in level between the caller and the callee, plus 1. For instance, if s calls q, the number is 2. If s calls p, the number is 3. If r calls s, the number is 0. At run time, this number is the number of ARs that we have to traverse, starting from the AR of the caller and following the static links, in order to find the AR to which the static link of the callee must point.
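
The computation can be sketched for the tree of p, q, r and s shown above. The table LEVEL, the two functions and the representation of an AR as a dictionary are assumptions made for illustration.

```python
# Nesting levels of the procedures in the example (p is the outermost).
LEVEL = {"p": 0, "q": 1, "r": 1, "s": 2}

def links_to_traverse(caller, callee):
    """Compile-time number attached to a call: level difference plus 1."""
    return LEVEL[caller] - LEVEL[callee] + 1

def callee_static_link(caller_ar, hops):
    """Run-time step: follow `hops` static links starting from the caller's AR."""
    ar = caller_ar
    for _ in range(hops):
        ar = ar["static_link"]
    return ar

for caller, callee in [("s", "q"), ("s", "p"), ("r", "s")]:
    print(caller, "calls", callee, "->", links_to_traverse(caller, callee))
# prints:
# s calls q -> 2
# s calls p -> 3
# r calls s -> 0

# A stack in which p called r and r called s (ARs reduced to name and static link).
p_ar = {"name": "p", "static_link": None}
r_ar = {"name": "r", "static_link": p_ar}
s_ar = {"name": "s", "static_link": r_ar}

# When s calls q, the static link of the new AR of q must point to the AR of p.
print(callee_static_link(s_ar, links_to_traverse("s", "q"))["name"])   # prints p
```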

Example

   procedure p; 
      var x,y: integer;   // variables x,y local to p
      procedure q(y: integer);   // procedure q local to p 
         begin if x = y then r else write(x) end; // body of q 
      procedure r;        // procedure r local to p 
         var x : integer;    // variable x local to r
         begin x := 2; if y = 2 then q(x) else write(x) end;  // body of r 
      begin    // begin body of p
      x:= 1; 
      y:= 2; 
      q(x)
      end;     // end body of p
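An activation of p prints 1 under static scope and 2 under dynamic scope. In both cases the first call q(1) finds x = y and calls r. Under static scope, the y in r is the y of p (which is 2), so r calls q(2); there x is the x of p, and 1 is printed. Under dynamic scope, the y in r is the parameter y of the calling activation of q (which is 1), so r prints its own x, which is 2.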
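
The value printed by the example depends on the scoping rule, and it can be checked with a small simulation in which every AR carries both a static and a dynamic link. This is a sketch under simplifying assumptions: the class AR and the helpers holder and activate_p are hypothetical names, variables are looked up by name rather than by offset, and the static link of q and r is set directly to the AR of p.

```python
class AR:
    """Simplified activation record with both kinds of link."""
    def __init__(self, variables, static_link, dynamic_link):
        self.vars = variables
        self.static_link = static_link
        self.dynamic_link = dynamic_link

def holder(ar, name, rule):
    """Return the AR that declares `name`, following the links chosen by `rule`."""
    while name not in ar.vars:
        ar = ar.static_link if rule == "static" else ar.dynamic_link
    return ar

def activate_p(rule):
    printed = []
    get = lambda ar, name: holder(ar, name, rule).vars[name]

    def q(caller, y):                       # procedure q(y: integer)
        ar = AR({"y": y}, static_link=p_ar, dynamic_link=caller)
        if get(ar, "x") == get(ar, "y"):
            r(ar)
        else:
            printed.append(get(ar, "x"))    # write(x)

    def r(caller):                          # procedure r, with local x
        ar = AR({"x": None}, static_link=p_ar, dynamic_link=caller)
        holder(ar, "x", rule).vars["x"] = 2
        if get(ar, "y") == 2:
            q(ar, get(ar, "x"))
        else:
            printed.append(get(ar, "x"))    # write(x)

    # Body of p: x := 1; y := 2; q(x)
    p_ar = AR({"x": 1, "y": 2}, static_link=None, dynamic_link=None)
    q(p_ar, get(p_ar, "x"))
    return printed

print(activate_p("static"))    # prints [1]
print(activate_p("dynamic"))   # prints [2]
```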