CSE 428: Lecture Notes 9


Memory management and activation records

In the compilation-based implementation of a language, the source code is translated into a program in the language of the machine. Usually one instruction in the source corresponds to several instructions at the machine level.

The variables of the source language are mapped into memory addresses.

Example

Source instruction
   x = x + 1;
Translated code (in some assembly language). Assume that x is mapped into the memory location 1010 and that operations can only be performed on registers. Let R1 be a register.
   LOAD 1010,R1  //copy content of location 1010 in R1
   ADD  R1,#1    //add constant 1 to R1
   STO  R1,1010  //copy content of R1 in location 1010
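
The three instructions above can be traced with a toy register machine. This is only a sketch: the `memory` and `registers` dictionaries, the helper names `load`, `add` and `sto`, and the initial value 41 are assumptions made for illustration, not part of any real instruction set.

```python
# Toy register machine: an illustrative sketch, not a real instruction set.
memory = {1010: 41}       # location 1010 holds x (initial value chosen arbitrarily)
registers = {"R1": 0}

def load(addr, reg):
    """LOAD addr,reg: copy the content of a memory location into a register."""
    registers[reg] = memory[addr]

def add(reg, constant):
    """ADD reg,#constant: add a constant to a register."""
    registers[reg] += constant

def sto(reg, addr):
    """STO reg,addr: copy the content of a register into a memory location."""
    memory[addr] = registers[reg]

# x = x + 1;
load(1010, "R1")
add("R1", 1)
sto("R1", 1010)
print(memory[1010])  # 42
```

The variable x never appears in the translated code: only its address, 1010, does.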

One of the main issues is the association between variables and locations: in general it is impossible to know at compile time how many variables are going to be created at run time (because of recursive procedures containing local variables, and because of dynamic variables). Hence we cannot resolve all allocations at compile time. Additionally, we want to manage memory efficiently: we want to allocate storage for a variable only when it is needed, and we want to deallocate it as soon as possible.

We call the lifetime of a variable the period of time during execution in which the variable has storage allocated for it.

We want the lifetime to match the period in which the variable is needed; it must be at least that long. Typically, we compromise and choose standard times for allocation and deallocation:

  1. Global variables: Lifetime is entire runtime of program
  2. Local variables (variables declared in a block/procedure): Lifetime is during activation of procedure
  3. User-allocated variables (aka dynamic variables - variables created with new and destroyed with delete): Lifetime is from user allocation to user deallocation

Correspondingly, we have the following storage allocation policies:

  1. Static Allocation (for globals only)
  2. Dynamic Allocation of local variables
  3. Dynamic Allocation of user-allocated variables

Stack-based allocation

This is the most widely used technique for imperative languages (although some imperative languages, such as Pascal, also have implementations based on heap allocation).

In this kind of implementation, the memory of the machine is divided into three parts:

  1. A space for the globals and the code of the program
  2. The stack, for the locals
  3. The heap, for the dynamic variables

The division between 1 and 2 is only logical: physically the globals and the code are (usually) placed at the base of the stack.

We will discuss here how the stack is handled. Let us consider in detail the various operations that take place when a procedure is called. We describe what happens in the standard implementation of C and similar languages; in other languages there may be some differences, mainly due to different parameter-passing strategies. We consider here call by value and call by value-result.

  1. Caller allocates storage for formal parameters
  2. Caller evaluates the actual parameters and stores their value in the locations of the formal parameters
  3. Caller allocates storage for some control information (e.g., return address, dynamic link) and stores them.
  4. Control transfers from Caller to Callee
  5. Callee allocates storage for locals and temporaries
  6. Callee executes
  7. Callee deallocates storage for locals and temporaries
  8. Callee stores return value (if function)
  9. Callee stores result of value-result parameters (if any)
  10. Control transfers back to Caller
  11. Caller deallocates storage used for control information and parameters and for return value (if function)

The memory space in which the parameters, control information, locals, etc. are stored is called the Activation Record (AR) or stack frame. At run time, a procedure call causes the procedure object to be bound to a new AR, which is pushed onto the stack. More precisely:

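Typically, an AR contains storage (locations) for:

  1. The parameters
  2. The control information (e.g., return address, dynamic link)
  3. The locals and temporaries
  4. The return value (if function)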

For languages like C and C++, which have no nested procedure declarations, the information above is all we need in the AR.
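
The call sequence and the content of an AR can be sketched with an explicit stack. This is a simplified simulation under stated assumptions: the helper `call`, the field names of the AR dictionary, and the use of a recursive `fact` procedure are all chosen for illustration, and the dynamic link is represented as the stack index of the caller's AR rather than as a machine address.

```python
stack = []       # the run-time stack; the last element is the top
max_depth = 0    # largest number of ARs on the stack at the same time

def call(proc, args, return_address):
    """Simulate one procedure call with an explicit activation record."""
    ar = {
        "params": dict(args),                 # storage for the parameters
        "return_address": return_address,     # control information
        "dynamic_link": len(stack) - 1 if stack else None,  # index of the caller's AR
        "locals": {},                         # filled in by the callee
        "return_value": None,
    }
    stack.append(ar)    # the new AR goes on top of the stack
    proc(ar)            # control transfers to the callee, which executes
    stack.pop()         # the AR is deallocated when control returns
    return ar["return_value"]

def fact(ar):
    """Body of a recursive factorial that uses only the storage in its AR."""
    global max_depth
    max_depth = max(max_depth, len(stack))
    n = ar["params"]["n"]
    if n <= 1:
        ar["return_value"] = 1
    else:
        ar["locals"]["rest"] = call(fact, {"n": n - 1}, "inside fact")
        ar["return_value"] = n * ar["locals"]["rest"]

result = call(fact, {"n": 4}, "main")
print(result, max_depth, len(stack))  # 24 4 0
```

Each of the four activations of `fact` gets its own AR, which is why the number of locations needed cannot be fixed at compile time.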

In languages with nested procedures, however, the situation is more complicated. The next section discusses this issue.

Nested procedures, static and dynamic scope

In a language that allows nested declarations of procedures, like Pascal, a procedure p might contain occurrences of variables which are neither local to p, nor global: they are local to some other procedure. Such occurrences are called non-local (in p).

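There are two possible scoping rules that determine which declaration is associated with an occurrence of a non-local variable x in p:

  1. Static scope: the declaration of x is the one in the closest procedure that textually encloses p and declares x.
  2. Dynamic scope: the declaration of x is the one in the most recently activated procedure that declares x and is still active.

Almost all languages choose static scope, because it makes programs clearer and easier to understand. It is, however, more complicated to implement. In particular, we need an additional piece of information in the activation record: the so-called static link (aka access link).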

Static link

The static link of an activation record of a procedure p contains the address of the most recent activation record on the stack of the procedure in which p is declared.

In this way, we can always find the address of a non-local variable x: we just follow the chain of the static links until we "find" a declaration for x. (Actually, in real implementations the number of static links we need to traverse is determined statically, and the address for x in the AR is determined by a fixed offset.)
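
The parenthetical remark above can be illustrated with a small sketch in which a non-local variable is identified by a pair (number of static links to traverse, offset in the AR). The class `AR`, the function `access`, and the values stored in the slots are assumptions made for illustration.

```python
class AR:
    """A minimal activation record: a static link plus numbered slots."""
    def __init__(self, name, static_link, slots):
        self.name = name
        self.static_link = static_link   # AR of the enclosing procedure, or None
        self.slots = slots               # a variable lives at a fixed offset

def access(ar, hops, offset):
    """Follow `hops` static links starting from ar, then read the slot at `offset`."""
    for _ in range(hops):
        ar = ar.static_link
    return ar.slots[offset]

# p declares x (offset 0) and y (offset 1); r is nested in p; s is nested in r.
ar_p = AR("p", None, [10, 20])
ar_r = AR("r", ar_p, [30])
ar_s = AR("s", ar_r, [])

print(access(ar_s, 2, 1))  # y of p, two levels out: 20
print(access(ar_s, 1, 0))  # the local of r, one level out: 30
```

Both numbers of each pair are known at compile time, so no search by name is needed at run time.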

In dynamically-scoped languages we don't need a static link: the declaration valid for a variable x can be found by following the chain of the dynamic links.

Determining the static link

In languages with nested procedure declarations, there is usually the following restriction on procedure calls: in the tree representing the hierarchy of procedure declarations, the callee cannot be at a lower level than the caller, unless it is a child of the caller:

Example

procedure p                         p
   procedure q                     / \
      <body of q>;                /   \
   procedure r                   q     r
       procedure s                     |
          <body of s>;                 |
       <body of r>;                    s
   <body of p>;   

We have that, for instance, p can call q and r but not s, whereas s can call p, q, r and itself.

In order to determine the static link at run time, it is sufficient to associate with each procedure call, at compile time, the level of the caller minus the level of the callee, plus 1. For instance, if s calls q, the number is 2. If s calls p, the number is 3. If r calls s, the number is 0. At run time, this number indicates how many ARs we have to traverse, starting from the AR of the caller and following the static links, in order to find the AR to which the static link of the callee must point.
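
The computation can be sketched as follows for the hierarchy p, q, r, s. The names `LEVEL`, `hops` and `static_link_for_callee` are hypothetical, numbering the level of p as 0 is an assumption (only differences between levels matter), and ARs are reduced to dictionaries holding a name and a static link.

```python
# Nesting levels in the p/q/r/s example; taking p as level 0 is an assumption.
LEVEL = {"p": 0, "q": 1, "r": 1, "s": 2}

def hops(caller, callee):
    """Compile-time number: level of the caller minus level of the callee, plus 1."""
    return LEVEL[caller] - LEVEL[callee] + 1

def static_link_for_callee(caller_ar, n):
    """Run time: traverse n static links starting from the caller's AR."""
    ar = caller_ar
    for _ in range(n):
        ar = ar["static_link"]
    return ar

# A stack snapshot in which p has called r and r has called s.
ar_p = {"name": "p", "static_link": None}
ar_r = {"name": "r", "static_link": ar_p}
ar_s = {"name": "s", "static_link": ar_r}

print(hops("s", "q"), hops("s", "p"), hops("r", "s"))         # 2 3 0
print(static_link_for_callee(ar_s, hops("s", "q"))["name"])   # p
print(static_link_for_callee(ar_r, hops("r", "s"))["name"])   # r
```

When s calls q, the static link of the new AR of q points to the AR of p, where q is declared; when r calls s, it points to the AR of r itself.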

Example

   procedure p; 
      var x,y: integer;   // variables x,y local to p
      procedure q(y: integer);   // procedure q local to p 
         begin if x = y then r else write(x) end; // body of q 
      procedure r;        // procedure r local to p 
         var x : integer;    // variable x local to r
         begin x := 2; if y = 2 then q(x) else write(x) end;  // body of r 
      begin    // begin body of p
      x:= 1; 
      y:= 2; 
      q(x)
      end;     // end body of p

We have that an activation of p prints 1 under static scope and 2 under dynamic scope.
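
The example above can be traced with a small simulation that resolves each non-local variable by following either the static links or the dynamic links. This is a sketch under stated assumptions: the function `run` and its helpers are hypothetical, variables are looked up by name rather than by offset, and the AR of p is passed explicitly to q and r as their static link, which is valid here only because both are declared directly in p.

```python
def run(static_scope):
    """Simulate an activation of p and return the list of values written."""
    out = []
    stack = []   # activation records, most recent last

    def push(variables, static_link):
        ar = {"vars": variables,
              "static_link": static_link,
              "dynamic_link": stack[-1] if stack else None}
        stack.append(ar)
        return ar

    def value(ar, var):
        # Follow the chosen chain of links until an AR declaring var is found.
        link = "static_link" if static_scope else "dynamic_link"
        while var not in ar["vars"]:
            ar = ar[link]
        return ar["vars"][var]

    def p():
        ar = push({"x": 1, "y": 2}, None)    # x := 1; y := 2
        q(value(ar, "x"), ar)                # q(x)
        stack.pop()

    def q(y, ar_p):
        ar = push({"y": y}, ar_p)            # q is declared in p
        if value(ar, "x") == value(ar, "y"):
            r(ar_p)
        else:
            out.append(value(ar, "x"))       # write(x)
        stack.pop()

    def r(ar_p):
        ar = push({"x": 2}, ar_p)            # local x, then x := 2
        if value(ar, "y") == 2:
            q(value(ar, "x"), ar_p)          # q(x)
        else:
            out.append(value(ar, "x"))       # write(x)
        stack.pop()

    p()
    return out

print(run(static_scope=True), run(static_scope=False))  # [1] [2]
```

The two runs differ at the test y = 2 in r: under static scope y is the variable of p (value 2), while under dynamic scope it is the parameter of the calling activation of q (value 1).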