Run-time Storage Management

Our goal at this point is to learn enough about run-time organization to know what information the compiler must collect while processing declarations.
The key idea is that at run-time all memory will be allocated in blocks. There will be frames (i.e. activation records) for method invocations and objects allocated on the heap in response to the construction of new objects and arrays.
Ultimately, our understanding of run-time storage management will have to include knowledge of how to generate code to determine the address of any block of storage that contains a variable we want to access.
- That is, we will have to figure out where we can find pointers to the current method's frame, the current object, and objects corresponding to all the scopes (i.e. classes) that textually surround the code of the executing method.
For now, however, we will simply assume that we can find the necessary pointers to blocks of memory. We will concentrate on the organization within each block, so that while processing declarations we collect enough information to determine the offset of each variable or method within the blocks of memory we allocate.
As a start, we can identify four types of structures that will be allocated for Woolite programs. (The good news is that these will also be sufficient for many other languages.)
Method Activation Records
Each time a method is invoked, we will push a block of storage to hold the method's parameters and local variables onto a stack. This block of memory will also hold things like the return program counter and saved values of some critical pointers (like the pointer to the caller's frame and the previously active object).
Object Records
Each time a program constructs an object (not an array), we will allocate space on the heap to hold all of the instance variables declared in the object's class, including inherited variables, even though code in the subclass cannot access them.
Dispatch Tables
When a method is invoked, we need a way to determine how to interpret the name of the method given the actual type of the object involved. To make this possible, each object structure allocated on the heap will contain a pointer to a table of pointers to the code for the methods associated with the type of the object. These tables don't need to be dynamically allocated. There will be one per class, and they can be statically allocated before execution begins.
Arrays
We will also need to allocate blocks of storage to hold the elements of arrays when a program constructs them. We distinguish these from the allocation of single objects because they won't need to contain dispatch table pointers.
For variables (i.e. instance variables, local variables, and formal parameters) stored within activation records and objects, the semantic processor's task is to determine the offset to each variable within the block of memory allocated.
- Variables are often allocated in order within the block to which they belong as their declarations are processed.
- To know each variable's offset, all the compiler has to do is keep track of the total amount of space used by all of the previously processed variables.
- In general, this will involve being able to determine the amount of memory required for each variable, i.e. required for the type associated with each variable.
- In Java-like OO languages, this is made somewhat easy by the fact that all values of non-primitive types are represented by pointers to their own block of memory on the heap and therefore all variables of all such types require the same amount of space, the space required to store a pointer.
- In Woolite and on the 34000, there is only one primitive type, int, and it takes the same amount of space as a pointer, 1 word. Therefore, all variables take one word, and the compiler does not have to examine a variable's type to determine how much space it requires. The amount of space consumed by n variables is just n.
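The bookkeeping described in the list above can be sketched in a few lines of Python. This is only an illustration, not the course's compiler: the `BlockLayout` class and its method names are hypothetical, and it assumes, as in Woolite, that every variable occupies exactly one word.

```python
class BlockLayout:
    """Assigns offsets to variables in declaration order (hypothetical helper)."""

    WORD = 1  # from the text: every Woolite variable takes one word

    def __init__(self):
        self.offsets = {}   # variable name -> offset within the block
        self.size = 0       # words used by the variables processed so far

    def declare(self, name):
        # The new variable's offset is the space used by all earlier variables.
        self.offsets[name] = self.size
        self.size += self.WORD
        return self.offsets[name]


layout = BlockLayout()
for name in ["count", "total", "next"]:
    layout.declare(name)
```

For a language with variables of several sizes, `declare` would add the size of the variable's type instead of the constant `WORD`.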
The actual layout of activation records for methods is strongly influenced by the hardware support for function/method calls on the target system.
- Most machine architectures have special instructions to support calling procedures and allocating memory for their variables. These instructions require the code to follow the architecture's conventions about where variables are located relative to the register that points to the activation record.
To get a feel for the impact of a machine architecture on variable allocations, recall the stack frame layout conventions of the MC68000/WC34000:
A compiler for such a machine needs to keep track of offsets for local variables and parameters separately since they "grow" in different directions.
For Woolite, it is sufficient to simply count the number of parameters and local variables. It is then easy to compute the offset to the nth variable or nth parameter.
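As a rough sketch of why counting suffices, the following Python functions map the position of a parameter or local variable to an offset relative to the frame pointer. The constants are assumptions chosen for illustration (parameters at positive offsets beyond a two-word linkage area, locals at negative offsets) and may not match the actual WC34000 frame. Only the idea that the two kinds of variables grow in opposite directions comes from the text.

```python
# Hypothetical layout constants; the real WC34000 frame may differ.
LINKAGE_WORDS = 2   # assumed: saved frame pointer and return address
WORD = 1            # from the text: every variable occupies one word


def param_offset(n):
    """Offset of the nth parameter (n = 0, 1, ...), above the linkage area."""
    return LINKAGE_WORDS + n * WORD


def local_offset(n):
    """Offset of the nth local variable (n = 0, 1, ...), below the frame pointer."""
    return -(n + 1) * WORD
```

Under these assumptions, the symbol table only needs to record each variable's kind and its position among variables of that kind.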
The layout of blocks of memory on the heap is slightly complicated by the ability to extend classes.
- Although Woolite does not allow code within the methods of a subclass to access the variables declared in a superclass, a superclass method invoked on an object of the subclass can still reference the superclass variables. Each object of a given class must therefore include space for all variables declared in all of its superclasses.
- If class P is a superclass of B and P declares a variable x, then the offset to x must be the same in P objects as in B objects, because the code we generate for methods declared in P will use P's offset to x when x is referenced even if such a method is invoked on an object of class B.
- This simply means that when we start counting the number of variables in a subclass, the count should start out equal to the total number of variables declared in all superclasses.
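The counting rule for subclasses might look like the following Python sketch. The `ClassLayout` class is a hypothetical helper, not part of any Woolite compiler, and it assumes one word per variable and that a superclass is fully processed before its subclasses.

```python
class ClassLayout:
    """Instance-variable offsets for one class (hypothetical helper)."""

    def __init__(self, name, superclass=None):
        self.name = name
        self.superclass = superclass
        self.offsets = {}
        # Start counting where the superclass stopped, so inherited
        # variables keep the offsets they have in superclass objects.
        self.size = superclass.size if superclass else 0

    def declare(self, var):
        self.offsets[var] = self.size
        self.size += 1  # one word per variable, as in Woolite
        return self.offsets[var]

    def lookup(self, var):
        """Find a variable's offset, searching this class and then its superclasses."""
        layout = self
        while layout is not None:
            if var in layout.offsets:
                return layout.offsets[var]
            layout = layout.superclass
        raise KeyError(var)


p = ClassLayout("P")
p.declare("x")
p.declare("y")

b = ClassLayout("B", superclass=p)
b.declare("z")
```

Here `x` sits at the same offset whether it is found through `p` or through `b`, and a `B` object needs three words for its variables. The `lookup` method exists only to show that the offsets agree; it does not model Woolite's rule that subclass code cannot access superclass variables.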
In an OO language that supports inheritance, we cannot in general implement method invocations by branching directly to the code for the correct method. Instead, we must build tables of pointers to the code for each class's methods and dynamically select the correct methods to invoke at run-time.
- Again, assume that P is a superclass of B and that P defines a method, m, which is overridden by B.
- If the program includes a variable, o, of type P, then we can assign o to refer to either an object of type P or B (or any other subclass of P).
- As a result, when processing the invocation
```
o.m( ... )
```
  we cannot tell at compile time which version of m should be executed.
- The method executed will depend on the class of the object assigned to o when the invocation is executed, but not the precise object. Therefore we can solve this problem by building a table containing pointers to the method code to use for each method of each class. We will call this a dispatch table.
- To make this work, we have to be able to put the pointer to the code for a method like m at the same offset in the table of pointers for each subclass of P (including those that override m).
- Therefore, as with variables, we will basically assign offsets to methods by counting the total number of methods declared in a class and its superclasses, but only counting methods that do not override inherited methods. Instead, each method that overrides another will be assigned the same displacement as the method it overrides.
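The slot assignment described above can be sketched as follows. `MethodTable` is a hypothetical helper, and strings such as `"P.m"` stand in for the addresses of the generated code. The sketch assumes a superclass's table is complete before any subclass is processed.

```python
class MethodTable:
    """Dispatch-table slot assignment for one class (hypothetical helper)."""

    def __init__(self, name, superclass=None):
        self.name = name
        # Inherit every slot, initially pointing at the superclass's code.
        self.slots = list(superclass.slots) if superclass else []
        self.index = dict(superclass.index) if superclass else {}

    def declare(self, method):
        code = f"{self.name}.{method}"  # stands in for the address of the method's code
        if method in self.index:
            # Override: reuse the slot of the inherited method.
            self.slots[self.index[method]] = code
        else:
            # New method: take the next free slot.
            self.index[method] = len(self.slots)
            self.slots.append(code)
        return self.index[method]


p = MethodTable("P")
p.declare("m")
p.declare("n")

b = MethodTable("B", superclass=p)
b.declare("m")  # overrides P.m, so it reuses slot 0
b.declare("k")  # new in B, so it gets slot 2
```

After this runs, slot 0 of B's table points at B's version of m, slot 1 still points at the inherited n, and only k has added a slot.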
In addition to building these tables, we have to be able to find the correct table quickly when executing an invocation like
```
o.m( ... )
```
As a result, each object allocated on the heap will have to include a pointer to the dispatch table for its class.
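To make the run-time lookup concrete, here is a small Python simulation of what the code generated for such an invocation has to do. Everything in it is illustrative: objects are modeled as lists, the dispatch table pointer is assumed to live in the first word of each object, and the function and constant names are hypothetical.

```python
def p_m(obj):
    return "P.m"


def b_m(obj):
    return "B.m"


# One statically built table per class; m occupies slot 0 in both.
P_TABLE = [p_m]
B_TABLE = [b_m]

M_SLOT = 0      # offset of m in the table, fixed at compile time
TABLE_PTR = 0   # assumed position of the dispatch table pointer in an object


def new_object(table, num_vars):
    """Allocate an object: a table pointer followed by zeroed instance variables."""
    return [table] + [0] * num_vars


def invoke_m(o):
    """What the code generated for o.m( ... ) must do at run time."""
    table = o[TABLE_PTR]    # follow the object's dispatch table pointer
    method = table[M_SLOT]  # index by the method's compile-time offset
    return method(o)        # call the code, passing the object as the receiver
```

Calling `invoke_m` on an object built with `P_TABLE` runs P's version of m, and on one built with `B_TABLE` runs B's version, even though the calling code is identical. If the pointer does occupy the first word, every instance variable offset would be shifted by one word.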
Arrays don't require dispatch table pointers. Each array will be a collection of integers (which require no dispatch tables) or pointers to objects that each contain their own dispatch table pointers.

Computer Science 434
Department of Computer Science
Williams College
