We can solidify our understanding of the run-time layout of
memory (and get a head start on code-generation) by considering
how to transform the trees produced by the parser to represent
references to variables into trees that explicitly describe
the address calculations and memory references involved.
A reference to an identifier as a variable will be represented
in the syntax tree by an Nrefvar node with an Nident node
for the variable's name as its child.
To actually access a variable in memory at run-time, the
hardware must add the variable's displacement to the
frame pointer of the method, or to the pointer to the object, in which
the variable is stored.
We can think of an Nrefvar node as representing the
value stored in the memory location described by its
child.
We already have node types to represent addition (Nplus),
numeric constants (Nconst), and the pointer to the current
object (Nthis),
so if we add a node type to represent a reference to
the active method's frame (NFramePtr), we can explicitly describe
the steps required to access a variable.
There is also a field in the Nrefvar node that can be used to
hold the displacement to a variable relative to a given
memory location.
This leads to a straightforward transformation for simple
variable references:
For instance variables local to the class of the current
method, we just specify the offset to the
variable relative to the "this" pointer.
For variables declared as locals in a method and for
method parameters, we use the offset to the variable
relative to the method frame pointer.
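
The two cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class, the disp field name, and the SYMTAB table with its offsets are hypothetical stand-ins, not the actual Woolite data structures. The sketch stores the displacement in the Nrefvar node's field and makes the base pointer (Nthis or NFramePtr) its child.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    kind: str                                  # "Nrefvar", "Nident", "Nthis", ...
    children: list = field(default_factory=list)
    name: Optional[str] = None                 # identifier text (Nident only)
    disp: int = 0                              # displacement field (Nrefvar only)

# Hypothetical symbol table: name -> (storage class, offset in words).
SYMTAB = {"count": ("instance", 2), "i": ("local", 3), "n": ("param", 1)}

def transform_simple_ref(refvar: Node) -> Node:
    """Rewrite Nrefvar(Nident) as Nrefvar(disp) over Nthis or NFramePtr."""
    ident = refvar.children[0]
    storage, offset = SYMTAB[ident.name]
    # Instance variables are addressed from "this"; locals and
    # parameters are addressed from the method's frame pointer.
    base = Node("Nthis") if storage == "instance" else Node("NFramePtr")
    return Node("Nrefvar", [base], disp=offset)

def show(node: Node) -> str:
    """Render a tree as nested text, e.g. Nrefvar[+3](NFramePtr)."""
    label = node.kind + (f"[+{node.disp}]" if node.kind == "Nrefvar" else "")
    if not node.children:
        return label
    return label + "(" + ", ".join(show(c) for c in node.children) + ")"

before = Node("Nrefvar", [Node("Nident", name="i")])
after = transform_simple_ref(before)
print(show(after))   # Nrefvar[+3](NFramePtr)
```

The same address could be spelled out explicitly as an Nplus node over the base pointer and an Nconst node; using the displacement field just keeps the common case compact.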
Given the organization of the static links, all we need to
do for references to non-local variables is follow the chain
of static links up the correct number of levels. For example,
the following tree could be used to reference a variable declared
two levels above the current class.
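
As a rough sketch of such a chain, the Python below builds a reference that follows two static links before applying the variable's own displacement. It assumes, purely for illustration, that the static link sits at displacement 0 within each object (STATIC_LINK_DISP) and that the variable's offset is 4. Neither value comes from the Woolite layout, and the Node class is a hypothetical stand-in.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    disp: int = 0          # displacement field (meaningful on Nrefvar)

STATIC_LINK_DISP = 0       # assumed slot of the static link within an object

def nonlocal_ref(levels_up: int, offset: int) -> Node:
    """Build a reference to a variable `levels_up` enclosing scopes away."""
    base = Node("Nthis")
    for _ in range(levels_up):
        # Each hop loads the static link stored in the current object.
        base = Node("Nrefvar", [base], disp=STATIC_LINK_DISP)
    # The final Nrefvar fetches the variable itself.
    return Node("Nrefvar", [base], disp=offset)

def show(node: Node) -> str:
    """Render a tree as nested text."""
    label = node.kind + (f"[+{node.disp}]" if node.kind == "Nrefvar" else "")
    if not node.children:
        return label
    return label + "(" + ", ".join(show(c) for c in node.children) + ")"

tree = nonlocal_ref(levels_up=2, offset=4)
print(show(tree))  # Nrefvar[+4](Nrefvar[+0](Nrefvar[+0](Nthis)))
```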
Similar transformations can be applied to more complex variables.
Nsubs nodes can be transformed into trees that describe
the subscript calculations required. In such a transformed tree:
- The root will again be an Nplus node.
- The left subtree will just be a transformed version of the
  array sub-variable tree (as with the record sub-variable
  in the Nselect case).
- The right subtree will in general be an Ntimes node multiplying the
  subscript expression by the element size. In Woolite, since all
  array elements occupy just one word, we can leave out the Ntimes
  node.
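
A minimal sketch of the subscript transformation follows, assuming the array sub-variable has already been transformed and is passed in as a finished tree. The Node class, the "ARRAY" placeholder leaf, and the elem_size parameter are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    value: Optional[int] = None    # Nconst only

def transform_subs(array_tree: Node, subscript: Node, elem_size: int = 1) -> Node:
    """Replace Nsubs(array, subscript) with an explicit address sum."""
    if elem_size == 1:
        # One-word elements (the Woolite case): no scaling needed.
        scaled = subscript
    else:
        scaled = Node("Ntimes", [subscript, Node("Nconst", value=elem_size)])
    return Node("Nplus", [array_tree, scaled])

def show(node: Node) -> str:
    """Render a tree as nested text."""
    label = f"Nconst {node.value}" if node.kind == "Nconst" else node.kind
    if not node.children:
        return label
    return label + "(" + ", ".join(show(c) for c in node.children) + ")"

array_tree = Node("ARRAY")            # stand-in for the transformed array tree
index = Node("Nconst", value=5)

print(show(transform_subs(array_tree, index)))      # Nplus(ARRAY, Nconst 5)
print(show(transform_subs(array_tree, index, 4)))   # Nplus(ARRAY, Ntimes(Nconst 5, Nconst 4))
```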
There will be several advantages to making these transformations.
Because they replace special-purpose tree nodes (Nident and
Nsubs) with node types that would already be present
in the tree (Nplus, Ntimes), they reduce the number of cases
to be handled by later phases (code generation, optimization).
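
To see how few cases a later phase needs, here is a small evaluator over transformed trees, run against a simulated memory. It is a sketch only: the Node class, the list-of-words memory model, and every address and offset in it are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    value: int = 0     # Nconst payload
    disp: int = 0      # Nrefvar displacement

def evaluate(node, mem, this_ptr, frame_ptr):
    """Compute the value a transformed tree denotes."""
    kids = [evaluate(c, mem, this_ptr, frame_ptr) for c in node.children]
    if node.kind == "Nconst":
        return node.value
    if node.kind == "Nthis":
        return this_ptr
    if node.kind == "NFramePtr":
        return frame_ptr
    if node.kind == "Nplus":
        return kids[0] + kids[1]
    if node.kind == "Ntimes":
        return kids[0] * kids[1]
    if node.kind == "Nrefvar":
        # Load the word at (address described by child) + displacement.
        return mem[kids[0] + node.disp]
    raise ValueError(f"unexpected node kind: {node.kind}")

mem = [0] * 32
this_ptr, frame_ptr = 8, 20
mem[frame_ptr + 3] = 42        # a local at displacement 3
mem[this_ptr + 0] = 16         # assumed static link to the enclosing object
mem[16 + 4] = 99               # a variable in the enclosing object

local = Node("Nrefvar", [Node("NFramePtr")], disp=3)
outer = Node("Nrefvar", [Node("Nrefvar", [Node("Nthis")], disp=0)], disp=4)
total = Node("Nplus", [local, Node("Nconst", value=1)])

print(evaluate(local, mem, this_ptr, frame_ptr))   # 42
print(evaluate(outer, mem, this_ptr, frame_ptr))   # 99
print(evaluate(total, mem, this_ptr, frame_ptr))   # 43
```

Note that the evaluator has no branch for Nident or Nsubs; after the transformations, those node types no longer appear in the tree.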
There is also a "hook" included in the format of the Nrefvar node
to support later optimizations.
A terrifying consequence of the transformations I have suggested
is that once they are complete, there would be little information left
in the tree about which variables are being referenced by
the expressions in a program.
To avoid this, the Nrefvar node includes a field that you
may someday use to hold a pointer to a descriptor of
the variable being referenced.
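
One way to picture this hook is sketched below. The VarDescriptor record, its fields, and the descriptor field name are all hypothetical; the point is only that a later pass can recover which variables a transformed tree touches by reading the descriptors rather than decoding the address arithmetic.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class VarDescriptor:          # hypothetical descriptor record
    name: str
    storage: str              # "instance", "local", or "param"
    offset: int

@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)
    disp: int = 0
    descriptor: Optional[VarDescriptor] = None   # the "hook" field

def referenced_variables(node: Node) -> list:
    """Collect the names of variables referenced anywhere in a tree."""
    found = [node.descriptor.name] if node.descriptor else []
    for child in node.children:
        found.extend(referenced_variables(child))
    return found

i_ref = Node("Nrefvar", [Node("NFramePtr")], disp=3,
             descriptor=VarDescriptor("i", "local", 3))
count_ref = Node("Nrefvar", [Node("Nthis")], disp=2,
                 descriptor=VarDescriptor("count", "instance", 2))
expr = Node("Nplus", [i_ref, count_ref])

print(referenced_variables(expr))   # ['i', 'count']
```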