An Intermediate Form for Woolite Programs
The symbol table maintained by your compiler will consist of two main components. The first is a collection of dynamically allocated structures containing one element for each distinct identifier used in the program. We will refer to this collection of structures as the identifier table and to its elements as identifier descriptors. Each of these elements will contain a pointer to the character string representation of the identifier with which it is associated and several other link fields. This table will be created by the scanner.
The second component is a collection of dynamically allocated structures including one element for each distinct declaration or definition in the program. We will refer to this collection of structures as the declaration table and its elements as declaration descriptors. These entries will be created and initialized by the semantic processing phase of your compiler. Each of these elements will include a pointer to the identifier descriptor for the identifier with which it is associated; attribute fields describing important characteristics of this declaration of the identifier (such as whether it is a method, a class, or a variable); and several additional link fields.
In the syntax trees produced as output of the syntactic analysis phase, identifiers will be represented by pointers to their identifier descriptors. During the semantic processing phase, references to identifiers within the syntax tree will be modified so that references to identifier descriptors are augmented with references to the appropriate declaration descriptors.
The process of determining which declaration should be associated with each identifier at different points in the program text will require the use of a third collection of dynamically allocated structures used to represent bindings between identifiers and declarations. In fact, your semantic processing phase should create two sets of binding descriptors.
The first collection of binding descriptors will be used for those identifiers that are associated with declarations based on static scope rules. Your code will include instructions to ensure that if an identifier has a meaning in the scope your compiler is currently processing, then its identifier descriptor will point to a binding descriptor that in turn points to the declaration descriptor associated with that identifier in the scope. As a result, when processing a class or method definition, you will create both declaration descriptors and binding descriptors as you process the declarations in the class or method body.
In order to ensure that bindings that are temporarily hidden while you process a scope are restored when processing of the scope is completed, your code will maintain two types of collections of binding descriptors. For every identifier, you will maintain a stack of all of the active, statically scoped binding descriptors for that identifier. For each scope, you will maintain a list of all of the binding descriptors created for that scope. When you leave a scope, your code will pop the binding stacks of all of the identifiers that were bound in the scope to restore the bindings that were in force in the outer scope.
You will also create binding descriptors for identifiers used as method names. The associations of names with methods do not follow simple nested scope rules. When the code in a Woolite program invokes a method on an object, your compiler must be able to access the declaration descriptor for the method within the class of the object on which it is invoked. That class may not even be accessible within the current scope.
Your code must maintain a hash table used to locate the declaration descriptors for method names that appear in such invocations. Each "bucket" in this hash table will be a list of method name binding descriptors. Each method name binding descriptor will include pointers to the declaration descriptor for the method and a class in which the method was defined or inherited. Given a method name and a pointer to the declaration descriptor for a class, this table will enable one to locate the declaration descriptor for the method if the method is indeed associated with the specified class. This search structure will be created and maintained by the semantic processing phase of your compiler.
The format of the structures used to hold identifier descriptors, declaration descriptors, and binding descriptors is discussed below, after the specification of the syntax trees for Woolite.
As discussed in class, there is a significant difference between the internal nodes of a syntax tree and most of its leaves. Within an internal node, one finds a phrase type and pointers to sub-trees. The leaves, on the other hand, usually hold information about identifiers and constants. In fact, in class I have suggested that rather than actually having separate nodes for the leaves, one could use symbol table entries for leaf nodes.
We will not actually do this in the compilers you build. The reason is a simple, practical one. To generate good error messages, one needs to keep information about where in the source program the text that corresponds to each sub-tree of the syntax tree can be found. We will do this by storing in each node the line number on which the first token that belongs to the phrase the node represents was found. This cannot be done for identifiers if all occurrences of an identifier are represented by a single symbol table entry. Instead, we represent identifiers by nodes that contain the line numbers on which they were found and pointers to the appropriate symbol table entries. Similar nodes will be used for constants.
As part of semantic processing (or as the first step in code generation), you will rewrite the trees that the parser produces for variable references. Basically, while the parser creates trees based on the syntactic structure of the source code, the code generator would prefer trees corresponding closely to the capabilities of the underlying hardware. Variable references, particularly subscripted variables, can be reconstructed by the semantic processing routines so that they explicitly describe much of the addressing arithmetic required by the variable references they represent.
Two special node types are used to support this translation of variable reference subtrees. The first is an internal node type used to represent the root of a variable reference subtree. These nodes will each hold a single pointer to the subtree that describes the variable reference. The other is a node type used to represent references to the frame of the currently executing method. Such method frame nodes do not appear in the trees produced by the parser but are needed to translate variable reference subtrees into a form that explicitly describes the required address arithmetic. These nodes will always appear as leaves in the tree.
This leads to a syntax tree with five distinct node types. As a result, to specify the general "type" of a syntax tree node, we use the C union type node described below (see footnote 1).
/* Union type that combines the 6 structure types used to describe */
/* tree nodes */
typedef union nodeunion {
struct unknode unk;
struct internalnode internal;
struct identnode ident;
struct iconstnode iconstant;
struct sconstnode sconstant;
struct refvarnode var;
} node;
Each of the six node structure types includes three common fields: a field specifying the node's phrase type (type; see footnote 2), a field specifying the line of the source code on which the first token of the phrase represented by the node's subtree occurred (line), and an operand field that may be used in later phases of the compiler.
The type unknode, whose definition is shown below,
/* The type 'unknode' provides a template that can be used to */
/* access the common components found in all node types when */
/* the actual type of the node is unknown. */
struct unknode {
nodetype type;
int line;
union opdesc * operand;
};
allows one to reference these fields in situations where the actual type of the node is not yet known. For example, if `root' is a pointer to a node of unknown type, one can use the expression `root->unk.type' to determine its phrase type. One could also use the expression `root->internal.type' or `root->ident.type', but these expressions misleadingly suggest that the type of the node is already known. The type unknode is provided to support clear coding.
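The idiom described above can be sketched as follows. This is only an illustration under stated assumptions: the enumeration is a hypothetical subset of nodetype, the union is trimmed to three members, the identifier and declaration pointers are replaced by void pointers, and the helper iconst_value_or is not part of the provided code.

```c
#include <stddef.h>

/* Hypothetical subset of the handout's nodetype enumeration. */
typedef enum { Nident, Niconst, Nsconst } nodetype;

union opdesc;   /* opaque here; only used through pointers */

struct unknode    { nodetype type; int line; union opdesc *operand; };
struct identnode  { nodetype type; int line; union opdesc *operand;
                    void *ident; void *decl; };   /* stand-in pointer types */
struct iconstnode { nodetype type; int line; union opdesc *operand;
                    int value; int ischar; };

/* Trimmed version of the node union (assumption: three members only). */
typedef union nodeunion {
    struct unknode    unk;
    struct identnode  ident;
    struct iconstnode iconstant;
} node;

/* Return the value of an integer constant node, or 'fallback' when  */
/* the node is missing or is some other kind of node.                */
int iconst_value_or(const node *root, int fallback)
{
    if (root == NULL)
        return fallback;
    switch (root->unk.type) {           /* type not yet known: use unk */
    case Niconst:
        return root->iconstant.value;   /* variant is now known */
    default:
        return fallback;
    }
}
```

The switch reads the phrase type through unk and only then touches a specific member, which mirrors the coding style the handout recommends.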
The structure type internalnode describes the nodes used to represent almost all of the internal nodes of the tree and several types of leaf nodes.
struct internalnode {
nodetype type;
int line;
union opdesc * operand;
union nodeunion *child[MAXKIDS]; /* pointers to the node's sub-trees */
};
In addition to the common type and line components found in all nodes, a node of type internalnode includes a component child, which is an array of pointers to its children. The number of children of a given node can be determined from its node type. The syntactic analysis routines provided conserve memory by only allocating space for the child pointers actually used by a given internal node. Thus, if this document indicates that a node should only have 2 children, its third child pointer should not be used for any purpose.
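Because the child count follows from the phrase type, a small lookup function can guard every access to child. The sketch below is an assumption-laden illustration: the enumeration lists only a subset of phrase types, and child_count is a hypothetical helper. The counts themselves are taken from the table of phrase types in this document.

```c
/* Hypothetical subset of the nodetype enumeration. */
typedef enum {
    Nclass, Ndecllist, Nvardecl, Nmethod, Nbody, Nstmtlist,
    Nasgn, Nretn, Nif, Nwhile, Nnot, Nplus, Nthis
} nodetype;

/* Number of child pointers that are valid for a node of type t.     */
/* Returns -1 for a value outside the subset handled here.           */
int child_count(nodetype t)
{
    switch (t) {
    case Nmethod:
        return 4;
    case Nclass:
    case Nif:
        return 3;
    case Ndecllist:
    case Nvardecl:
    case Nbody:
    case Nstmtlist:
    case Nasgn:
    case Nwhile:
    case Nplus:            /* all binary operators take 2 children */
        return 2;
    case Nretn:
    case Nnot:             /* all unary operators take 1 child */
        return 1;
    case Nthis:            /* leaf stored as an internalnode */
        return 0;
    }
    return -1;
}
```

A tree walker can loop from 0 to child_count(type) - 1 and never read a child slot that was not allocated.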
There are three types of "internal nodes" that are really leaves of the tree. Nodes of type Null are used to represent occurrences of the expression null in the source program, nodes of type Nthis are used to represent the expression this, and nodes of type NFramePtr are used to represent references to the frame of the executing method. Nodes of these three types have no children.
The structure types identnode, iconstnode, and sconstnode are used to represent the leaves of the syntax trees produced by the parser. Declarations of the structure types are shown below:
/* Nodes of type 'ident' are used for leaf nodes corresponding */
/* to identifiers in the source code. The value in the */
/* 'type' component of such a node will always be 'Nident'. */
struct identnode {
nodetype type;
int line;
union opdesc * operand;
identdesc *ident; /* Pointer to associated identifier descriptor */
decldesc *decl; /* Pointer to associated declaration descriptor */
};
/* Nodes of type 'iconst' are used for leaf nodes corresponding */
/* to character and integer constants in the source code. */
/* The value in the 'type' component of such a node will */
/* always be 'Niconst'. */
struct iconstnode {
nodetype type;
int line;
union opdesc * operand;
int value; /* Integer value of the constant */
int ischar; /* True if this was a character constant */
};
/* Nodes of type 'sconst' are used for leaf nodes corresponding */
/* to string constants in the source code. */
/* The value in the 'type' component of such a node will */
/* always be 'Nsconst'. */
struct sconstnode {
nodetype type;
int line;
union opdesc * operand;
char * value; /* value of the constant */
};
Identifiers are represented by nodes of type identnode. The type component of such nodes will always be Nident. The ident and decl components of an identnode are pointers to the appropriate identifier descriptor and declaration descriptor for the identifier being referenced. The decl components of identnode nodes are set to NULL (the value 0) by the syntactic analyzer. During semantic analysis, the correct values should be stored in these fields.
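One plausible way to fill in a decl field during semantic analysis is to read the binding on top of the identifier's binding stack. The sketch below assumes that approach. The decldesc stand-in, the trimmed identnode (with int and void pointer in place of nodetype and union opdesc), and the helper resolve_ident are all hypothetical; only the field names ident, decl, bindStack, and descr come from this document.

```c
#include <stddef.h>

/* Stand-in for the declaration descriptor union (assumption). */
typedef struct decldesc_s { int line; } decldesc;

typedef struct scopedBinding {
    decldesc *descr;                      /* The bound declaration */
    int level;
    struct scopedBinding *bindStackNext;
    struct scopedBinding *scopeNext;
} scopedBinding;

typedef struct iddesc {
    char *name;
    struct iddesc *hashlink;
    struct scopedBinding *bindStack;      /* top of stack, or NULL */
} identdesc;

/* Trimmed identnode: type and operand use stand-in types. */
struct identnode {
    int type;
    int line;
    void *operand;
    identdesc *ident;
    decldesc *decl;
};

/* Store the currently visible declaration in n->decl.               */
/* Returns 1 if the identifier is bound, 0 if it is undeclared.      */
int resolve_ident(struct identnode *n)
{
    scopedBinding *top = n->ident->bindStack;
    n->decl = (top != NULL) ? top->descr : NULL;
    return n->decl != NULL;
}
```

A return value of 0 is the natural place to report an "undeclared identifier" error using n->line.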
There is one special group of identnodes produced by the syntactic analyzer. These are identnodes for the keyword int. Technically, int is a keyword rather than an identifier in Woolite. Treating it as an identifier that has been declared as a class, however, simplifies various parts of the compiler. Accordingly, occurrences of int will be represented by special identnodes in the syntax tree. In addition, I will provide a function called init_symtab that will create a declaration descriptor for int and link this declaration descriptor to the identifier descriptor for int through a binding descriptor.
Each iconstnode contains two fields beyond the common type and line fields. One is named value. It holds the integer value of the constant. The second is a field named ischar, which is used as a boolean flag indicating whether the constant found in the source code was a character or an integer. The type component of all such nodes will be Niconst.
Each sconstnode contains one field beyond the common type and line fields. It is named value. It holds a pointer to the character string that is the value of the constant. The type component of all such nodes will be Nsconst.
There is one additional node type related to subtrees representing references to variables. Its declaration is shown below. Tree nodes of type refvarnode are used to designate places where a value should be loaded from a memory address.
/* Refvar nodes are included by the parser as the roots of all */
/* variable reference subtrees. When created by the parser, the */
/* "baseaddr" field will either point to an Nsubs or */
/* Nident node. During semantic analysis the "baseaddr" subtree */
/* will be converted into a subtree describing the calculation of */
/* the memory address for the variable. A "displacement" field is */
/* included to hold a constant offset from the base address to the */
/* variable. */
struct refvarnode {
nodetype type;
int line;
union nodeunion
* baseaddr; /* Subtree describing base address calculation */
int displacement; /* Displacement to variable relative to base addr */
decldesc *vardesc; /* Declaration descriptor for referenced variable */
};
The baseaddr field of a refvarnode points to a subtree that describes the location in memory being referenced. The value of the displacement field gives a constant value to be added to the address of this location before accessing memory. This field is initialized to 0 in trees created by the parser. The vardesc field is intended to point to a declaration descriptor for the variable referenced. You may ignore this field for now. It may serve a purpose during an optimization phase later in the project.
As mentioned above, the phrase types Nident, Niconst, Nsconst, Nthis, NFramePtr, and Null are used to label nodes representing the leaves of the syntax tree. The phrase types Nrefvar and NFramePtr are used to identify the two special node types used to encode variable reference subtrees. All of the other node phrase names defined in the enumeration type nodetype are used to label internal nodes.
There are several important subgroups of node phrase types. One important group is the group of "list" phrases, including Nstmtlist, Ndecllist, and Nactuallist. These nodes are used to represent lists of items in the program. In all cases, such nodes take 2 children. The left child (child[0]) of a list node points to the first element of the list (i.e. a statement, expression, variable definition, or whatever element type is appropriate). The right child (child[1]) points to the remainder of the list. Its value is either NULL (= 0) or a pointer to another list node of the same type.
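A traversal of such a list is a simple loop over the child[1] links. The sketch below is a minimal illustration, assuming a value for MAXKIDS, a three-element subset of nodetype, and a union trimmed to two members. The helper count_list_elements is hypothetical, and skipping Nerror elements reflects the note in the phrase type table that error nodes appear only as list elements.

```c
#include <stddef.h>

#define MAXKIDS 4   /* assumed value; the real one comes from the headers */

/* Hypothetical subset of the nodetype enumeration. */
typedef enum { Nstmtlist, Nasgn, Nerror } nodetype;

union opdesc;
union nodeunion;

struct unknode      { nodetype type; int line; union opdesc *operand; };
struct internalnode { nodetype type; int line; union opdesc *operand;
                      union nodeunion *child[MAXKIDS]; };

typedef union nodeunion {
    struct unknode      unk;
    struct internalnode internal;
} node;

/* Count the usable elements of a list: child[0] is the element,     */
/* child[1] is the rest of the list (NULL at the end). Elements of   */
/* type Nerror are skipped.                                          */
int count_list_elements(const node *list)
{
    int count = 0;
    for (; list != NULL; list = list->internal.child[1]) {
        const node *elem = list->internal.child[0];
        if (elem != NULL && elem->unk.type != Nerror)
            count++;
    }
    return count;
}
```

The same loop shape works for Nstmtlist, Ndecllist, and Nactuallist chains, since all three use child[0] for the element and child[1] for the remainder.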
Other important groups of phrase types include the statement phrase types (Nasgn, Nvocation, Nretn, Nif, and Nwhile); the variable phrase types (Nident and Nsubs); and the expression phrase types (which include Null, Nthis, and NFramePtr in addition to all the "unaries" and "binaries" mentioned in the table below).
All of the phrase names used in internal tree nodes are described in the list below.
Node Type | Num. of Children | Description
Nclass | 3 | Represents a class declaration. Child[0] is the Nident node for the class name. Child[1] is either NULL or the Nident node for the class which this definition extends. Child[2] is an Ndecllist node which is the start of the list of member declarations for the class. This list may contain other class declarations, method declarations, and variable declarations. The root of the full program tree will be an Nclass node.
Ndecllist | 2 | List header used to build lists of the three declaration node types: Nvardecl, Nclass, and Nmethod. All three types of declarations may occur in an Ndecllist associated with an Nclass node. Only Nvardecl nodes will be found in the Ndecllist nodes associated with Nmethod nodes (i.e. the parameter lists and local variable lists).
Nvardecl | 2 | Used to represent instance variable declarations and formal parameter declarations. Child[0] is the identifier being declared. Child[1] is either a pointer to an Nident node for the type of the name being declared or a pointer to an Narraydim node if the name is declared to refer to an array. Remember that a special symbol table entry is created during initialization to allow uniform treatment of the type name int.
Narraydim | 2 | Used to represent information about the number of dimensions and the size of the first dimension of array types. These nodes appear both in type specifications found in method and variable declarations and in array constructions. Child[0] will either point to another Narraydim node or to an Nident node for a class name or int. Child[1] may refer to an expression describing the array's size. This field will only be used in Narraydim nodes that a) are a part of Narraydim subtrees found in construction expressions and b) have an Nident node as Child[0].
Nmethod | 4 | Used to represent the definition of a method. Child[0] is an Nident node for the method's name. Child[1] is NULL if the method is declared void. Otherwise, Child[1] will point to an Nident node or Narraydim node describing the method's return type. Child[2] is a (possibly NULL) list of Ndecllist nodes containing Nvardecl nodes for the function's formal parameters. Child[3] is an Nbody node for the function's body.
Nbody | 2 | Used to represent the body of a method. Child[0] is a (possibly NULL) list of Ndecllist nodes that refer to Nvardecl nodes for the method's local variables. Child[1] is a list of Nstmtlist nodes.
Nstmtlist | 2 | Used to represent statement lists. Child[0] will be one of the following six "statement" phrase types or another Nstmtlist node.
Nvocation | 3 | Represents a method invocation (as either a statement or expression). Child[0] will be an Nident node for the method's name. Child[1] will be NULL if the method is being invoked using static scope rules to access its name; otherwise it will refer to a subtree representing an expression that describes the object to which the method should be applied. Child[2] points to a (possibly NULL) list of Nactuallist nodes.
Nupvocation | 3 | Used to represent an invocation of a superclass method through the keyword super. The children of this node are identical to those of an Nvocation node (except that Child[1] will always be NULL).
Nasgn | 2 | Represents an assignment statement. Child[0] will be a node of type Nrefvar pointing to a subtree that describes the target of the assignment. Child[1] will be a node whose type is classified as an expression.
Nretn | 1 | Represents a return statement. If an expression was included in the statement, child[0] points to a sub-tree representing the expression. Otherwise, child[0] is NULL.
Nif | 3 | Represents an if statement. Child[0] points to a sub-tree representing the "boolean" expression. Child[1] points to a list of Nstmtlist nodes that represents the then part. If an else part was included, child[2] points to the list of Nstmtlist nodes representing the else part. Otherwise, child[2] is NULL. Note that the last two children will be list nodes even if only a single statement is included for either the then or else part.
Nwhile | 2 | Represents a while statement. Child[0] points to a tree representing the loop termination condition. Child[1] points to a statement list representing the loop body. For loops are rewritten using Nwhile subtrees by the parser.
Nactuallist | 2 | Used to represent lists of actual parameters. Child[0] will be a node of one of the expression phrase types.
Nrefvar | 1 | Nrefvar nodes are stored using the refvarnode type rather than the internalnode type. They do, however, appear as internal nodes in the tree. They appear as the roots of variable subtrees pointed to by Nasgn nodes and in expression subtrees. In the trees produced by the parser, the baseaddr field of such a node points to either an Nsubs node or an Nident node.
Nsubs | 2 | Represents a variable (or expression) formed by subscripting an array. Child[0] represents an expression that describes the array. Child[1] points to an expression sub-tree for the subscript expression.
Nnew | 1 | Represents an object construction. Child[0] will point to an Nident node for the name of the class describing the object to be created or to an Narraydim node if an array is being constructed.
Ntypecast | 2 | Represents a type cast expression. Child[0] will point to an Nident node for the name of the target class. Child[1] points to an expression sub-tree describing the object to be cast.
Null | 0 | Represents an occurrence of the expression null.
Nthis | 0 | Represents an occurrence of the expression this.
NFramePtr | 0 | Represents a reference to the frame of the current method.
unaries | 1 | The node labels Nnot, Nneg, and Nlength are used to represent expressions formed using the logical not operator (!), the arithmetic negation operator (unary -), and the array length operator. Child[0] points to a sub-tree representing the expression to whose value the operator should be applied.
binaries | 2 | The node labels Nor, Nand, Nlt, Ngt, Neq, Nle, Nge, Nne, Nplus, Nminus, Ntimes, and Ndiv are used to represent expressions formed using the logical, relational, and arithmetic binary operators. The sub-expressions to whose values the operator should be applied are pointed to by child[0] and child[1].
Nerror | 0 | Inserted in the tree at points where an error was detected in the syntax of a phrase. Actually, the only place that such nodes ever appear is as elements of "lists". So, the only place you need to check for them is when processing statement lists, parameter lists, etc.
Now, to complete the discussion of our scheme for representing Woolite programs, we must discuss more details of the types used in the symbol table. As explained in the overview presented above, the symbol table is composed of identifier descriptors, binding descriptors, and declaration descriptors.
Identifier descriptors are actually quite simple. There is one such descriptor for each distinct identifier used in the program (see footnote 3). A C language structure specification for the type identdesc used to store identifier descriptors is shown below.
typedef struct iddesc {
char *name; /* The character string form of the identifier */
struct iddesc *hashlink; /* Link for hash chains used by scanner */
struct scopedBinding *bindStack; /* Head pointer for stack of bindings of */
/* this identifier in currently open scopes. */
} identdesc;
The name field is just a pointer to the characters that form the identifier. The hashlink field is used to maintain lists of identifiers with the same hash value when building the hash table used by the scanner. It will not be of concern to you when doing semantic processing. The bindStack component is to be used as a pointer to the top of the stack of bindings to declarations of the identifier found in scopes that are still open. The scanner initializes this field to NULL.
The relationship between an identifier and a declaration for that identifier is represented by a scopedBinding or methodBinding structure. scopedBindings are used for bindings associated with nested scope rules. methodBindings are used to represent the associations between a method name and its class and the declaration of the method.
The specification for the type scopedBinding is shown below.
/* Structure used to keep track of the bindings between identifiers and */
/* individual declarations that are active within a given scope and the */
/* bindings (if any) that they are masking according to nested scope */
/* rules. */
typedef struct scopedBinding {
decldesc * descr; /* The bound declaration */
int level; /* The nesting level of this binding */
struct scopedBinding * bindStackNext; /* Outer binding hidden by this binding or null */
struct scopedBinding * scopeNext; /* Link for list of all this scope's bindings */
} scopedBinding;
The descr field refers to the declaration associated with the identifier through this binding. The level field records the level of the scope in which this association was made. Note that this level can be different from the level in which the declaration itself occurred in the case of a method inherited from a class declared at a different nesting level. The bindStackNext field is used to maintain a stack of all active declarations of the identifier associated with this binding. The scopeNext field is used to maintain a list of all the bindings made in a given scope. Your code is responsible for creating scopedBinding structures and maintaining these stacks and lists.
Structures of type scopedesc should be used to keep track of all of the currently open scopes and the bindings made within them. The declaration for this type is shown below.
/* Structure used to keep track of open scopes and */
/* names/declaration pairs bound within them. */
typedef struct scopedesc {
struct scopedesc *container; /* Descriptor for surrounding scope (or null) */
scopedBinding * bindingList; /* Header for list of this scope's bindings */
} scopedesc;
The bindingList field should be used as a head pointer for a list of all the bindings made in a given scope. The members of this list will be chained together through the scopeNext links found in scopedBinding structures. The container field will hold a pointer to the descriptor for the surrounding scope (or NULL if this descriptor is for the outermost class).
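The scope bookkeeping described above can be sketched as three small routines. This is only one possible arrangement, not the required implementation. The names open_scope, bind_name, and close_scope are hypothetical, decldesc is reduced to a stand-in holding just the ident back-pointer, and freeing bindings when a scope closes is an assumption that holds only if nothing else keeps pointers to them.

```c
#include <stdlib.h>

struct scopedBinding;

typedef struct iddesc {
    char *name;
    struct iddesc *hashlink;
    struct scopedBinding *bindStack;   /* top of this name's binding stack */
} identdesc;

/* Stand-in for the declaration descriptor union: ident field only. */
typedef struct decldesc_s { identdesc *ident; } decldesc;

typedef struct scopedBinding {
    decldesc *descr;
    int level;
    struct scopedBinding *bindStackNext;
    struct scopedBinding *scopeNext;
} scopedBinding;

typedef struct scopedesc {
    struct scopedesc *container;
    scopedBinding *bindingList;
} scopedesc;

/* Enter a new scope nested inside 'current' (NULL for the outermost). */
scopedesc *open_scope(scopedesc *current)
{
    scopedesc *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->container = current;
    s->bindingList = NULL;
    return s;
}

/* Bind decl's identifier to decl: push on the identifier's stack    */
/* and record the binding in the scope's list.                       */
scopedBinding *bind_name(scopedesc *scope, decldesc *decl, int level)
{
    scopedBinding *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->descr = decl;
    b->level = level;
    b->bindStackNext = decl->ident->bindStack;   /* hide outer binding */
    decl->ident->bindStack = b;
    b->scopeNext = scope->bindingList;
    scope->bindingList = b;
    return b;
}

/* Leave a scope: pop every binding it made, restoring the outer     */
/* bindings, and return the enclosing scope.                         */
scopedesc *close_scope(scopedesc *scope)
{
    scopedesc *outer = scope->container;
    scopedBinding *b = scope->bindingList;
    while (b != NULL) {
        scopedBinding *next = b->scopeNext;
        b->descr->ident->bindStack = b->bindStackNext;
        free(b);
        b = next;
    }
    free(scope);
    return outer;
}
```

Because new bindings are added at the head of the scope list, the list is visited newest first, so each pop removes the binding that is currently on top of its identifier's stack.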
The specification for the type methodBinding is shown below.
/* Structure used to link together METHOD-NAMExCLASS pairs that hash to the */
/* same bucket in the hash table used to resolve qualified references to */
/* method names. */
typedef struct methodBinding {
decldesc * method; /* The descriptor for the method */
decldesc * class; /* The class that is associated with this binding */
/* (must be a subclass of the class that contains */
/* the associated method declaration). */
struct methodBinding *next; /* Next entry off the hash bucket */
} methodBinding;
The method field points to the declaration descriptor associated with a given method name when that name is used to invoke a method on an object of the type described by the declaration descriptor pointed to by class. The next field should be used to implement a hash table to locate such bindings.
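A lookup keyed on a method name and a class might look like the sketch below. Everything beyond the methodBinding layout is an assumption: the table size METHOD_BUCKETS, the pointer-based hash function, the reduced identdesc and decldesc stand-ins, and the helper names add_method_binding and find_method. Hashing on pointers relies on there being exactly one identifier descriptor per distinct identifier.

```c
#include <stddef.h>
#include <stdint.h>

#define METHOD_BUCKETS 64   /* assumed table size */

/* Reduced stand-ins for the real descriptor types (assumptions). */
typedef struct iddesc { char *name; } identdesc;
typedef struct decldesc_s { identdesc *ident; } decldesc;

typedef struct methodBinding {
    decldesc *method;             /* The descriptor for the method */
    decldesc *class;              /* The class associated with this binding */
    struct methodBinding *next;   /* Next entry off the hash bucket */
} methodBinding;

static methodBinding *methodTable[METHOD_BUCKETS];

/* Hash a (method name, class) pair using the two descriptor addresses. */
static size_t method_hash(const identdesc *name, const decldesc *cls)
{
    uintptr_t h = (uintptr_t)name * 31u + (uintptr_t)cls;
    return (size_t)((h >> 4) % METHOD_BUCKETS);
}

/* Insert a filled-in binding at the head of its bucket. */
void add_method_binding(methodBinding *b)
{
    size_t i = method_hash(b->method->ident, b->class);
    b->next = methodTable[i];
    methodTable[i] = b;
}

/* Return the method's declaration descriptor, or NULL if the name   */
/* is not associated with the given class.                           */
decldesc *find_method(const identdesc *name, const decldesc *cls)
{
    methodBinding *b;
    for (b = methodTable[method_hash(name, cls)]; b != NULL; b = b->next)
        if (b->class == cls && b->method->ident == name)
            return b->method;
    return NULL;
}
```

An inherited method would get one binding per class that can receive the call, each pointing at the same method descriptor, which is why the class field may name a subclass of the class containing the declaration.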
Declaration descriptors are more complex than identifier descriptors or bindings. Depending on the type of declaration involved (a class definition, a method definition, a variable declaration, etc.), different structures must be used. Accordingly, as with tree nodes, the type used to describe declaration descriptors is a union type. The C declaration for this union type is shown below.
/* This union describes the type of all declaration descriptors */
typedef union dcldesc {
struct unkdesc unk;
struct methoddesc method;
struct classdesc class;
struct vardesc var;
} decldesc;
While many distinct structure types are used as declaration descriptors, they share several common fields. The declarations of these common fields are grouped in a #define named COMMONFIELDS. This #define is used to include the fields in each of the distinct structure type definitions. As in the syntax tree definitions, all declaration descriptors contain a common type field used to determine the actual format of a member of the union type. The value of this type field will be an element of the enumerated type decltype. Also, a structure type unkdesc is provided to allow one to reference the common fields of a declaration descriptor before the actual type of the descriptor involved can be determined. The declarations of COMMONFIELDS, decltype, and unkdesc are shown below.
/* Enumeration type used to label the various type of declaration */
/* descriptors that can occur in the symbol table. */
typedef enum {
methoddecl, /* method declarations */
instvardecl, /* instance variables */
locvardecl, /* local variables */
formaldecl, /* Formal parameter names */
classdecl, /* class names (also used for int) */
} decltype;
/* All declaration descriptors contain the following components */
/* (although structure component descriptors don't use them all.) */
#define COMMONFIELDS \
decltype type; /* Type of this declaration descriptor */ \
identdesc *ident; /* Pointer to associated ident. descriptor */ \
int line; /* Line number at which declaration occurred. */ \
int deflevel; /* nesting level of this declaration */ \
union dcldesc * memberlink; /* link used to form lists of class */ \
/* methods, vars, and method locals and formal */
/* Generic structure used to access common fields of decl. descriptors. */
struct unkdesc {
COMMONFIELDS
};
The first of the common fields is the type field, which holds an element of the enumeration type decltype. The field ident is used by all declaration descriptors to hold a pointer back to the identifier descriptor for the identifier associated with the declaration. The line component is used to hold the line number on which the declaration occurred. The deflevel field is used to record the nesting level at which this declaration occurs. The last component in COMMONFIELDS is memberlink. During declaration processing, this field is used as the "next" pointer for the various linked lists of declarations that occur within a given class or method.
Within the descriptors for variables and methods, it is necessary to represent information about types, including return types, variable types, and parameter types. Structures of type typedesc are used to do this. The specification of this type is shown below.
typedef struct typedesc {
union dcldesc * base; /* pointer to base type descriptor */
int dimensionality; /* number of dimensions of array or 0 */
} typedesc;
In Woolite, a type is either a class, int, or a (possibly multi-dimensional) array of some class or int. A type descriptor therefore stores a pointer to the declaration of the base type, base, and a count of the number of dimensions, dimensionality. A dimensionality of 0 means the type is not an array at all.
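To make the encoding concrete, the sketch below shows two hypothetical helpers over typedesc. It assumes that two types are identical exactly when their base pointers and dimensionality match (it says nothing about subclass compatibility) and that subscripting an array once yields a type with one fewer dimension. The decldesc stand-in and the names same_type and element_type are illustrative only.

```c
#include <stddef.h>

/* Stand-in for the declaration descriptor union (assumption). */
typedef struct decldesc_s { int dummy; } decldesc;

typedef struct typedesc {
    decldesc *base;       /* pointer to base type descriptor */
    int dimensionality;   /* number of dimensions of array or 0 */
} typedesc;

/* Nonzero when a and b describe exactly the same type. */
int same_type(const typedesc *a, const typedesc *b)
{
    return a->base == b->base && a->dimensionality == b->dimensionality;
}

/* Compute the type produced by subscripting 'array' once.           */
/* Returns 1 and fills *result on success; returns 0 (leaving        */
/* *result untouched) when 'array' is not an array type.             */
int element_type(const typedesc *array, typedesc *result)
{
    if (array->dimensionality <= 0)
        return 0;
    result->base = array->base;
    result->dimensionality = array->dimensionality - 1;
    return 1;
}
```

For example, subscripting a two-dimensional array of int gives a one-dimensional array of int, and subscripting that gives plain int with dimensionality 0.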
All variable and formal parameter declarations are described using the var member of the decldesc union type. The value of the type field in such a declaration descriptor is used to distinguish the kind of variable declared. The possible type values are instvardecl, locvardecl, and formaldecl. The declaration of the type used for these descriptors is shown below.
/* Structure used for instance variable, local variable, and formal */
/* parameter declaration descriptors. */
struct vardesc {
COMMONFIELDS
int varPosition; /* position within containing class or method*/
typedesc * mytype; /* decl. descriptor for the variable's type */
union dcldesc *owner; /* Desc. of class or method containing decl */
};
There are three fields included in the declaration descriptors of variables, formals and structure components. The first is varPosition, which should be set equal to the position of this variable within the list of all similar variables present in its scope. That is, the positions of instance variables, formal parameters, and local variables should be calculated independently from one another. For instance variables, the position should include not just variables declared in the same class but also any variables declared in superclasses. The second field is mytype, which should be set equal to a pointer to a type descriptor for the variable or parameter type. The third is owner, which should point to the declaration descriptor for the class or method in which the name was declared.
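The rule for instance variable positions can be illustrated with a short sketch. It makes several assumptions that this document does not state: positions start at 0, the superclass has already been numbered, and the variables of a class are chained through memberlink in declaration order. The reduced structures varinfo and classinfo and the helper number_instance_vars are hypothetical stand-ins for the var and class descriptors.

```c
#include <stddef.h>

/* Reduced stand-in for a variable descriptor (assumption). */
typedef struct varinfo {
    int varPosition;              /* position within containing class */
    struct varinfo *memberlink;   /* next variable declared in the class */
} varinfo;

/* Reduced stand-in for a class descriptor (assumption). */
typedef struct classinfo {
    struct classinfo *super;      /* superclass, or NULL */
    int instVarCount;             /* count including inherited variables */
    varinfo *vars;                /* variables declared in this class only */
} classinfo;

/* Assign positions to the variables declared in cls, starting just  */
/* past the variables inherited from the superclass, and update the  */
/* class's total instance variable count.                            */
void number_instance_vars(classinfo *cls)
{
    int next = (cls->super != NULL) ? cls->super->instVarCount : 0;
    varinfo *v;
    for (v = cls->vars; v != NULL; v = v->memberlink)
        v->varPosition = next++;
    cls->instVarCount = next;
}
```

Formal parameters and local variables would be numbered by separate counters, since the three kinds of positions are calculated independently.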
Method definitions are described using the method member of the decldesc union. The value in the type field of such a descriptor will be methoddecl. The declaration of the type used for these descriptors is shown below.
/* Structure used for method declaration descriptors. */
struct methoddesc {
COMMONFIELDS
int methodPosition; /* Method's position in method table */
union dcldesc * container; /* Descriptor for class containing method */
int localCount; /* Size of space required for locals */
union dcldesc *locals; /* List of all local variables */
int paramCount; /* Count of parameters method expects */
union dcldesc * formallist; /* head of list of this method's formals. */
typedesc * rtntype; /* Return type (base == NULL if void) */
CODELBL * entrylbl; /* Label placed on first line of method */
char overrides; /* TRUE if method overrides another */
};
The container field refers to the class in which the method was defined. The fields localCount and locals are used to keep track of the count of and declaration descriptors for all local variables defined within the method. Similarly, paramCount and formallist are used to keep track of formal parameter declaration descriptors for the method. The rtntype field should be set to point to a type descriptor for the method's return type or to NULL if the method is declared void. The entrylbl field will be used in later phases. The overrides field is a boolean that should record whether the method overrides a method in some superclass.
Class declarations are represented using the class member of the decldesc union type. The value in the type field of such descriptors should be classdecl. The declaration of the type used for these descriptors is shown below.
/* Structure used for class declaration descriptors. */
struct classdesc {
COMMONFIELDS
union dcldesc *container; /* Descriptor for surrounding class (or null) */
union dcldesc *super; /* Descriptor for superclass (or null) */
int instVarCount; /* Count of number of instance variables */
union dcldesc *vars; /* Header for list of this class' instance variables */
int methodCount; /* Count of number of methods (including inherited) */
union dcldesc *methods; /* Header for list of this class' methods */
/* (not including inherited methods) */
union dcldesc *classes; /* Header for list of nested class decls */
CODELBL * methodtab; /* label for table of method addresses */
char methodsResolved; /* Boolean set after processing method headers */
};
The container field points to the declaration descriptor for the class in which this class declaration was lexically nested (if any). The super field points to the class this class extends (if any). The fields instVarCount and vars are used to keep track of all instance variables declared within the class. Similarly, methodCount and methods are used to keep track of method declarations found within the class. The instVarCount and methodCount fields should count both locally declared names and the variables and methods included from superclasses. The vars and methods lists, however, should only include names that are declared explicitly in the associated class. The classes field is the head pointer for a list of the declaration descriptors of any nested class declarations. The methodtab field will be used by later phases. The methodsResolved field holds a boolean that is set to true as soon as bindings for the class' method headers have been created. It can then be used to enforce the rule against forward references in extends clauses.