Woolite Compiler Implementation Project
Phase 1 : Semantic Processing
Phase 1.1: due 2/21
Phase 1.2: due 2/28

As a first step toward constructing a complete Woolite compiler, I would like you to implement the semantic processing required for Woolite programs. As you know, semantic processing occurs after syntactic processing. To enable you to work on semantic processing before syntactic processing, I will provide you with object code for a syntactic analyzer for Woolite that produces syntax trees in the form described by the accompanying An Intermediate Form for Woolite Programs handout. The details of using this syntactic analyzer are discussed in yet another handout: Working with the Woolite Parser.

Required Processing

As we have discussed in class, during semantic processing:
  1. declaration descriptors are created,
  2. information about each class, variable, and method is stored in its declaration descriptor,
  3. references to identifier descriptors that had been used to represent identifiers in the abstract syntax tree produced by the syntactic analysis phase are augmented with references to the appropriate declaration descriptors, and
  4. checking for context-sensitive errors is performed.

In an effort to help you budget your time, I have broken the tasks your semantic analyzer will need to perform into two sub-phases. While writing the code for the first sub-phase, you may assume that:

Obviously, it would be quite hard to write a useful program that satisfies these assumptions, but such a program will still exercise quite a bit of code in your compiler.

For the second sub-phase you should add the code required to handle all of the features supported by Woolite.

Error checking

While most of our discussion in class has focused on the issue of correctly associating uses of identifiers with declarations of the identifiers, a good bit of your code will be devoted to verifying the semantic consistency of the program. Once you know how to correctly interpret all the identifier names used in a program, such checking is usually straightforward, but can require quite a bit of code.

Your code should print error messages to the standard error output file. Your error messages should be as informative as possible. Each error message should include the number of the line in the source file on which the error occurred. If appropriate, information such as the name of the identifier involved should be included. In addition to printing a message for each error, you should increment the global variable errorcount each time such an error is printed. A declaration for this variable is included in syntree.h. The value of this variable will be used later to decide whether or not to proceed with code generation.
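One way to keep the reporting rules above consistent is to route every message through a single helper. The following is only a sketch: the name semantic_error and the message format are hypothetical, and errorcount is defined locally just so the fragment compiles on its own (in your compiler the declaration comes from syntree.h).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* In the real project errorcount is declared in syntree.h. It is defined
   here only so that this sketch is self-contained. */
int errorcount = 0;

/* Hypothetical helper: print "line N: <message>" to standard error and
   count the error. Accepts printf-style arguments so that identifier
   names can be included in the message. */
void semantic_error(int line, const char *format, ...)
{
    va_list args;

    assert(format != NULL);

    fprintf(stderr, "line %d: ", line);
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    fputc('\n', stderr);

    errorcount++;
}
```

A call might look like semantic_error(lineNumber, "undeclared identifier '%s'", name), where lineNumber and name stand for whatever line and identifier information your tree nodes provide.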

To guide you in error checking, the following hopefully (but not necessarily) complete list of errors to consider is provided:

When errors are detected, be sure to take actions appropriate to let you later avoid printing spurious error messages or, worse yet, crashing. For example, if the name used to specify a variable's type within its declaration is undefined, set the type field in the new variable's descriptor to a known value (NULL?) rather than leaving it uninitialized. Then, in contexts where you would access the variable's type to check the correctness of some other construct, treat the "known value" used as a special case to avoid generating a redundant error message.
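The "known value" idea can be sketched as follows, assuming NULL is the value chosen. The structure and function names (ClassDesc, VarDesc, check_assignable) are hypothetical, and the exact-match comparison is a deliberately simplified stand-in for whatever type compatibility rule Woolite actually requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

int errorcount = 0;   /* declared in syntree.h in the real project */

/* Hypothetical, stripped-down descriptors. */
typedef struct ClassDesc { const char *name; } ClassDesc;
typedef struct VarDesc {
    const char *name;
    ClassDesc  *type;   /* NULL means "type name was undefined" */
} VarDesc;

/* Returns 1 if the assignment is acceptable or cannot be judged because an
   earlier error left a type unknown, and 0 if a new error was reported. */
int check_assignable(const VarDesc *target, const ClassDesc *valueType, int line)
{
    assert(target != NULL);

    /* An error was already reported where the type became unknown,
       so stay silent instead of printing a redundant message. */
    if (target->type == NULL || valueType == NULL)
        return 1;

    /* Simplification: require identical types. */
    if (target->type != valueType) {
        fprintf(stderr, "line %d: cannot assign a %s to variable '%s' of type %s\n",
                line, valueType->name, target->name, target->type->name);
        errorcount++;
        return 0;
    }
    return 1;
}
```

The point of the special case is that one undefined type name produces exactly one message, no matter how many times the affected variable is used later.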

Sequencing the Traversal of the Syntax Tree

The fact that Woolite supports arbitrarily deep nesting of class definitions and allows forward references to declarations in all contexts except extends clauses makes it necessary to make several partial sub-passes over the syntax tree and to think very carefully about what processing to perform during each pass. In fact, you will have to make two passes over the whole program.

During the first pass, your compiler will ignore all method bodies. The goals of this pass are to create a partial declaration descriptor for each class, each instance variable and each method, and to enter bindings for all the class-method pairs in the hash table used to interpret qualified method invocations.
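As an illustration of the class-method table mentioned above, here is a minimal chained hash table keyed on a (class descriptor, method identifier) pair. Everything in it is an assumption made for the sketch: the type names, the table size, the hash function, and the choice to compare keys by pointer identity. Your own descriptors and table will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 211   /* arbitrary prime chosen for this sketch */

/* Hypothetical, stripped-down descriptors. */
typedef struct ClassDesc  { const char *name; } ClassDesc;
typedef struct IdentDesc  { const char *name; } IdentDesc;
typedef struct MethodDesc { int id; } MethodDesc;

typedef struct Entry {
    const ClassDesc *cls;
    const IdentDesc *name;
    MethodDesc      *method;
    struct Entry    *next;    /* next entry in the same bucket */
} Entry;

static Entry *table[TABLE_SIZE];

/* Combine the two descriptor addresses into a bucket index. */
static size_t hash_pair(const ClassDesc *cls, const IdentDesc *name)
{
    uintptr_t h = (uintptr_t)cls * 31u + (uintptr_t)name;
    return (size_t)((h >> 4) % TABLE_SIZE);
}

/* Return the method bound to (cls, name), or NULL if there is none. */
MethodDesc *lookup_method(const ClassDesc *cls, const IdentDesc *name)
{
    for (Entry *e = table[hash_pair(cls, name)]; e != NULL; e = e->next)
        if (e->cls == cls && e->name == name)
            return e->method;
    return NULL;
}

/* Bind (cls, name) to method. Returns 0 without changing the table if the
   pair is already bound, so the caller can decide how to report it. */
int enter_method(const ClassDesc *cls, const IdentDesc *name, MethodDesc *method)
{
    assert(cls != NULL && name != NULL && method != NULL);

    if (lookup_method(cls, name) != NULL)
        return 0;

    Entry *e = malloc(sizeof *e);
    if (e == NULL)
        abort();

    size_t i = hash_pair(cls, name);
    e->cls = cls;
    e->name = name;
    e->method = method;
    e->next = table[i];
    table[i] = e;
    return 1;
}
```

This sketch only finds methods declared directly in the given class. How a lookup should continue into superclasses is left to your design.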

After this first pass is complete, the declaration descriptors you have created for classes should include pointers to any superclasses and contain pointers to lists of all of the instance variable declarations, method declarations and nested class declarations found within the class. The declaration descriptors you create for the methods in the source program will be complete except that they will not contain information about any local variables declared in the method bodies. In particular, the declaration descriptors you create for a method during the first pass must include a head pointer to a list of declaration descriptors for the method's formal parameters. You won't want to place bindings for these parameter names into the scope lists and stacks at this point, but you need the lists of formals so that you can verify correct formal-to-actual parameter type correspondence during the second pass.

During this first pass, you will have to place bindings for class names and instance variables in the lists describing each scope and on the stacks of bindings associated with identifier descriptors so that when you examine the parameter and return types found in method headers you can associate the names used with the correct declarations. Since you will not process method bodies, however, you do not need to place method, parameter, or local variable names "in scope" during this pass.

During the second pass, you will examine all the method bodies you ignored on the first pass. Since the hash table used to associate class-method pairs with the correct declaration descriptors will have been completed during the first pass, you will not need to add bindings for methods to that hash table again on the second pass. You will, however, have to add bindings for all names to the scope lists and stacks (again) during this second pass.

Take advantage of the work you did on the first pass to avoid actually traversing sections of the syntax tree on the second pass when possible. For example, during the first pass, you will have traversed the subtree that describes a method's formal parameters and created a linked list of their declaration descriptors. During the second pass, you need to put bindings to these descriptors on the scope lists and stacks. You could traverse the tree again to find pointers to the descriptors, but there will be a pointer to a linked list of the descriptors in the descriptor you created for the method itself. Use this linked list, rather than the syntax tree, to write a loop that adds the needed bindings to the descriptors.
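The loop described above might look roughly like the following. All of the structure layouts and names here (IdentDesc, Binding, DeclDesc, MethodDesc, Scope, push_binding, enter_formals, close_scope) are hypothetical simplifications, not the definitions from the course handouts. The sketch assumes each identifier descriptor holds a stack of bindings and each scope holds a list of the bindings it introduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DeclDesc DeclDesc;
typedef struct Binding  Binding;

typedef struct IdentDesc {
    const char *name;
    Binding    *bindings;     /* top of this identifier's binding stack */
} IdentDesc;

struct Binding {
    IdentDesc *ident;
    DeclDesc  *decl;
    Binding   *shadowed;      /* binding hidden by this one, if any */
    Binding   *nextInScope;   /* next binding made in the same scope */
};

struct DeclDesc {
    IdentDesc *ident;
    DeclDesc  *next;          /* next formal in the method's list */
};

typedef struct MethodDesc { DeclDesc *formals; } MethodDesc;
typedef struct Scope      { Binding  *bindings; } Scope;

/* Push a binding for decl on its identifier's stack and record it in scope. */
static void push_binding(Scope *scope, DeclDesc *decl)
{
    Binding *b = malloc(sizeof *b);
    if (b == NULL)
        abort();
    b->ident = decl->ident;
    b->decl = decl;
    b->shadowed = decl->ident->bindings;
    decl->ident->bindings = b;
    b->nextInScope = scope->bindings;
    scope->bindings = b;
}

/* Second pass: bring a method's formals into scope by walking the list
   built during the first pass, without revisiting the syntax tree. */
void enter_formals(Scope *scope, const MethodDesc *method)
{
    assert(scope != NULL && method != NULL);
    for (DeclDesc *f = method->formals; f != NULL; f = f->next)
        push_binding(scope, f);
}

/* Leave the scope: pop every binding it introduced, newest first. */
void close_scope(Scope *scope)
{
    assert(scope != NULL);
    while (scope->bindings != NULL) {
        Binding *b = scope->bindings;
        scope->bindings = b->nextInScope;
        b->ident->bindings = b->shadowed;
        free(b);
    }
}
```

Because the scope list is kept newest first, close_scope always removes the top entry of an identifier's stack, so outer bindings that were shadowed become visible again in the right order.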


Computer Science 434
Department of Computer Science
Williams College