Woolite Compiler Implementation Project
Phase 2.3: Code Generation for Methods
Due: April 28, 2002
To make it easy to process Woolite programs with your compiler, I will provide a short shell script named wool (along with lots of other new odds and ends in the shared/434/woolite/phase2.3 sub-directory). This script assumes your executable is named woolite (as it will be unless you have changed the Makefile I provided). The wool script will expect the name of a Woolite source file as input. To make things look right, the source file's name should end with a .w suffix. The script will run the .w file through your compiler and then take what your compiler wrote to standard output and provide it as input to the 34000 assembler. To make it possible to use #include directives in the assembly code you output (I'll explain why you will need this ability later), the wool script will run your compiler's output through the C pre-processor, cpp, before sending it to the assembler. The "final" output of this process will be a tmem file, which will be read as input by the wc34000 interpreter program.
In addition, the script will leave the actual output of your compiler in a file whose name is obtained from the name of the input file by replacing the .w suffix with a .s suffix. Similarly, the output listing produced by the assembler will be stored in a file ending with a .l suffix (this file is actually more useful to read than the .s file because it shows in which word of memory each line of code is stored).
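The file-naming rule can be illustrated with a small C sketch. This is only an illustration of the rule: the wool script is a shell script and need not work this way, and the function name replace_w_suffix is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `src` into `dst`, replacing a trailing ".w" with `new_suffix`
 * (for example ".s" or ".l"). Returns 0 on success, -1 if `src` does not
 * end in ".w" or the result would not fit in `dst_size` bytes.
 * (Hypothetical helper, not part of the provided code.)
 */
int replace_w_suffix(const char *src, const char *new_suffix,
                     char *dst, size_t dst_size)
{
    assert(src != NULL && new_suffix != NULL && dst != NULL);

    size_t len = strlen(src);
    if (len < 2 || strcmp(src + len - 2, ".w") != 0)
        return -1;                      /* not a Woolite source name */

    size_t stem = len - 2;              /* length without ".w" */
    if (stem + strlen(new_suffix) + 1 > dst_size)
        return -1;                      /* result would not fit */

    memcpy(dst, src, stem);             /* copy everything before ".w" */
    strcpy(dst + stem, new_suffix);     /* append the new suffix */
    return 0;
}
```

Given a source file named fact.w, this yields fact.s for the compiler output and fact.l for the assembler listing.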
To enable you to keep your output code separate from error messages and diagnostics, I have written my code so that all output produced by printree and DumpDecldescs is directed to "stderr". In addition, in case you want to keep the output that goes to standard output and standard error together, my routines start each line of output they produce with a ";". This will cause the assembler to treat such lines as comments.
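If you want your own diagnostic output to follow the same convention, something like the following sketch would do. The name format_diag is hypothetical, and the space after the ";" is a cosmetic choice rather than a requirement.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Format one diagnostic line into `dst`, prefixed with ";" so that the
 * assembler treats it as a comment if it ends up mixed into the
 * generated code. No newline is appended.
 * Returns 0 on success, -1 if the text does not fit in `dst_size` bytes.
 * (Hypothetical helper, not part of the provided code.)
 */
int format_diag(char *dst, size_t dst_size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(dst != NULL && fmt != NULL && dst_size >= 3);

    dst[0] = ';';
    dst[1] = ' ';
    va_start(ap, fmt);
    n = vsnprintf(dst + 2, dst_size - 2, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= dst_size - 2)
        return -1;                      /* formatting error or truncation */
    return 0;
}
```

A caller would then write the result with something like fprintf(stderr, "%s\n", buf).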
Execution will begin with the first line of code in the assembler file you produce. When this line is executed, the stack pointer will be set but no register will be pre-loaded with the address of the global variable area. To make it possible for you to load this value, the assembler puts the address of the first unused word in memory (the word after the last instruction in your code) in word 1 of memory. Thus, the instruction
    MOVE 1,freePtr
is probably the first line of code your compiler should output on any input program.
After this instruction, you should generate code to create an object of the main class and to invoke the "main" method of this class. After the JSR to the main method, place a HALT instruction. Then, generate code for all the methods of all the classes defined in the program.
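The start-up sequence described above might be produced by a routine along these lines. This is a sketch under assumptions: the function name format_prologue is hypothetical, the operand form used for JSR is a guess, and the code that creates the main-class object is left as a comment line because it depends on your own object layout.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the start-up sequence into `dst`:
 *   1. load the address of the first unused word (stored in word 1);
 *   2. a placeholder comment where the main object would be created;
 *   3. JSR to the main method, followed by HALT.
 * `main_entry` is the code label of the main method (label format and
 * JSR operand syntax are assumptions).
 * Returns 0 on success, -1 if `dst` is too small.
 */
int format_prologue(char *dst, size_t dst_size, const char *main_entry)
{
    assert(dst != NULL && main_entry != NULL);

    int n = snprintf(dst, dst_size,
                     "MOVE 1,freePtr\n"
                     "; create the main-class object here (omitted)\n"
                     "JSR %s\n"
                     "HALT\n",
                     main_entry);

    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

The code for the methods of each class would then be written after the HALT line.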
To enable you to generate correct code for invocations, I have included fields designed to refer to code labels in the declaration descriptor formats used for methods and classes. These fields are named entrylbl and methodtab. They are intended to hold the "code labels" placed on the first line generated for a method or for the method jump table for a class.
The type of the entrylbl and methodtab fields is determined by the value of the #define CODELBL. If you do not include a #define for CODELBL, it defaults to the type char. However, if you have declared a special type to hold code labels, you can #define CODELBL to be the name of that type. For example, if your type's name was codelabel, you would simply include the define
    #define CODELBL codelabel
in your .c files.
In addition, I have added "code label" as one of the members of the union type used to represent operand descriptors. This will make it easier to implement functions in your low-level code generator that output instructions which access information using code labels. This means, however, that the opdesc.h file now also depends on the #define for CODELBL. As a result, this #define should probably be one of the first lines in your .c files.
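The ordering requirement can be seen in a small self-contained sketch. Only the names CODELBL, codelabel, and entrylbl come from this handout. The layout of codelabel, the #ifndef fallback, the demo_methoddesc structure, and demo_set_entry are all stand-ins invented for illustration, not the actual header contents.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical label type; its layout is an assumption. */
typedef struct codelabel {
    char name[32];
} codelabel;

/* Must appear before any header whose declarations mention CODELBL. */
#define CODELBL codelabel

/*
 * Stand-in for what a provided header might do (an assumption):
 * fall back to char only when CODELBL has not already been defined.
 */
#ifndef CODELBL
#define CODELBL char
#endif

/* Stand-in descriptor showing how the field type follows CODELBL. */
struct demo_methoddesc {
    CODELBL entrylbl;
};

/* Record `text` as the entry label of `m`, truncating if it is too long. */
void demo_set_entry(struct demo_methoddesc *m, const char *text)
{
    assert(m != NULL && text != NULL);
    snprintf(m->entrylbl.name, sizeof m->entrylbl.name, "%s", text);
}
```

If the #define came after the header instead, the fallback would already have taken effect and the field would have the default type rather than your own.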
While I have included the entrylbl and methodtab fields in the declaration descriptor types, it is up to your code to set these fields.
To make your compiler useful, you must provide a standard set of input/output methods. These methods should be named putNum, getNum, outChar, and getChar. They should provide a way to execute the corresponding 34000 instructions from a Woolite program. The methods putNum and outChar will expect one value parameter (of type int). The methods getNum and getChar will take no parameters and return int values. All four of these methods will be associated with a built-in class named IOlib. To use these methods, a programmer will either create a new object of the IOlib type or define one or more classes within a program that extend IOlib. Within any class that extends IOlib, the names putNum, getNum, outChar, and getChar can be used to perform simple input/output operations.
To make it easy for you to add support for this input/output "library" to your compiler, I have done two things: 1) I have included code in the init_symbtab routine which creates declaration descriptors for the four I/O methods and for the IOlib class and then creates a binding to place the name IOlib in scope; and 2) I have provided you with a file of 34000 assembly language code named iolib.h that contains the actual assembly language code for these procedures.
Since you have defined your own types to represent code labels, I could not include code in init_symtab to associate code labels with the entrylbl and methodtab fields found in the declaration descriptors for the IOlib class or the four I/O methods. You will have to include code in your compiler to initialize these fields. You can use the "lookup" function to access the declaration descriptor for the IOlib class.
The iolib.h file is not a C header file. It is a file of assembly language code to be included with the code you generate. Since wool runs the assembly code you produce through the C pre-processor, you can use this file by including the line
#include "iolib.h"
As with the code I gave you in ~tom/shared/434/woolite/phase2/stmtgen.c, you may find that you have to modify the code in my iolib.h file before you can use it. In particular, since your code for handling code labels may insist on sticking a few digits at the end of each label used in your assembly code, you may have to change the names on the first lines of the input/output method code to include such digits. As a result, I will expect all of you to submit copies of the actual version of this file you use. To make sure that this happens, you must include iolib.h in the HDR line of your Makefile.
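To see why the names in iolib.h may need editing, consider a label generator of the following kind. This is a hypothetical sketch, not code from the course files: the names new_label, LABEL_MAX, and next_label_id are invented, and your own label code may number labels differently.

```c
#include <assert.h>
#include <stdio.h>

#define LABEL_MAX 32            /* buffer size for one label (assumed) */

static int next_label_id = 0;   /* shared counter for all generated labels */

/*
 * Build a unique label of the form <stem><number> in `dst`, which must
 * hold LABEL_MAX bytes. Returns 0 on success, -1 if the label is too long.
 * The counter only advances when a label is successfully produced.
 */
int new_label(char *dst, const char *stem)
{
    assert(dst != NULL && stem != NULL);

    int n = snprintf(dst, LABEL_MAX, "%s%d", stem, next_label_id);
    if (n < 0 || n >= LABEL_MAX)
        return -1;

    next_label_id++;
    return 0;
}
```

With a generator like this, the label your compiler records for putNum carries a numeric suffix, so the label on the first line of the corresponding code in iolib.h has to be edited to match whatever name your compiler actually emits.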
I still have not worked out the details required to ensure your compiler will support source level debugging, but I did not want to delay the distribution of this handout any longer. As a result, that topic will be discussed in an addendum to this handout.