Working with the Woolite Parser
The directory ~tom/shared/434/woolite
contains the
code which you will need to use my syntactic analyzer.
Within this directory, you will find sub-directories for each of
the major assigned phases of the project (phase1, phase2, and phase3).
Each of these directories will contain versions of my parser specialized
to the corresponding phase of the project. Within each of the phase
subdirectories, I will store
files named syntree.h, syntax.h, and symtab.h, which contain type
definitions describing the structure of the syntax trees and symbol
table entries produced by the
Woolite syntax analyzer I have provided. Such files are called "header"
files (hence the use of the ".h" suffix).
These directories will also contain
object files (".o" files) containing the executable code for the
parser, scanner, and symbol table routines.
Within the directory ~tom/shared/434/woolite you will also
find a sub-directory named startup.
The most important file in the startup subdirectory is named
Makefile. There is also a short source file named main.c.
You should copy both of these files from
~tom/shared/434/woolite/startup
into the directory in which
you plan to create the files needed to complete the phase. While you may
have the urge to copy other files from ~tom/shared/434/woolite, the files
Makefile and main.c are the only files you should copy.
The Makefile
is intended for use as input to the Unix "make" utility.
In case you are unfamiliar with "make", it is a utility that takes a file
describing how to build an object program from one or more source files
and performs needed compilation steps to build the object program when
invoked. For example, after you have copied Makefile
and main.c
into one of your directories, make that directory your current working
directory and then type
make
This should build an executable named woolite from your main.c
file and the object files provided in ~tom/shared/434/woolite.
The directory ~tom/shared/434/woolite/samples
contains some sample Woolite programs. So, you can test the
parser produced by typing
woolite ~tom/shared/434/woolite/samples/allsyntax.c
In the remainder of this handout, I will attempt to tell you all you
need to know to work with Makefile
, at least for this phase of
the project. In addition, those of you unfamiliar with make
may wish to read the document
Pmake -- A Tutorial
.
Makefile
has been designed to enable you to easily combine your
code with my code through all the phases of the project. The file starts
with about 20 lines describing variables you can (and must) set to
customize the file to your compiler and parameters (targets) you can
specify when invoking make
to alter the way in which it interprets the
contents of Makefile
.
The most important variables at this point are HDR
and SRC
.
You will find their definitions
shortly after the comment lines. The definitions look like:
HDR =
SRC = main.c
SRC
should be a list of the names of the source
files for the compiler kept in your directory. Initially, the main.c
file you copied from ~tom/shared/434/woolite
will be the only
such file. When you create other source files, however, you should add their
names. For example, if you create a file named resolve.c
to hold
the code for this phase, you should edit Makefile
changing the
definition of SRC
to
SRC = main.c resolve.c
The variable HDR
should be set equal to a list of the names of all
the header files kept within your directory. The header files
I have provided for you should not be included here.
These files will
be accessed from my directory because of the setting of the HDRs variable.
Two other variable definitions you will need to change eventually are
those for PHASE
and SUBPHASE
. The PHASE variable determines
which subdirectory of the pub
directory will be used to
access my header files and object files. You will need to change it
as you move on to later phases. The main function of the SUBPHASE
variable is to enable my scripts to place your finished product in the
right container when you submit it for grading. If you forget to change
it before you submit subphase 2 or 3 of a particular phase of the project
you may overwrite your earlier submission.
When you simply want make
to compile a new version of your
compiler, you will invoke it as shown above by simply typing its name.
The make
program can also be used to perform several other useful
functions by invoking it with one of the parameters described below.
Make's operation depends on having, within Makefile
, a collection of
lines specifying how each object file needed to create an executable
version of your compiler depends on the source and header files
you and I construct. Whenever you create a new source file or add a
#include
to a file, this information changes. Typing
make depend
causes make
to read through all of your source files and
edit Makefile
to update the collection of dependency specifications
as appropriate. Do not forget to run this command after making such
changes to your source files.
To simplify the task of keeping current listings of your code, I
have included two definitions for targets named list
and
listall
in Makefile
. If you type
make listall
a listing of all of your files (including Makefile)
will be sent to the laserwriter. If you type
make list
a listing of only those files changed since your last make list
or make listall will be produced.
To provide a means to quickly and reliably distribute information (like
corrections to errors in handouts) to you, I will maintain a file
named PROJECTNEWS
in the ~tom/shared/434/woolite
directory. When you execute the
simple command make
, the system will check to see if this file
has changed since you last read its contents. If it has, it will
inform you that you should read it. To read the file, type
make meread
This displays the file and updates the information that make
uses to tell when you last read the file.
At several points during the semester, I will ask that you submit your source files electronically. To do this, simply type
make submit
In addition to providing a parser and a scanner, I have included three
procedures in the code I have placed in ~tom/shared/434/woolite
that you
may find very helpful when writing code to process the syntax tree and
the symbol table.
The first is the printree
procedure which you may have already
noticed is called from within the main program I have provided.
printree
expects two parameters. The first should be a pointer
to a node in the syntax tree. The routine will output a somewhat
readable display of the subtree to which its first parameter points.
The second parameter is an integer that specifies how much the output
produced by the procedure should be indented. You will generally
specify 1 or 0 for this parameter. The printree
routine,
however, uses other values when it calls itself recursively to ensure
that subtrees are indented in the output.
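The role of the indentation parameter can be illustrated with a small stand-in. This is only a sketch: the node type, its fields, and the name sketch_printree are hypothetical and do not come from syntree.h, and the real printree output format will differ.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a syntax tree node. The real layout is
   defined in syntree.h and is not reproduced here. */
typedef struct demo_node {
    const char *label;
    struct demo_node *child[2];
} demo_node;

/* Print a subtree, indenting each level one step further than its
   parent. Returns the number of nodes printed. */
int sketch_printree(const demo_node *n, int indent)
{
    int count = 1;
    if (n == NULL)
        return 0;                       /* tolerate missing children */
    printf("%*s%s\n", indent * 2, "", n->label);
    for (int i = 0; i < 2; i++)
        count += sketch_printree(n->child[i], indent + 1);
    return count;
}

/* Build a three-node tree and print it starting at indentation 0,
   the way a caller outside the routine normally would. */
int sketch_printree_demo(void)
{
    demo_node left  = { "left child",  { NULL, NULL } };
    demo_node right = { "right child", { NULL, NULL } };
    demo_node root  = { "parent",      { &left, &right } };
    return sketch_printree(&root, 0);
}
```

The outermost call passes 0, and only the recursive calls pass larger values, which is the same division of labor described above for printree.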
printree
may behave quite oddly if asked to print trees that
are in some way damaged. For example, if you set a child pointer to
a value that is not really a pointer you are likely to see your program
crash in printree. I have tried to make printree fairly tolerant of
NULL pointers in unexpected places. So, you may protect yourself
somewhat by setting pointer fields in syntax tree nodes and symbol
table nodes to NULL rather than leaving them uninitialized when no
other value is appropriate.
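One way to follow this advice is to create every node through a single allocation helper that clears the pointer fields. The sketch below assumes a hypothetical node layout (init_node, with left and right fields); the actual node and symbol table structures are the ones declared in the provided header files.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout, for illustration only. */
typedef struct init_node {
    int type;
    struct init_node *left;
    struct init_node *right;
} init_node;

/* Allocate a node and set every pointer field to NULL so that no
   field is ever left holding an uninitialized value.
   Returns NULL if memory cannot be allocated. */
init_node *sketch_newnode(int type)
{
    init_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->type = type;
    n->left = NULL;    /* filled in later only if a child exists */
    n->right = NULL;
    return n;
}
```

With a helper like this, a node whose children are never filled in still holds NULL in those fields rather than whatever happened to be in memory.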
The second routine provided is named visitlist
. Within the
syntax tree there are several list structures (sub-trees built
using Nstmtlist nodes for example). When processing these lists
you will usually want to apply certain operations to all sub-trees
of the list. visitlist
provides a way to do this without
writing the same (fairly simple but boring) loop over and over again.
The function declaration for visitlist
is shown below:
/* Visitlist - Apply the function 'action' to each of the nodes */
/* in some subtree that forms a list. Skipping error nodes */
int visitlist(node *listhead, int (*action)(), int param)
The listhead parameter should be a pointer to a tree node
that is a list header (i.e. of type "N...list"). The action
parameter should be the name of a function
that performs the action you want to have applied to each
element of the list. For example, if you write a routine:
processStmt( node *stmt ) { ... }
which processes one statement, and the variable "elsePart"
held a pointer to the list of statements found in the else part
of an if statement, a call of the form:
visitlist( elsePart, processStmt, ...)
would invoke processStmt on each statement subtree in the
list elsePart.
I have slightly simplified the preceding discussion by not
explaining the third parameter to visitlist, param
. Sometimes,
the routine you want applied to every element of a list requires
some other parameter each time it is called. For example, when
processing statements, the processing routine
needs access to the declaration descriptor of the containing method
so that it can process return statements correctly.
The param
argument provides a way to pass such information
to the routine invoked by visitlist
.
When visitlist
invokes the routine passed as action
, it
actually passes two parameters: a pointer to a sub-tree which is an
element of the list being processed and whatever value was passed
to it as param
.
Thus, in the preceding example, if we wanted to provide a pointer
to the declaration descriptor for the containing method
to processStmt, the declaration of processStmt could be changed to:
processStmt( node *stmt, decldesc *containingMethod ) { ... }
and, assuming that curMethod is a pointer to the declaration
descriptor of the method whose statements are being processed,
the complete call to visitlist would look like:
visitlist( elsePart, processStmt, curMethod )
In cases where there is no need to pass such an extra parameter
to the routine invoked by visitlist
, you can just pass
the value "0" and declare a dummy parameter for your action
routine.
As an extra little feature, I should mention that
visitlist
is smart enough to skip Nerror
nodes
for you.
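The calling pattern described above can be tried out with a small stand-in for visitlist. Everything in this sketch is an assumption made for illustration: the node layout, the kind names, the typed action prototype, and the returned count are hypothetical, and the real routine in the provided object files may represent lists and return values differently.

```c
#include <stddef.h>

/* Hypothetical node kinds and layout, for illustration only. */
enum sketch_kind { SK_LIST, SK_STMT, SK_ERROR };

typedef struct sketch_node {
    enum sketch_kind kind;
    struct sketch_node *item;  /* list cell: the element it holds */
    struct sketch_node *next;  /* list cell: the following cell   */
} sketch_node;

/* Stand-in for visitlist: apply action to each element of the list,
   passing param along, and skip error nodes. Returning the number of
   elements visited is a choice made for this sketch. */
int sketch_visitlist(sketch_node *listhead,
                     int (*action)(sketch_node *, int), int param)
{
    int visited = 0;
    for (sketch_node *cell = listhead; cell != NULL; cell = cell->next) {
        if (cell->item == NULL || cell->item->kind == SK_ERROR)
            continue;
        action(cell->item, param);
        visited++;
    }
    return visited;
}

static int total = 0;

/* Action that needs no extra information: the second parameter is a
   dummy, and the caller passes 0 for it. */
int count_stmt(sketch_node *stmt, int unused)
{
    (void)stmt;
    (void)unused;
    total++;
    return 0;
}

/* Action that uses the extra value handed through by the list walker. */
int add_weight(sketch_node *stmt, int weight)
{
    (void)stmt;
    total += weight;
    return 0;
}

/* Walk a list of two statements and one error node with both actions. */
int sketch_visitlist_demo(int weight)
{
    sketch_node s1  = { SK_STMT,  NULL, NULL };
    sketch_node bad = { SK_ERROR, NULL, NULL };
    sketch_node s2  = { SK_STMT,  NULL, NULL };
    sketch_node c3  = { SK_LIST, &s2,  NULL };
    sketch_node c2  = { SK_LIST, &bad, &c3 };
    sketch_node c1  = { SK_LIST, &s1,  &c2 };

    total = 0;
    sketch_visitlist(&c1, count_stmt, 0);       /* counts 2 statements */
    sketch_visitlist(&c1, add_weight, weight);  /* adds weight twice   */
    return total;
}
```

The error node in the middle of the list is never handed to either action, so neither action needs its own check for it.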
Finally, to let me (and you) know that you have succeeded in
building complete declaration descriptors, your code for
phase 1 should produce a readable "dump" of the symbol table
declaration descriptors it creates. To make this easy,
I have incorporated a routine named DumpDecldescs
in
the .o
files provided to
you. If you call this routine with a pointer to your syntax tree
after resolving all identifiers, it will print out the contents of
each declaration descriptor. Like printree, this routine will
only work correctly if the fields of the syntax tree and symbol table
entries are valid.