Flow graphs and Basic Blocks


The value numbering scheme does not work in situations where there are loops or conditionals:
```
x[i+1] = y;
while ( i < max ) {
   x [ i+1 ] = x[ i+1 ] + z;
   i = i+1;
}
```
In this example, we would assign the same value number to all four instances of i+1, but the assignment to i at the end of the loop body means that the instance of i+1 outside the loop will not always have the same value as those inside.
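To see the problem concretely, here is a small sketch that runs the loop and records the value that i+1 takes each time the subscript is evaluated inside it. The function name traceSubscripts and its parameters are made up for this illustration; they are not part of the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: runs the example loop and records the value of
   the subscript expression i+1 on each iteration.  Returns the number
   of values recorded (the caller supplies room for cap of them). */
size_t traceSubscripts( int i, int max, int seen[], size_t cap ) {
    size_t n = 0;
    while ( i < max ) {
        assert( n < cap );     /* caller must supply enough room */
        seen[n++] = i + 1;     /* value of i+1 in x[i+1] this time around */
        i = i + 1;             /* changes i, so the next i+1 is different */
    }
    return n;
}
```

Starting from i = 0 with max = 3, the recorded values are 1, 2 and 3. Only the first matches the i+1 computed before the loop, so a single value number for all of them would be wrong.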
```
w = 2*x + 1;
if ( x > 0 )
   { z = 2*x }
y = z + 1;
```
In this example, 2*x + 1 and z + 1 might be common subexpressions, but we can't be sure unless we know whether or not x will be positive.
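As a quick check of this claim, the fragment can be wrapped in a function and run for a positive and a non-positive x. The wrapper conditionalExample and the parameter zIn (the value z holds before the fragment runs) are assumptions added for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The fragment from the notes wrapped in a hypothetical function.
   zIn is whatever z held before the fragment; results go to *w and *y. */
void conditionalExample( int x, int zIn, int *w, int *y ) {
    int z = zIn;
    assert( w != NULL && y != NULL );
    *w = 2*x + 1;
    if ( x > 0 )
       { z = 2*x; }            /* only executed on one of the two paths */
    *y = z + 1;                /* equals *w only if the assignment ran */
}
```

With x = 3 both w and y come out as 7, so z + 1 really was a recomputation of 2*x + 1. With x = -1 and zIn = 100, w is -1 but y is 101, so replacing z + 1 by w would be incorrect.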
Examples like these make the notion of "straight-line code" important enough to deserve a name. We will call sequences of straight-line code "basic blocks".
In the real world, before optimizing, a compiler would usually rewrite the program into a form in which the basic blocks and the flow graph are represented explicitly.
- Most optimizers work on an intermediate form that is quite a bit closer to assembly language than our syntax trees. In such code a basic block is simply a sequence of statements beginning with a label that contains no branches (other than subroutine calls) or other labels.
- To preserve control flow information, such compilers build a directed graph whose nodes are basic blocks and whose edges represent possible branches between blocks. This is called a control flow graph.
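
The first step of such a rewrite, splitting a pseudo-assembly instruction sequence into basic blocks, might be sketched as follows. The OpKind type, the markLeaders function and the exact rule used (a block starts at the first instruction, at every label, and right after every branch) are assumptions for illustration; building the edges of the control flow graph is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction kinds for a pseudo-assembly intermediate form. */
typedef enum { OP_PLAIN, OP_LABEL, OP_BRANCH } OpKind;

/* Marks the first instruction of each basic block (its "leader") and
   returns the number of blocks.  Assumed rule: a block starts at the
   first instruction, at every label, and right after every branch. */
size_t markLeaders( const OpKind code[], size_t n, int isLeader[] ) {
    size_t blocks = 0;
    assert( n == 0 || ( code != NULL && isLeader != NULL ) );
    for ( size_t i = 0; i < n; i++ ) {
        /* i == 0 is tested first, so code[i-1] is never read out of range */
        isLeader[i] = ( i == 0 )
                   || ( code[i] == OP_LABEL )
                   || ( code[i-1] == OP_BRANCH );
        if ( isLeader[i] ) blocks++;
    }
    return blocks;
}
```

Each run of instructions from one leader up to (but not including) the next is then one node of the graph.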
Describing a basic block in our intermediate form is a bit trickier since trees aren't quite as linear as pseudo-assembly language.
Luckily, since we are only interested in using basic blocks to identify sequences of straight-line code, we can take a simpler approach.
The approach I want you to imagine depends on two facts:
- Local optimization algorithms have a fairly simple structure:
```
for all basic blocks do
    initialize various data structures
    for each instruction in the block do
          scan the instruction and
          optimize (if possible).
```
- Most of the algorithms you have already implemented (semantic processing, code generation) process syntax tree nodes in such a way that all the nodes that belong to one basic block are processed consecutively.
All we need to do to apply a local optimization algorithm to basic blocks is take the code used to do a "standard" traversal of the syntax tree and figure out where to put the "initialize various data structures" steps.

To make this precise, here are sketches of some pieces of the optimization traversal code:

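```
void optimizeStmt( node * stmt );   /* defined below */
void optimizeIf( node * stmt );     /* defined below */

/* Optimize each statement in a statement list, in order. */
void optimizeStmtList( node * slist ) {
   visitlist( slist, optimizeStmt, 0);
}

/* Dispatch on the kind of statement. */
void optimizeStmt( node * stmt ) {
   switch (stmt->internal.type ) {
   case Nif:
      optimizeIf( stmt );
      break;
   case Nwhile:
   ...
}

void optimizeIf( node * stmt ) {
    optimizeExpr( stmt->internal.child[0] );      /* the condition */
    startNewBlock();                              /* control branches here */
    optimizeStmtList( stmt->internal.child[1] );  /* the then part */
    startNewBlock();                              /* else part begins, or paths rejoin */
    if ( there is an else part ) {
       optimizeStmtList( stmt->internal.child[2]);
       startNewBlock();                           /* paths rejoin after the else part */
    }
}
```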
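
What startNewBlock does depends on the optimization being performed. For value numbering, one possible sketch is to throw away the table of expressions seen so far. Everything here other than the name startNewBlock (the ExprEntry type, the fixed-size exprTable, and the valueNumber function) is an assumption made for illustration, not the course's actual code.

```c
#include <assert.h>

#define MAX_EXPRS 256   /* hypothetical fixed capacity */

/* One remembered expression: an operator applied to two value numbers. */
typedef struct { char op; int left, right; int valnum; } ExprEntry;

static ExprEntry exprTable[MAX_EXPRS];
static int exprCount  = 0;
static int nextValnum = 1;

/* Forget every expression seen so far.  nextValnum is not reset, so
   value numbers handed out in earlier blocks are never reused. */
void startNewBlock( void ) {
    exprCount = 0;
}

/* Return the value number for (op, left, right).  *found is set to 1 if
   the same expression was already seen in the current block, else 0. */
int valueNumber( char op, int left, int right, int *found ) {
    for ( int i = 0; i < exprCount; i++ ) {
        if ( exprTable[i].op == op && exprTable[i].left == left
             && exprTable[i].right == right ) {
            *found = 1;
            return exprTable[i].valnum;
        }
    }
    assert( exprCount < MAX_EXPRS );
    *found = 0;
    exprTable[exprCount] = (ExprEntry){ op, left, right, nextValnum };
    exprCount++;
    return nextValnum++;
}
```

Within one block a repeated expression is found and gets the same value number. After a call to startNewBlock the same expression is treated as new, which is exactly the conservative behavior we want at a branch or join.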

We can also take advantage of the fact that our goal is local optimization by working with a slightly looser definition of basic blocks.
- At a point where control branches we can continue to propagate CSE information down one of the branches (or both if we are willing to save the state of the algorithm when we head down the first branch). We just have to start over again whenever two control paths join.
- This leads to the notion of an "extended basic block".
  - In low-level (assembly-language-like) intermediate forms, an extended basic block is a sequence of statements starting with a label (or the entry point of a procedure) that includes no other labels (but may contain branches, unlike simple basic blocks).
  - Note that an extended basic block will be the union of a sequence of basic blocks.
  - In our trees, extended basic blocks can be formed by leaving out the "startNewBlock" except at points where we know labels may be placed.
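
The idea of saving the state of the algorithm at a branch can be sketched with a table that only grows. The names below (facts, addFact, knows, saveState, restoreState) are hypothetical, and the sketch assumes that nothing is removed from the table between a save and the matching restore. If assignments inside a branch can invalidate earlier entries, a real implementation would need to save more than a count.

```c
#include <assert.h>

#define MAX_FACTS 64            /* hypothetical fixed capacity */

/* Stand-in for the table of available expressions. */
static int facts[MAX_FACTS];
static int factCount = 0;

/* Record that expression f is available. */
void addFact( int f ) {
    assert( factCount < MAX_FACTS );
    facts[factCount++] = f;
}

/* Is expression f currently known to be available? */
int knows( int f ) {
    for ( int i = 0; i < factCount; i++ )
        if ( facts[i] == f ) return 1;
    return 0;
}

/* Remember how much was known at the point where control branches. */
int saveState( void ) { return factCount; }

/* Go back to what was known at the branch point before starting the
   other branch.  Valid only because entries are appended, never removed. */
void restoreState( int mark ) {
    assert( mark >= 0 && mark <= factCount );
    factCount = mark;
}

/* At a join (a point where a label may be placed) forget everything. */
void startNewBlock( void ) { factCount = 0; }
```

In an if statement, a traversal along these lines would call saveState after the condition, process the then part, call restoreState before the else part, and call startNewBlock only where the two paths join. Both branches then see what was known before the if, but neither sees what the other added.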

Computer Science 434
Department of Computer Science
Williams College
