CS 334
Programming Languages
Spring 2002

Lecture 8

Binding Time

Binding time is the time when decisions are made as to the meaning of certain constructs in programming languages. Watch for the binding time of constructs in the following language features.

Scope

Scope of a variable is the range of program instructions over which the variable is known. The scope of a variable may be either static or dynamic.

Static

Most languages use static scoping (e.g., Pascal, Modula-2, C, ...).

Scope is associated with the static text of program.

Can determine scope by looking at the structure of the program, rather than at the execution path that led to a given point.

May have holes in scope of variable

program ...
    var M : integer;
    ....
    procedure A ...
        var M : array [1..10] of real;
        begin
            ...
        end;
begin
    ...
end.
Variable M declared in main program is not visible in procedure A, since new declaration of M "shades" old declaration.

Symbol table keeps track of which declarations are currently visible.

Think of symbol table as stack. When enter a new scope, new declarations are pushed on and can shade old ones. When exit scope, declarations arising from the scope are popped off.
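The stack view of the symbol table can be sketched in Java roughly as follows. This is only an illustration, not how any particular compiler does it: the class ScopedSymbolTable and its method names are hypothetical, and types are stored as plain strings for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a symbol table kept as a stack of scopes.
class ScopedSymbolTable {
    // The innermost scope is at the head of the deque.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    // Entering a scope pushes a fresh, empty set of declarations.
    void enterScope() {
        scopes.push(new HashMap<>());
    }

    // Exiting a scope pops every declaration made inside it.
    void exitScope() {
        scopes.pop();
    }

    // Record a declaration in the innermost scope
    // (assumes enterScope has been called at least once).
    void declare(String name, String type) {
        scopes.peek().put(name, type);
    }

    // Search from the innermost scope outward; the first match wins,
    // so an inner declaration hides an outer one with the same name.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            String type = scope.get(name);
            if (type != null) {
                return type;
            }
        }
        return null; // not declared in any visible scope
    }
}
```

Declaring M as "integer" in an outer scope and as "array [1..10] of real" in an inner one makes lookup("M") return the array type until exitScope is called, after which the integer declaration is visible again.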

Dynamic

Scope is associated with the execution path of program.

In particular an occurrence of an identifier in a procedure may be associated with different variables at different times in the execution of the program.

Example:

program ...
    var A : integer;

    procedure Y(B: integer);
        begin
            ...; 
            B := A + B; 
            ...
        end; {Y}

    procedure Z(...);
        var A: integer;
        begin
            ...; 
            Y(...); 
            ...
        end; {Z}

    begin {main}
        ...; 
        Z(...);
        ...
    end.
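Question: Which variable with name A is used when Y is called from Z? Under dynamic scoping it is the A declared in Z, since that is the most recently entered declaration still active. Under static scoping it would be the A declared in the main program.

With dynamic scoping, the symbol table is built and maintained at run-time.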

Push and pop entries when enter and exit scopes at run-time.

For obvious reasons, dynamic scoping usually implies dynamic typing!

LISP and APL use dynamic scoping (though SCHEME has default of static)
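Java itself is statically scoped, but the run-time symbol table of a dynamically scoped language can be simulated. The sketch below mirrors the Y/Z example above under some assumptions: the class DynamicScopeDemo, the helper enter, and the concrete values (1 for the main program's A, 100 for Z's A, 1 for the argument B) are all made up for illustration, and each scope holds a single declaration to keep the code short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: dynamic scoping simulated with a run-time stack of bindings.
class DynamicScopeDemo {
    // One map of bindings per active scope; the most recently entered is first.
    static final Deque<Map<String, Integer>> env = new ArrayDeque<>();

    // Enter a scope that declares a single variable.
    static void enter(String name, int value) {
        Map<String, Integer> frame = new HashMap<>();
        frame.put(name, value);
        env.push(frame);
    }

    // Dynamic lookup: search the most recently entered scope first.
    static int lookup(String name) {
        for (Map<String, Integer> frame : env) {
            Integer value = frame.get(name);
            if (value != null) {
                return value;
            }
        }
        throw new IllegalStateException("unbound variable: " + name);
    }

    // Procedure Y: computes A + B, where A is not declared locally.
    static int y(int b) {
        enter("B", b);
        try {
            return lookup("A") + lookup("B");
        } finally {
            env.pop(); // leave Y's scope
        }
    }

    // Procedure Z: declares its own A, then calls Y.
    static int z() {
        enter("A", 100);
        try {
            return y(1);
        } finally {
            env.pop(); // leave Z's scope
        }
    }

    // Main program: declares its own A, then calls Y directly and through Z.
    static int[] run() {
        env.clear();
        enter("A", 1);
        int direct = y(1); // Y finds the main program's A
        int viaZ = z();    // Y finds Z's A
        env.pop();
        return new int[] { direct, viaZ };
    }
}
```

With these values run() yields 2 for the direct call and 101 for the call through Z. Under static scoping both calls would use the main program's A and give 2.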

Lifetime

In FORTRAN, variables are allocated statically. All variables are allocated storage before execution of the program begins. As a consequence, on return to a procedure, local variables still have the values left at the end of the previous invocation. Not so in Pascal, C, or Java.

In Pascal, C, or Java, when enter procedure any local variables are allocated and are then deallocated when exit. (Called dynamic allocation of variables).

In block-structured language (e.g., Pascal, C, Modula-2, Java, etc.):

Use run-time stack to allocate space for local variables and parameters when enter a new unit (procedure, functions, etc.). Space is called an activation record.

Pop it off run-time stack when exit unit.

Note that a procedure may have several activation records on stack if called recursively.

Even without recursion may have several distinct variables on stack with same name!
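A small Java sketch can make the point about multiple activation records visible. The class ActivationDemo and its methods are hypothetical. Each recursive call gets its own copies of n and local, and those copies are untouched by the deeper calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every call gets its own activation record.
class ActivationDemo {
    static void countdown(int n, List<String> log) {
        int local = n * 10;        // stored in this call's activation record only
        if (n > 0) {
            countdown(n - 1, log); // pushes a new activation record
        }
        // The deeper calls have returned and been popped;
        // this activation's n and local still hold their own values.
        log.add("n=" + n + " local=" + local);
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        countdown(2, log);
        return log;
    }
}
```

Calling run() produces the entries "n=0 local=0", "n=1 local=10", "n=2 local=20" in that order: three activation records for the same procedure were on the stack at once, each with a distinct variable named local.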

When pointers are used, utilize another kind of memory, called "heap".

When do "new(p)" operation where p is of type pointer to T, sufficient memory is allocated from the heap to hold a value of type T and p is assigned the location of that memory. The value is accessed by writing p^ in Pascal or *p in C. In Java, when a variable refers to an object, it actually holds a pointer to the value, with no explicit dereferencing needed to get access to the object.

This memory does not follow the stack discipline. The lifetime of the heap-allocated memory is determined manually by "new" and "dispose" commands (or malloc and free in C). Entering or exiting a scope has no impact on the allocation or deallocation of this memory.

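Therefore in Pascal, C, and Java, for example, there are three kinds of memory:

  1. Static memory, allocated before execution begins (e.g., for global variables).

  2. Stack memory, holding the activation records for local variables and parameters.

  3. Heap memory, whose allocation is independent of entering and exiting scopes.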

In ML, everything comes off of the heap, but memory is automatically allocated when needed and deallocated (by the garbage collector) when there is no longer any way of accessing it. Java handles objects similarly.

More implementation details later.

Value

Usually think of value as bound at execution time, but can vary. If bound at language definition time (e.g., maxint, true, false), then called language defined constant.

If freeze at compilation, then program constant

    const size = 100;
          doubleSize = 2 * size;  {called manifest constant}
Static final constants in Java that are initialized with constant expressions are frozen at compilation time.

In Java may have values of identifiers frozen at declaration time. For example in the body of a method:

    void m (int n ) {
        final int x = 3 * n - 2;   // value bound & frozen on procedure entry
        ...
    }

Postpone discussion of binding time for types until later.

Aliases

Two expressions are said to be aliases if they denote the same location.

This can occur especially easily when using var parameters in Pascal or reference parameters in C++. It is the default when making assignments of objects in Java.

Suppose class C has method inc that increases the value of an instance variable b by 1. Suppose we have

    C x = new C();
    C y;
    y = x;
    y.inc();
    x.inc();
    
After the execution of this code, the value of instance variable b in x has been increased by 2, rather than by 1 as it might otherwise appear.

When operations on one expression result in changes to the value of an expression not mentioned, then we say the changes are side effects. Sometimes this is desired, while at others it can be the source of great confusion when things go wrong.

Obviously, it is important that programmers be aware of this. I have seen problems occur with a method m(x,y) where something different is done to each of x and y, and then a user accidentally writes m(z,z) without realizing that z will be changed in both ways!
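The m(z,z) hazard can be made concrete with a short Java sketch. The class Counter (standing in for class C above), the class AliasDemo, and the body of m are assumptions chosen only for illustration.

```java
// Hypothetical stand-in for class C: inc increases b by 1.
class Counter {
    int b = 0;

    void inc() {
        b = b + 1;
    }
}

class AliasDemo {
    // Intended to bump x once and y twice.
    static void m(Counter x, Counter y) {
        x.inc();
        y.inc();
        y.inc();
    }
}
```

With two distinct objects p and q, the call m(p, q) leaves p.b at 1 and q.b at 2. With a single object z, the call m(z, z) makes x and y aliases, so z.b ends up at 3, a value neither parameter would reach on its own.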

Because all assignments of objects in Java result in aliases, Java provides every object with a clone method that, when overridden appropriately, can provide a disjoint copy of an object. This is often extremely useful when an object is inserted into a data structure to prevent other references from changing features -- e.g. priorities in a priority queue -- after the object has been inserted.
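The priority-queue scenario might look like the following sketch. The class Task, its priority field, and CloneDemo are hypothetical. The clone override relies on the field-by-field copy made by Object.clone, which is sufficient here only because the sole field is an int.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical element type with a mutable priority.
class Task implements Cloneable {
    int priority;

    Task(int priority) {
        this.priority = priority;
    }

    @Override
    public Task clone() {
        try {
            // Field-by-field copy; adequate because priority is a primitive.
            return (Task) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Task is Cloneable
        }
    }
}

class CloneDemo {
    static int storedPriorityAfterMutation() {
        PriorityQueue<Task> queue =
            new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));
        Task original = new Task(5);
        queue.add(original.clone()); // insert a disjoint copy, not an alias
        original.priority = 99;      // later change made by the caller
        return queue.peek().priority; // the stored copy is unaffected
    }
}
```

Here storedPriorityAfterMutation() returns 5. Had the queue been given original itself, the stored element would now report 99 and the queue's internal ordering could silently become invalid.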

The ultimate in bad manners: Pointers

Recognized as major cause of run-time errors. (This is one reason why there are no explicit pointers used in Java.)

"Pointers have been lumped with the goto statement as a marvelous way to create impossible to understand programs."

Kernighan & Ritchie, The C Programming Language

(In fairness, they then go on to defend the use of pointers.)

Problems:

  1. Dangling pointers:

    1. If pointers can point to object on run-time stack (named variable - PL/I, C), then object may go away before pointer.

    2. User may explicitly deallocate the object a pointer refers to even if other variables still point to the same object. Possible solutions involve reference counting or garbage collection.

  2. Dereferencing uninitialized or nil pointers may cause crashes.

  3. Garbage: Unreachable items may clog heap memory & can't recycle. Garbage collection or reference counting may solve.

  4. Holes in typing system may allow arbitrary integers to be used as pointers (e.g., in C, or through variant records in Pascal).

Notice that Java avoids many of these problems by not allowing explicit access to pointers, not allowing the user to create a pointer to a location on the stack (e.g., a local variable that is an int), and by using garbage collection. This makes it much more secure than C or C++!

Pointer arithmetic is the norm in C.

TYPES

Support abstractions of set of elements and operations on them. Both are essential: a type without operations is pretty useless!

Built-in types:

  1. Increase readability

  2. Hide representation

  3. Allow type-checking at compile and/or run-time for earlier detection of errors.

  4. Help disambiguate operators

  5. Allow expression of constraints on accuracy of representation.

  6. Help ensure different components in separately compiled units will interoperate properly.

Aggregates

Also come with built-in operations.

Cartesian products:

$S \times T = \{\langle s,t \rangle \mid s \in S,\ t \in T\}$.

Can also write as $\prod_{i \in I} S_i = S_1 \times S_2 \times \cdots \times S_n$. If all are the same, write $S^n$.

Tuples of ML: type point = int * int

What if have $S^0$? It has exactly one element, the empty tuple. Called unit in ML.

Records (COBOL, Pascal, Ada) or Structures (PL/I, C, and ALGOL 68).

Heterogeneous collections of data.

Differ from Cartesian product since fields associated with labels

E.g.

    record                   record
       x : integer;    /=       a : integer;
       y : real                 b : real
    end;                     end

Operations and relations: selection ".", =, ==.

Can use generalized product notation: $\prod_{l \in Lab} T(l)$

Ex. in first example above, Lab = {x,y}, T(x) = integer, T(y) = real.

Records and structs are not included in Java. We can think of them as exceptionally uninteresting objects that simply have setters and getters for all of their instance variables.

Disjoint Union:

Variant record - type1 union type2 w/discriminant

Support alternatives w/in type:

Ex.

        RECORD
           name : string;
           CASE status : (student, faculty) OF
              student: gpa : real;
                       class : INTEGER;
           |  faculty: rank : (Assis, Assoc, Prof);
           END;
        END;

Save space yet (hopefully) provide type security. Saves space because the amount of space reserved for a variable of this type is that of the largest of the variants.

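Type security fails in Pascal / MODULA-2 since variants are not protected: the tag can be changed, or a field of the wrong variant accessed, without any check.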

How is this supported in ML?

datatype IntReal = INTEGER of int | REAL of real;
Can think of enumerated types in Pascal, C, or Ada as variant w/ only tags!

NOTICE: Type safe. Clu and Ada also support type-safe case for variants:

Ada: Variants - declared as parameterized records:

type geometric (Kind: (Triangle, Square) := Square) is
    record
       color : ColorType := Red ;
       case Kind of
          when Triangle =>
                 pt1,pt2,pt3:Point;
          when Square =>
                 upperleft : Point;
                 length : INTEGER range 1..100;
       end case;
    end record;

ob1 : geometric;           -- default is Square
ob2 : geometric(Triangle); -- frozen, can't be changed
Avoids Pascal's problems w/holes in typing.

Illegal to change "discriminant" alone.

ob1 := ob2;   -- OK
ob2 := ob1;   -- generate run-time check to ensure Triangle
If want to change discriminant, must assign values to all components of record:
ob1 := (Color=>Red,Kind=>Triangle,pt1=>a,pt2=>b,pt3=>c);

If write code

    ... ob1.length...
then converted to run-time check:
    if ob1.Kind = Square then ... ob1.length ....
                         else raise constraint_error
    end if;

Fixes type insecurity of Pascal
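The same discipline (a discriminant fixed at creation, plus a check on every variant-specific access) can be imitated by hand in Java. The sketch below is a simplified, hypothetical analogue of the geometric type: it keeps only the length field, omits the points and color, and uses IllegalStateException where Ada would raise constraint_error.

```java
// Hypothetical, simplified analogue of the Ada "geometric" variant record.
class Geometric {
    enum Kind { TRIANGLE, SQUARE }

    private final Kind kind;  // discriminant, fixed at construction
    private final int length; // meaningful only when kind == SQUARE

    private Geometric(Kind kind, int length) {
        this.kind = kind;
        this.length = length;
    }

    // Factory methods set the discriminant and the matching fields together,
    // so the tag can never be changed on its own.
    static Geometric square(int length) {
        return new Geometric(Kind.SQUARE, length);
    }

    static Geometric triangle() {
        return new Geometric(Kind.TRIANGLE, 0);
    }

    Kind kind() {
        return kind;
    }

    // Run-time check on the discriminant before exposing a variant field.
    int length() {
        if (kind != Kind.SQUARE) {
            throw new IllegalStateException("length is only defined for SQUARE");
        }
        return length;
    }
}
```

Geometric.square(10).length() returns 10, while Geometric.triangle().length() throws instead of exposing a meaningless value. In Ada the compiler generates this check. Here the programmer has to write it.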

Note disjoint union is not same as set-theoretic union, since have tags.

    IntReal = {INTEGER} x int + {REAL} x real

C supports undiscriminated unions:

    typedef union {int i; float r;} utype;
As usual with C, it is presumed that the programmer knows what he/she is doing and no static or run-time checking is performed.

Union types (discriminated or not) are not supported in pure object-oriented languages. Subtypes can often be used to play the same role.

E.g., in Java, can have an interface with many classes implementing the interface. Variables with the interface type can hold values from any of these classes. No type insecurities are possible because can only access those features listed in the interface, and hence contained in all of the classes.
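A minimal sketch of this use of subtypes follows. The interface Shape and the classes Circle, Rect, and ShapeDemo are hypothetical names introduced only for illustration.

```java
// Hypothetical interface playing the role of the union type.
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    public double area() {
        return Math.PI * radius * radius;
    }
}

class Rect implements Shape {
    private final double width;
    private final double height;

    Rect(double width, double height) {
        this.width = width;
        this.height = height;
    }

    public double area() {
        return width * height;
    }
}

class ShapeDemo {
    // Works for any mix of implementing classes; only the features
    // listed in the interface are accessible through a Shape variable.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }
}
```

A variable of type Shape may hold a Circle or a Rect, much like a variant with two alternatives, but there is no tag to inspect and no variant-specific field that could be read through the wrong alternative.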

