CS 334
Programming Languages
Spring 2002

Lecture 15

More ADT's

Can also have package where user can manipulate objects of type stack (external representation) like in Modula-2.

Advantage to external rep. is that can define array of stack, etc.

generic
	length : Natural := 100;
	type element is private; -- generic parameters
package stack is
	type stack is private;
	procedure make_empty (S : out stack);
	procedure push (S : in out stack; X : in element);
	procedure pop (S : in out stack; X: out element);
	function empty (S : stack)  return boolean;
	stack_error : exception;

private		
	type stack is record
		space	: array(1..length) of element;
		top	: integer range 0..length := 0;
	end record;
end stack;

package body stack is

	procedure make_empty (S : out stack);
	begin
		S.top := 0;
	end make_empty ;

	procedure push (S : in out stack; X : element) is
	begin
		if full(S)  then 
			raise stack_error;
		else
			S.top := S.top + 1;
			S.space(S.top) := X;
		end if;
	end push;

	procedure pop (S : in out stack; X: out element) is
	begin
		if empty(S) then 
			raise stack_error;
		else
			X := S.space(S.top);
			S.top := S.top - 1;
		end if;
	end pop;

	function empty(S : in out stack) return boolean is
	begin
		return (top = 0);
	end;

	function full(S : in out stack) return boolean is
	begin
		return (S.top = length);
	end;

	end stack;

Biggest problem is exposure of representation of ADT in private part of package spec.

Not available to user, but must recompile if change representation.

Important difference between the two definitions is that in the first, the ADT is accessible only via operations which belong to the object. Can think of the Stack as an "agent". In the second, there is a data type called stack, and the operations act upon the passive data. In the first one must ask the ADT nicely to perform an operation that it knows how to do. In the second, the operation is imposed on the passive data.

Modula-2

Modula (modular language) designed by Wirth in 1975 for programming small real-time control systems (no files, sets, or pointers)

Modula 2 is 1980 revision (influenced by Mesa at Xerox PARC) intended to synthesize systems programming with general purpose language supporting modern software engineering.

(operating system for Lilith microcomputer written in Modula 2)

Minor changes to Pascal which simplify programming (reliability) and improve program readability and efficiency.
Upward extension of Pascal to support large system design projects. (Modules which can be separately compiled and yet are type checked.)
Downward extension to allow machine-level programming and supports coroutines.

Sample program

DEFINITION MODULE stackMod;
IMPORT element FROM elementMod;
	TYPE stack;
	PROCEDURE make_empty (VAR S : stack);
	PROCEDURE push (VAR S : stack; X : element);
	PROCEDURE pop (VAR S : stack; X: element);
	PROCEDURE empty (S : stack): BOOLEAN;

END stackMod.

IMPLEMENTATION MODULE stackMod;
	TYPE stack = POINTER TO RECORD
		space	: array[1..length] of element;
		top	: INTEGER;
	END;

	PROCEDURE make_empty (VAR S : stack);
	BEGIN
		S^.top := 0;
	END make_empty ;

	...
END stackMod;

Opaque types (those declared without definition in Definition module) must be pointers or take no more space than pointers.

Compare and contrast Modula-2 and Ada on supporting abstract data types.

Representations fairly similar - both can import from other units (modules or packages) and export items to other units.
For external representations not much difference.
Private types in Ada vs opaque types in Modula.
Opaque types require Pointer (or other data type of same length or less)
Use of opaque types does not force recompilation of user programs if representation changes - does force recompilation in Ada.
Internal representations of ADT's almost identical.
Ada more flexible via generic routines - can parameterize over types as well as sizes of objects. Can also use this to create new instances of packages - especially helpful for internal representations of data types.

Clu (1974)

Cluster is basis of support for abstract data types.

Provides both packaging and hiding of representation

(cvt used to go back and forth btn external abstract type and internal representation).

May have parameterized clusters where specify needed properties of type paramenter. E.g.,

	sorted_bag = cluster [t : type] is create, insert, ...
			where t has
				lt, equal : proctype (t,t) returns (bool);

Abstraction facilities described in Liskov et al. paper in collection on reserve shelf.

Biggest difference from Ada and Modula-2 is that cluster is a type. Therefore can create multiple copies. Elements of cluster types are held as implicit references.

ML

Abstypes in ML

ADT supported in very straightforward way in ML as abstypes. (We won't cover this material in class, as it is deprecated, but I leave the lecture notes in for your information.)

Provides for encapsulation and information hiding

Example:

abstype intstack = mkstack of (int list)
  with exception stackUnderflow
    val emptyStk = mkstack []
    fun push (e:int) (mkstack(s):intstack) = mkstack(e::s)
    fun pop (mkstack([])) = raise stackUnderflow
      | pop (mkstack(e::s)) = mkstack(s)
    fun top (mkstack([])) = raise stackUnderflow
      | top (mkstack(e::s)) = e
    fun IsEmpty (mkstack([])) = true
      | IsEmpty (mkstack(e::s)) = false
    end;

Generic stacks ADT:

abstype 'a stack = mkstack of ('a list)
  with exception stackUnderflow
    val emptyStk : 'a stack = mkstack []
    fun push (e:'a) (mkstack(s):'a stack) = mkstack(e::s)
    fun pop (mkstack([]):'a stack) = raise stackUnderflow
      | pop (mkstack(e::s):'a stack) = mkstack(s):'a stack
    fun top (mkstack([]):'a stack) = raise stackUnderflow
      | top (mkstack(e::s):'a stack) = e
    fun IsEmpty (mkstack([]):'a stack) = true
      | IsEmpty (mkstack(e::s):'a stack) = false
   end;

Cannot get at representation of stack

Reference to mkstack(l) will generate an error message.

Only access through constants and op's.

Modules in ML

More sophisticated support through modules, which also support separation between interfaces (called signatures in ML) and implementations, called structures.

Signatures collect together declarations describing types and values to be publicly available. A signature corresponding to the same ADT as represented in the abstype above is given by:

    signature INTSTACKSIG =
    sig
      type intstack;
      exception stackUnderflow;
      val emptyStk: intstack;
      val push: int -> intstack -> intstack;
      val pop: intstack -> intstack;
      val top: intstack -> int;
      val IsEmpty: intstack -> bool;
    end;

This signature introduces a type, an exception, and 5 functions. Like Ada and Modula-2, they hide the representation of all of the components (though, like those languages, they can reveal types if desired), providing only the types for values.

Structures are collections of types, functions, exceptions, and values (among others) that are encapsulated (grouped together). Here is a structure implementing signature INTSTACKSIG.

    structure IntStack: INTSTACKSIG =
    struct
      type intstack = int list;

      exception stackUnderflow;

      val emptyStk = [];

      fun push (e:int) (s:intstack) = (e::s);

      fun pop [] = raise stackUnderflow
	| pop (e::s) = s;

      fun top [] = raise stackUnderflow
	| top (e::s) = e;

      fun IsEmpty [] = true
	| IsEmpty (e::s) = false;

      fun extra ...
    end;

Because the signature does not mention function "extra", it will not be visible to anyone using IntStack. See our parser and lexers for PCF, for other examples where only a few of the components are exported. If the signature was left out of the declaration, ML will infer a signature that is printed when the structure is compiled.

Users can get access to components of structures in two ways. One can either qualify their names with the structure name, e.g.

  IntStack.push 12 IntStack.emptyStk;

or by "opening" the structure in a context:

  open IntStack;
  push 12 emptyStk;

The latter form is mainly useful if you want to save the effort of writing the prefixed structure name and there is no danger of ambiguity.

By attaching the signature with ":", we have declared what ML describes as a "transparent ascription" of IntStack to INTSTACKSIG. What this means is that in matching the type intstack's definition to the declaration of intstack in the signature, the details of intstack "leak" out into the environment. That is, if we open IntStack, we get the following response:

    - open IntStack;
    opening IntStack
      type intstack = int list
      exception stackUnderflow
      val emptyStk : intstack
      val push : int -> intstack -> intstack
      val pop : intstack -> intstack
      val top : intstack -> int
      val IsEmpty : intstack -> bool
    - push 5 [];
    val it = [5] : intstack

Thus, intstack is known to be the same as int list.

If that leakage of information is what is desired, then this is fine. But more likely we want to hide that information. We can do that with an "opaque ascription", which is written as follows:

structure IntStack:> INTSTACKSIG =
struct ...

Of course, if there are no type definitions in the module then it doesn't matter which version of ascription is used.

One can further restrict a structure, by giving it a new name and signature:

    structure ResIntStack: RESSTACKSIG = IntStack;

    structure ResIntStack:> RESSTACKSIG = IntStack;

This has no impact on the values defined, but further restricts the visibility so that only components declared in RESSTACKSIG are visible.

It is also worth noting that structures can be nested in ML, so that an internal structure can be used to help in defining another structure, yet it is not visible at all outside of the outer module.

ML goes farther by also introducing functors, which are structures that can take other structures as arguments.

Suppose we start with

    signature EQ =
      sig
	type t
	val eq : t * t -> bool
      end;

a signature with a type t and eq function defined on pairs of elements from t. Suppose we wish to build a functor that takes a structure like this and builds another structure which compares pairs for equality:

functor PairEQ(P : EQ) : EQ = struct
  type t = P.t * P.t
  fun eq((x,y),(u,v)) = P.eq(x,u) andalso P.eq(y,v)
end;

Then we can define:

    structure IntEQ : EQ = struct
      type t = int
      val eq : t*t->bool = op =
    end;
    structure IntPairEQ : EQ = PairEQ(IntEQ);

which creates structure IntPairEQ as a new EQ structure composed of pairs of integers.

There is a lot more we could talk about with modules in ML, but we will stop there.

In each of the languages discussed here, we have obtained the following key features of ADT's:

Encapsulation of all features of an ADT in one place
Information hiding so that a module has control over what it exports as well as over what it exports.

In all but Clu the interface and implementation modules could be written separately and compiled separately (that is not quite true of ML, but pretty close). Ada, Clu, and ML also provided ways of creating parameterized modules.

Subtypes

Lecture material on subtypes can be found in Chapter 5 of my new book, Foundations of Object-Oriented Languages: Types and Semantics, by Kim Bruce (Š2002 by MIT Press). The relevant material for this lecture is included in section 5.1. Other material will be relevant when we get to object-oriented languages, next. I will not bother to put detailed lecture notes on-line because the chapter covers the same material (and examples!).

Object-oriented programming languages

Roots in languages supporting ADT's

Biggest loss in moving from FORTRAN to Pascal is lack of support for modules with persistent local data.

Clu, Ada, and Modula 2 attempted to remedy this by adding clusters, packages, and modules.

In Ada & Modula 2, objects (i.e. packages, and modules) were late additions to an earlier paradigm (Pascal-like)

couldn't be instantiated dynamically
had no type or other method for organizing, despite similarity to records.
nonetheless provide better modules for building large systems

Called object-based languages.

Goals:

Stepping back a bit further - support development of high-quality software.

Qualities Desired in Software

Correctness: Should be able to guarantee correctness of code on legal input
Robustness: Should work in unusual cases, handle errors
Extensibility: Should be able to add features
Reusability: Should be able to use code in different circumstances
Compatibility: Should be able to combine modules

ADT languages provide reasonable support for all but extensibility (in particular if want minor extensions - but rest is same), some limitations on reuseability.

Object-oriented languages are an attempt to make progress toward these goals.

A programming language is object-oriented if:

It supports objects that are data abstractions (like Modula 2, Ada).
All objects have an associated object type (often called classes). (Bit different from Modula 2, Ada).
Classes may inherit attributes from superclasses. (Very different from Modula 2, Ada)
Computations proceed by sending messages to objects.
Routines may be applied to objects which are variants of those they are designed to be applied to (subtype polymorphism).
Support dynamic method invocation (will be explained later )

Simula 67 first object-oriented language - designed for discrete simulations.

Up until recently, Smalltalk best-known - Alan Kay at Xerox (now at Apple).

Gone through Smalltalk-72,-74,-76,-78,-80.

C++, object-oriented extensions to Pascal, C, LISP, etc.

One of nicest is Eiffel - discuss later (See Meyer's Object-Oriented Software Construction). Also Sather (public-domain variant of Eiffel). Of course Java is now becoming the most popular (and one of best).

Main idea of object-oriented programming:

Independent objects cooperate to perform a computation by sending messages (requests for action) to each other.

Object-oriented programming:

Object-oriented languages built around following concepts:

Object: like "internal representation of abstract data types - all data encapsulated in objects - first class!
Message: request for object to carry out one of its operations.
Class: template for set of objects - similar to type
Instance: object described by a class
Method: operation associated with an object - specified in its class.
Subtype: A subtype of B (A <: B) if A represents a specialization of B (e.g., cars <: vehicles, ColorPoint <: Point) - An element of a subtype can be used in any context in which an element of the supertype would work.
Subclass: An incremental modification or extension of a class and its methods. Methods not changed are inherited.

In more detail:

Objects are internal data abstractions - only accessible to outer world through associated procedures

hide representation from outer world
have an associated state (current contents of internal variables)
its methods have automatic access to its state, therefore no explicit parameter needed for self.

Object types

allow objects (modules) to be first-class citizens
allow use in assignment, parameters, components of structures
allow objects to be classified via subtyping.

Classes

templates for the creation of objects
contain definitions of all methods associated with these objects
can modify or extend by creating subclasses which inherits old methods

Most current OOL's identify object types and classes (in particular subtypes and subclasses).

See later this can lead to holes in typing system and/or restrictions in expressibility.

In typical object-oriented programming language, everything is an object.

Abstraction preserved since no object can make changes to the internal state of another object - though some languages don't enforce this - just send messages using methods in the public interface.

We will first investigate Java and Eiffel, as examples of object-oriented programming languages and then come back and discuss object-oriented languages in general, especially issues in type systems for object-oriented languages.

Back to:

CS 334 home page

Kim Bruce's home page

CS Department home page