CS 334

Click here to get an example of how recursion works in the environment-based interpreter. It will make the most sense if you have a copy of the evaluation rules in front of you when you read it.
Assignment compatibility:
var x : hex; y : ounces;
Is x := y legal?
Original report said both sides must have identical types.
When are types identical?
Ex.:
    Type T = Array [1..10] of Integer;
    Var A, B : Array [1..10] of Integer;
        C : Array [1..10] of Integer;
        D : T;
        E : T;
Which variables have the same type?
> A, B and D, E only.
Structural not always easy. Let
    T1 = record a : integer; b : real end;
    T2 = record c : integer; d : real end;
    T3 = record b : real; a : integer end;
Which are the same?
Worse:
    T = record info : integer; next : ^T end;
    U = record info : integer; next : ^V end;
    V = record info : integer; next : ^U end;
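One way to see why the recursive case is "worse" is to sketch a structural-equivalence check. The Python below is only an illustration under stated assumptions: the tuple encoding of types, the names DEFS, expand and struct_equal, and the labels_matter switch are all hypothetical, and field order is always treated as significant. The key idea is that a pair of types already under comparison is assumed equal, so the check terminates on cyclic definitions.

```python
# Declared types from the examples above. Field lists are tuples so that
# type expressions are hashable. The encoding itself is hypothetical.
DEFS = {
    "T1": ("record", (("a", "integer"), ("b", "real"))),
    "T2": ("record", (("c", "integer"), ("d", "real"))),
    "T3": ("record", (("b", "real"), ("a", "integer"))),
    "T": ("record", (("info", "integer"), ("next", ("ptr", "T")))),
    "U": ("record", (("info", "integer"), ("next", ("ptr", "V")))),
    "V": ("record", (("info", "integer"), ("next", ("ptr", "U")))),
}


def expand(t):
    """Replace a declared type name by its definition; leave anything else alone."""
    return DEFS[t] if isinstance(t, str) and t in DEFS else t


def struct_equal(s, t, labels_matter=True, assumed=frozenset()):
    """Structural equivalence; pairs already under comparison are assumed equal."""
    if (s, t) in assumed:
        return True  # we came back around a cycle without finding a difference
    assumed = assumed | {(s, t)}
    s, t = expand(s), expand(t)
    if isinstance(s, str) or isinstance(t, str):
        return s == t  # primitive types such as integer and real
    if s[0] != t[0]:
        return False
    if s[0] == "ptr":
        return struct_equal(s[1], t[1], labels_matter, assumed)
    fields_s, fields_t = s[1], t[1]
    if len(fields_s) != len(fields_t):
        return False
    return all(
        (not labels_matter or ls == lt) and struct_equal(ts, tt, labels_matter, assumed)
        for (ls, ts), (lt, tt) in zip(fields_s, fields_t)
    )


print(struct_equal("T1", "T2"))                       # field names differ
print(struct_equal("T1", "T2", labels_matter=False))  # same shape if names are ignored
print(struct_equal("T", "U"))                         # recursive definitions
```

Under these assumptions the answer for T1, T2, T3 depends entirely on whether field names (and, in a fuller version, field order) are taken to be part of the structure, which is exactly why structural equivalence needs a precise definition.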
Different languages use different versions of type equivalence:
Two types are assignment compatible iff they
Things are more complicated in object-oriented languages. Then assignment is OK if type of source is a subtype of the receiver type.
Several languages allow overloading of procedures/functions/methods, while others allow overloading of operators. This is simply a syntactic convenience for the programmer, as the overloading goes away in compiled code.
It can be useful if the similar names make it easier for the programmer to remember operation names. For example, overloading the plus symbol for integer and real arithmetic is helpful for scientific users. However, it is important to only use overloading where the semantics of the operators or functions is substantially similar, otherwise it will confuse readers. For example, Java's use of "+" for string concatenation is likely more confusing than helpful, especially when programmers write expressions like: m + n + " are the answers", as the language interprets the first "+" as addition and the second as concatenation. See the text for details on how different languages support overloading. (C++ provides extensive support, but Ada's is the most flexible.)
Overloading mixed with type inference is not a good combination: how is a language (e.g., ML) to know how to interpret the type of
    fun add x y = x + y;
ML used to provide an error message if you typed this in, but now it instead interprets "+" as integer plus. Hence the type is int -> int -> int. If you want it to be real addition, you must include type information. The problem is that "+" is NOT a polymorphic operation, i.e., one applicable to a large number of types using the same code. Instead it simply provides the same name for DIFFERENT operations.
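The point that "+" names several different operations, with one picked by default when nothing else is known, can be sketched as a table lookup. This Python is a hypothetical model, not how any ML compiler is implemented: the table PLUS_INSTANCES, the function resolve_plus, and the use of None for "no information yet" are all assumptions made for illustration.

```python
# The two distinct operations that share the name "+", keyed by operand type.
# Type names are plain strings; this table is an illustrative assumption.
PLUS_INSTANCES = {"int": "int -> int -> int", "real": "real -> real -> real"}


def resolve_plus(left=None, right=None, default="int"):
    """Pick one instance of "+" from what is known about the operand types.

    An operand type of None means inference has learned nothing about it.
    """
    known = {t for t in (left, right) if t is not None}
    if len(known) > 1:
        raise TypeError(f"no instance of + for operands {left} and {right}")
    chosen = known.pop() if known else default
    if chosen not in PLUS_INSTANCES:
        raise TypeError(f"+ is not defined at type {chosen}")
    return PLUS_INSTANCES[chosen]


print(resolve_plus())             # nothing known about x or y: default instance
print(resolve_plus(left="real"))  # an annotation on one operand selects real addition
```

Contrast this with a polymorphic function, where a single piece of code serves every instance and no selection step is needed.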
Typechecking programming languages is generally pretty straightforward. Typechecking rules can be written down in a way similar to the formal semantics we have been using. Let E be a "static type environment" that provides type information for identifiers. Thus (x : T) in E implies that x has type T in environment E. Type checking a program starts with an empty environment (no identifiers have yet been assigned types), but E is expanded every time an identifier is introduced. I've written some rules below for a typed version of PCF.
    E |- f : T -> U,  E |- x : T
    ----------------------------
           E |- f x : U

       E + (x : T) |- B : U
    ---------------------------
    E |- fn (x:T) => B : T -> U

    E |- x : T,  if (x : T) in E

These can be seen as a specification of a typechecking algorithm in the same way that our rules for natural semantics were.
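Read as an algorithm, the three rules translate almost line for line into a recursive function. The sketch below is in Python rather than ML and makes several assumptions: terms are encoded as tuples ("var", x), ("fn", x, T, body) and ("app", f, a), function types as ("->", T, U), base types as strings, and the environment E as a dictionary. None of these encodings come from the course's interpreter.

```python
def type_of(env, term):
    """Return the type of term in environment env, or raise TypeError."""
    kind = term[0]
    if kind == "var":
        # E |- x : T, if (x : T) in E
        name = term[1]
        if name not in env:
            raise TypeError(f"unbound identifier {name}")
        return env[name]
    if kind == "fn":
        # E + (x : T) |- B : U  gives  E |- fn (x:T) => B : T -> U
        _, name, param_type, body = term
        body_type = type_of({**env, name: param_type}, body)
        return ("->", param_type, body_type)
    if kind == "app":
        # E |- f : T -> U and E |- x : T  give  E |- f x : U
        _, func, arg = term
        func_type = type_of(env, func)
        arg_type = type_of(env, arg)
        if not (isinstance(func_type, tuple) and func_type[0] == "->"):
            raise TypeError("operator is not a function")
        if func_type[1] != arg_type:
            raise TypeError("operator and operand don't agree")
        return func_type[2]
    raise ValueError(f"unknown term {term!r}")


# succ is assumed to be a predefined identifier of type int -> int.
env = {"succ": ("->", "int", "int")}
print(type_of(env, ("fn", "n", "int", ("app", ("var", "succ"), ("var", "n")))))
```

Each subterm is visited once, which is where the "essentially linear" cost of typechecking comes from.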
In general, typechecking for programming languages is very straightforward and efficient (essentially linear in the size of the term being checked in most cases). A more difficult problem is inferring the types of terms.
ML's type system was designed carefully in order to allow type inference to work. Here are some of the basic rules for type inference:
1. An identifier should be assigned the same type throughout its scope.
2. In an "if-then-else" expression, the condition must have type boolean and the "then" and "else" portions must have the same type. The type of the expression is the type of the "then" and "else" portions.
3. A user-defined function has type 'a -> 'b, where 'a is the type of the function's parameter and 'b is the type of its result.
4. In a function application of the form f x, there must be types T and U such that f has type T -> U and x has type T, and the application itself has type U.
Here is an example of ML type inference (ignoring issues of pattern matching). Define
    fun map f l = if l = [] then []
                  else (f (hd l)) :: (map f (tl l))
which will be internally translated to:
    val rec map = fn f => fn l => if l = [] then []
                                  else (f (hd l)) :: (map f (tl l))
Let's look carefully at the clues obtainable from the definition to determine the type of map.
By rule (3), map is assigned type 'a -> 'b, for 'a and 'b type variables. Thus the type of f is 'a. Because the body of the function (the part after "fn f =>") is also a function, 'b = 'c -> 'd, where the type of l is 'c. Now let's examine the if expression to see what more we can deduce.
Rule (2) above states that for an "if-then-else" expression, the condition must have type boolean and the "then" and "else" subexpressions must have the same type, and that type is the type of the entire term, 'd. Thus we get:
    {f: 'a, l: 'c} |- l = [] : boolean
    {f: 'a, l: 'c} |- [] : 'd
    {f: 'a, l: 'c} |- (f (hd l)) :: (map f (tl l)) : 'd
We can make the following deductions from this.
From the first typing assertion we obtain 'c = 'e list for some type 'e, because [] has type 'e list and both sides of an equality must have the same type.
From the second, we get that 'd = 'g list for some type 'g. This again follows from the typing of []. Note that we don't assume that 'c and 'd have the same list type. Each time we get further information about the shape (rather than value) of a type, we introduce a new type variable.
From the third typing, we get (because we know 'd = 'g list) that f (hd l) has type 'g and map f (tl l) has type 'g list. Let's see what other information we can extract from f (hd l) having type 'g. Because this is a function application, and f has type 'a, we know that 'a = 'h -> 'g for some 'h and hd l has type 'h. Because the type of l is 'c = 'e list, hd l has type 'e. Thus 'h = 'e.
Let's write down all of the constraints we have derived:
    'c = 'e list
    'd = 'g list
    'a = 'h -> 'g
    'h = 'e
We can solve these to get
    'a = 'e -> 'g
    'c = 'e list
    'd = 'g list
Thus the type of map is ('e -> 'g) -> ('e list) -> ('g list) where 'e and 'g are any types. We can write this more compactly by writing the type as ∀ 'e. ∀ 'g. ('e -> 'g) -> ('e list) -> ('g list), where ∀ is read "for all".
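Solving the constraints is a unification problem, and the four equations derived above can be fed to a small unifier to check the result. This is a minimal Python sketch, not ML's actual inference algorithm: the tuple encoding of types and the helper names is_var, resolve, occurs, unify and show are all assumptions made for the illustration.

```python
def is_var(t):
    """Type variables are strings that start with a quote, as in ML."""
    return isinstance(t, str) and t.startswith("'")


def resolve(t, subst):
    """Apply the substitution to t, following bindings all the way down."""
    if is_var(t):
        return resolve(subst[t], subst) if t in subst else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])
    return t


def occurs(v, t):
    """Does variable v appear anywhere inside type t?"""
    if isinstance(t, tuple):
        return any(occurs(v, a) for a in t[1:])
    return t == v


def unify(s, t, subst):
    """Extend subst so that s and t become equal, or raise TypeError."""
    s, t = resolve(s, subst), resolve(t, subst)
    if s == t:
        return
    if is_var(s):
        if occurs(s, t):
            raise TypeError(f"circular constraint {s} = {t}")
        subst[s] = t
    elif is_var(t):
        unify(t, s, subst)
    elif isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            unify(a, b, subst)
    else:
        raise TypeError(f"cannot solve {s} = {t}")


def show(t):
    """Print a type in ML notation."""
    if isinstance(t, str):
        return t
    is_arrow_arg = isinstance(t[1], tuple) and t[1][0] == "->"
    first = f"({show(t[1])})" if is_arrow_arg else show(t[1])
    if t[0] == "list":
        return f"{first} list"
    return f"{first} -> {show(t[2])}"


# The four constraints derived for map.
constraints = [
    ("'c", ("list", "'e")),
    ("'d", ("list", "'g")),
    ("'a", ("->", "'h", "'g")),
    ("'h", "'e"),
]
subst = {}
for left, right in constraints:
    unify(left, right, subst)

# map was assigned 'a -> 'b with 'b = 'c -> 'd.
map_type = resolve(("->", "'a", ("->", "'c", "'d")), subst)
print(show(map_type))
```

The two type variables left over after solving are the ones that get universally quantified.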
There are three possibilities when the ML type inferencer tries to solve the system of constraints generated by the analysis of an ML expression:
It is overconstrained. Thus there is no solution, and the expression has a type error. This is the cause of error messages like:
    - tl 7;
    stdIn:22.1-22.5 Error: operator and operand don't agree [literal]
      operator domain: 'Z list
      operand:         int
      in expression:
        tl 7
The error message above results from rule 4 above, and indicates that tl can only be applied to arguments of the form 'Z list, for 'Z a type variable, while the actual argument, 7, is of type int. The system could not solve the constraint 'Z list = int because no substitution for 'Z could make this true.
It is underconstrained. In this case there are many solutions. These solutions can arise in two ways: it may be ambiguous due to overloading (causing a type error or with the system choosing one interpretation of the overloaded operators) or it may be polymorphic, resulting in a type with type variables.
It is uniquely determined. In this case there is a unique solution and the expression has exactly one type.
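The three outcomes can be told apart mechanically: the solver either fails, or succeeds leaving type variables in the answer, or succeeds leaving none. The Python below is a hedged sketch of that classification; the helper names (apply, type_vars, unify, classify) and the tuple encoding of types are assumptions, and ambiguity caused by overloading is not modelled at all.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("'")


def apply(t, subst):
    """Apply a substitution to a type, following bindings all the way down."""
    if is_var(t):
        return apply(subst[t], subst) if t in subst else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t


def type_vars(t):
    """The set of type variables occurring in t."""
    if is_var(t):
        return {t}
    if isinstance(t, tuple):
        return set().union(*(type_vars(a) for a in t[1:]))
    return set()


def unify(s, t, subst):
    s, t = apply(s, subst), apply(t, subst)
    if s == t:
        return
    if is_var(t) and not is_var(s):
        s, t = t, s
    if is_var(s):
        if s in type_vars(t):
            raise TypeError(f"circular constraint {s} = {t}")
        subst[s] = t
    elif isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            unify(a, b, subst)
    else:
        raise TypeError(f"cannot solve {s} = {t}")


def classify(constraints, result):
    """Solve the constraints and report which of the three cases applies."""
    subst = {}
    try:
        for s, t in constraints:
            unify(s, t, subst)
    except TypeError:
        return "overconstrained"
    solved = apply(result, subst)
    return "underconstrained" if type_vars(solved) else "uniquely determined"


TL = ("->", ("list", "'Z"), ("list", "'Z"))  # tl : 'Z list -> 'Z list

print(classify([(TL, ("->", "int", "'r"))], "'r"))            # tl 7
print(classify([(TL, ("->", ("list", "int"), "'r"))], "'r"))  # tl applied to an int list
print(classify([], ("->", "'a", "'a")))                       # fn x => x
```

The first call reproduces the unsolvable constraint 'Z list = int from the error message for tl 7.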
The type of a function in ML is not allowed to contain a universal quantifier (∀) on the inside. All of these quantifiers must appear at the outer level. This means that a function may not be defined to take a polymorphic function as an argument, though it can be applied to a specialization of a polymorphic function.
Related to this is a type restriction that you may run into in your programming. An identifier introduced by a val binding (including the implicit "it") may not be given a polymorphic type unless the expression on the right side of the binding is a "value".
These values are similar to the ones used in this week's PCF interpreter homework. A value is something that cannot be further evaluated. Thus constants, lists, and function definitions are all values, but a function application is not a value.
For example, look at the following transcript of an ML session:
    - fun double f x = f(f x);
    val double = fn : ('a -> 'a) -> 'a -> 'a
    - tl;
    val it = fn : 'a list -> 'a list
    - val dbleTl = double tl;
    stdIn:11.1-11.10 Warning: type vars not generalized because of
       value restriction are instantiated to dummy types (X1,X2,...)
    val dbleTl = fn : ?.X1 list -> ?.X1 list
    - dbleTl [1,2,3];
    stdIn:12.1-12.11 Error: operator and operand don't agree [literal]
      operator domain: ?.X1 list
      operand:         int list
      in expression:
        dbleTl (1 :: 2 :: 3 :: nil)
    - double tl [1,2,3];
    val it = [3] : int list
double is defined as a polymorphic curried function that applies the first argument twice to the second argument. tl is a predefined polymorphic function returning the tail of a list. Binding double tl with val causes a problem: the result would be polymorphic, but the expression double tl is a function application and hence not a value. Hence the type warning shown above. Oddly, ML only prints a warning, but the resulting value is not usable, as shown by applying dbleTl to [1,2,3]. On the other hand, writing double tl [1,2,3] causes no problems because the result is not polymorphic!
I don't want to go into detail about the reason for the "value restriction", as it is intended to avoid a problem with polymorphic references (variables that can hold polymorphic values), but I wanted you to see this in order to recognize what is going on when you see such an error. Various versions of ML have had different restrictions on typing in order to avoid problems with polymorphic references, but this one seems not to cause many problems in practice. In fact, the above problem with dbleTl can be solved by writing the definition as:
    - fun dbleTl x = double tl x;
    val dbleTl = fn : 'a list -> 'a list
A moment's thought will show you that this defines the same function as before. However, because it is now given as a function definition (functions are always "values" in the technical sense above) rather than a val definition, it is not subject to the value restriction.
No semidynamic arrays. Result of 2 principles:
Type of actual parameters must agree w/ type of formals
Therefore, no general sort routines, etc.
The major problem with Pascal
e.g., type Boolean is (False, True)
Can overload values:
    Color is (Red, Blue, Green)
    Mood is (Happy, Blue, Mellow)
If ambiguous can qualify w/ type names:
    Color'(Blue), Mood'(Blue)
i.e., Hex is range 0..15
Other attributes available to modify type definitions:
    Accurate is digits 20
    Money is delta 0.01 range 0.00 .. 1000.00    -- fixed pt!
Can extract type attributes:
    Hex'FIRST -> 0
    Hex'LAST -> 15
Can initialize variables in declaration:
declare k : integer := 0
    type Two_D is array (1..10, 'a'..'z') of Real
or "Unconstrained" (what we called semidynamic earlier)
    type Real_Vec is array (INTEGER range <>) of REAL;
Generalization of open array parameters of Modula-2.
Of course, to use, must specify bounds,
    declare x : Real_Vec (1..10)
or, inside procedure:
    Procedure sort (Y: in out Real_Vec; N: integer) is   -- Y is open array parameter
        Temp1 : Real_Vec(1..N);                          -- depends on N
        Temp2 : Real_Vec(Y'FIRST..Y'LAST);               -- depends on parameter Y
    begin
        for I in Y'FIRST..Y'LAST loop
            ...
        end loop;
        ...
    end sort;
Note Ada also has local blocks (like ALGOL 60)
All unconstrained types (w/ parameters) elaborated at block entry (semidynamic)
String type is predefined open array of chars:
array (POSITIVE range <>) of character;
Can take slice of 1-dim'l array.
E.g., if
    Line : string(1..80)
Then can write
    Line(10..19) := ('a','b','c','d','e','f','g','h','i','j')    -- gives assignment to slice
Because of this structure assignment, can have constant arrays.
and dynamic properties, checked at run time
Examples of dynamic properties are range, subscript, etc.
Specify dynamic properties by defining subtype. E.g.,
    subtype digit is integer range 0..9;
Subtypes also constrain parameterized array or variant record.
    subtype short_vec is Real_Vec(1..3);
    subtype square_type is geometric (square)
Subtypes do not define new type, only add dynamic constraints.
Therefore can mix different subtypes of same type w/ no problems
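The idea that a subtype is the same type plus a run-time check can be modelled in a few lines. This Python sketch is an analogy, not Ada semantics in detail: the class Subtype, its check method, the subtype small, and the use of ValueError in place of Ada's constraint violation are all assumptions for the illustration.

```python
class Subtype:
    """A range constraint on integers, checked when a value is stored."""

    def __init__(self, name, first, last):
        self.name, self.first, self.last = name, first, last

    def check(self, value):
        """Return value unchanged if it satisfies the constraint, else raise."""
        if not self.first <= value <= self.last:
            raise ValueError(f"{value} not in {self.name} range {self.first}..{self.last}")
        return value


digit = Subtype("digit", 0, 9)
small = Subtype("small", 0, 99)  # hypothetical second subtype of integer

d = digit.check(7)
s = small.check(d + 40)  # mixing is fine: both are plain integers underneath
print(d, s)
```

Because the values stay ordinary integers, nothing stops d and s from appearing in the same expression; only the store into a constrained variable is checked.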
Derived types define new types:
    type Hex is new integer range 0..15
    type Ounces is new integer range 0..15
Now Hex, Ounces, and Integer are incompatible types: treated as distinct copies of 0..15
Can convert from one to other:
    Hex(I), Integer(H), Hex(Integer(G))
Derived types inherit operators and literals from parent type.
E.g., Hex gets 0,1,2,... +,-,*,...
Use for private (opaque) types and when don't want mixing.
Helped by removing dynamic features from def of type subrange or index of array.
Can now have open array parameters (also introduced in ISO Pascal).
Variants fixed
Name equivalence in Ada to prevent mixing of different types. E.g., can't add Hex and Ounces.
Can define overloaded multiplication such that if
    l:Length; w:Width;
then l * w : Area.
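Derived types and the overloaded multiplication can be imitated with distinct wrapper classes. The following Python is only a rough analogy under stated assumptions: the base class Derived is hypothetical, the 0..15 range check is omitted, and Python rejects the mixed addition at run time whereas Ada rejects it at compile time.

```python
class Derived:
    """Hypothetical stand-in for a derived type: an int that refuses to mix."""

    def __init__(self, value):
        self.value = int(value)

    def __add__(self, other):
        # "Inherited" addition, defined only between values of the same derived type.
        if type(other) is not type(self):
            return NotImplemented
        return type(self)(self.value + other.value)

    def __eq__(self, other):
        return type(other) is type(self) and self.value == other.value

    def __int__(self):
        return self.value  # explicit conversion back to the parent type


class Hex(Derived): pass
class Ounces(Derived): pass
class Width(Derived): pass
class Area(Derived): pass


class Length(Derived):
    def __mul__(self, other):
        # Overloaded "*": Length * Width yields an Area.
        if not isinstance(other, Width):
            return NotImplemented
        return Area(self.value * other.value)


h, g = Hex(3), Ounces(4)
try:
    h + g                      # mixing Hex and Ounces is rejected
except TypeError as err:
    print("rejected:", err)
print((h + Hex(int(g))).value)           # explicit conversion, like Hex(Integer(G))
print((Length(3) * Width(5)).value)      # the product is an Area
```

The explicit conversion through int mirrors the Hex(Integer(G)) form: mixing is possible, but only when the programmer asks for it.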
kim@cs.williams.edu