Recursive types --------------- A recursive type is one that is defined in terms of itself. We may attempt to define a recursive type declaration for list as follows in C: struct IntList { int hd; intList tl; } Unfortunately, this does not work. The problem is that this defn corresponds to an infinite list -- there is end, because there is no way to capture the case when the tail has the value "nil". We need to express the fact that the tail can either be nil or be itself a list. We may use variant records to accomplish this as follows: TYPE charlist = RECORD CASE IsEmpty: BOOLEAN OF TRUE: /* empty list */ | FALSE: data: CHAR; next: charlist; END END Still, languages that support variant records will not permit this declaration. The problem still remains that one cannot, in advance, predict the amount of storage needed to store a charlist. Thus, imperative languages require that you use a pointer type to implement recursive type, as follows: struct IntList { int hd; IntList *tl; } Now, tl can either be a NULL pointer, which corresponds to the empty list, or point to a nonempty list value. Now, IntList structure occupies only a fixed amount of storage Direct definition of recursive types IS supported in SML using datatype declarations. For instance, a binary tree type can be declared as follows: - datatype intBtree = LEAF of int | NODE of int * intBtree * intBtree; datatype intBtree = LEAF of int | NODE of int * intBtree * intBtree We are defining a binary tree type inductively: Base case: a binary tree with one node, called a LEAF Induction case: construct a binary tree by constructing a new node that sores an integer value, and has two other binary trees as children We may construct values of this type as follows: - val l = LEAF(1); val l = LEAF 1 : intBtree - val r = LEAF(3); val r = LEAF 3 : intBtree - val n = NODE(2, l, r); val n = NODE (2,LEAF 1,LEAF 3) : intBtree Types can be mutually recursive. Consider the following type declaration: -datatype expr = PLUS of expr * expr | = PROD of expr * expr | = FUN of (string * exprs) | = IVAL of int =and = exprs = EMPTY | = LIST of expr * exprs; datatype expr = FUN of string * exprs | PLUS of expr * expr | PROD of expr * expr datatype exprs = EMPTY | LIST of expr * exprs The key word "and" is used for mutually recursive type definitions. ("and" is also used for mutually recursive function defitions.) We could also have defined expressions using the predefine list type: - datatype expr = PLUS of expr * expr | PROD of expr * expr | = FUN of string * expr list; datatype expr = FUN of string * expr list | PLUS of expr * expr | PROD of expr * expr Examples: The expression "3 + (4 * 5)" can be represented as a value of the above datatype expr as follows: val v3 = IVAL(3); val v5 = IVAL(5); val v4 = IVAL(4); val pr = PROD(v5, v4); val pl = PLUS(v3, pr); The following picture illustrates the structure of the value "pl" and how it is constructed from other values. pl ------> PLUS / \ / \ v3 ---> IVAL PROD <----- pr | /\ | / \ 3 /->IVAL IVAL <--- v4 / | | v5 | | 5 4 Similarly, "f(2,4,1)" can be represented as: val a1 = EMPTY; val a2 = ARG(IVAL(4), a1); val a3 = ARG(IVAL(2), a2); val fv = FUN("f", a3); Note the use of "expr list" to refer to a list that consists of elements of type "expr." Polymorphism: Ability of a function to take arguments of multiple types. ------------- The primary use of polymorphism is code reuse. Functions that call polymorphic functions can use the same piece of code to operate on different types of data. Overloading (aka adhoc polymorphism): Same function NAME is used to represent different functions, each of which may take arguments of a specific type. e.g., operator '+' is overloaded in most languages so that they can be used to add integers or reals. But the integer addition operation is very different from the addition operation on reals. Arbitrary function names can be overloaded in C++, but not in C. All virtual functions are in fact overloaded functions. Parametric polymorphism: The same function works for arugments of different types, same code is reused for arguments of different types. With overloading, you get resue of "client" code, i.e., code that calls an overloaded function. template Type min(const Type* a, int size, Type minval) { for (int i = 0; i < size; i++) if (a[i] < minval) minval = a[i]; return minval; } The above example shows how the same code can be used for computing the minimum value from arrays of any type. The only requirement is that the type support the comparison operator "<" and the assignment operator "=". The above function definition is parameterized wrt to the type "class C", and hence the term "parametric polymorphism". The constraint regarding this type, namely that it support the operations "<" and "=", is implicit from the body of the template function. Unlike C++, C and Java do not support templates. (There is a proposal for adding template types to Java.) Code reuse with parametric polymorphism and overloading: It is clear that with parametric ploymorphism, the same function body is reused to deal with different datatypes. The basic property of a parametrically polymorphic function is that it does not need to "look below" a certain level of some data structure. For instance, the "min" function above did not need to look inside of each array element. Similarly, one can think of "length" and "append" functions that can operate on linked lists of all types, without looking at the specific element type. With oveloading, there is no reuse of the overloaded function, since there is in fact a different function body corresponding to each argument type. However, client code that calls a overloaded function can be reused. For instance, let C be a class, with subclasses C1,...,Cn. Also let "f" be a virtual method of class C (and hence its subclasses). We can now write client code that can apply the function uniformly to elements of an array, each of which is a pointer to an object of type C1,...,Cn. void g(int size, C *a[]) { for (int i = 0; i < size; i++) a[i]->f(...); } Now, the body of function g (which is a client of the function f) can be reused for arrays that contain objects of type C1 or C2 or ... or Cn, or even a mixture of these types.