Object Oriented Programming (OOP)

So far, the languages that we have encountered treat data and computation separately. In OOP, data and computation are combined into an "object". This has several benefits:

-- more convenient: collects related information together, rather than distributing it. Example: the C++ iostream class collects all I/O related operations together into one central place. In contrast, the I/O library in C consists of many distinct functions such as getchar, printf, scanf, sscanf, etc.

-- centralizes and regulates access to data. If there is an error in the program that corrupts object data, we need to look for the error only within the computations (aka member functions) associated with the object. Contrast this with the typical situation in C programs: accesses and updates to the fields of a data structure are distributed all through the program, so we need to scan the whole program to identify potential errors.

-- promotes reuse.
   (a) by separating interface from implementation. The implementation of the member functions is irrelevant from the point of view of clients of the object. (Client code = code that uses an object and its member functions.) Thus, we can replace the implementation of an object very easily, without making any changes to client code. Contrast this with C, where the implementation of a data structure such as a linked list is typically integrated into the client code, making such changes very difficult if not impossible.
   (b) by permitting extension of existing classes via inheritance. Inheritance allows a new class (class = type of an object) to reuse the features of an existing class. For instance, one may define a doubly linked integer list class by inheriting and reusing the functions provided by a singly linked list class.

The aspect of "centralizing/regulating access to data" is generally called encapsulation; the aspect of separating the implementation of an object from its interface is generally called "information hiding."
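The two aspects just named can be illustrated with a minimal C++ sketch. The class IntBag and the client function addAll are hypothetical names chosen for illustration; they are not part of these notes, and the sketch is only one possible way to arrange such a class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class for illustration only.
class IntBag {
private:
    std::vector<int> items;  // representation, hidden from clients
    long total = 0;          // invariant: total equals the sum of items

public:
    // The only operation that modifies the data, so the invariant
    // can be broken only by an error inside this class.
    void add(int x) {
        items.push_back(x);
        total += x;
    }
    long sum() const { return total; }
    std::size_t size() const { return items.size(); }
};

// Client code: it uses only the public interface of IntBag.
long addAll(IntBag &bag, const int *vals, std::size_t n) {
    for (std::size_t k = 0; k < n; ++k) {
        bag.add(vals[k]);
    }
    return bag.sum();
}
```

Because items and total are private, any corruption of them must originate in the member functions of IntBag (encapsulation). Because addAll depends only on add and sum, the vector could be swapped for a different representation without changing addAll (information hiding).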
Note that these two terms, encapsulation and information hiding, overlap to some extent.

In object-oriented languages (OOL), a class is a type that includes data and operations together. An object is an instance of a class. A class specifies members, which consist of variables and functions. The member variables are of two types: class variables (or static variables) and instance variables. Each object has its own copy of the instance variables, whereas class variables are shared across all objects of a class. Instance variables are similar to the field names of a struct, e.g., if a C structure has a field called a, and b and c are structure variables of this type, then b.a and c.a refer to distinct locations in memory. Class variables, on the other hand, are similar to global variables. (Access to class variables may still be restricted to member functions using the private or protected keyword.)

Similarly, member functions are of two types: statically dispatched (or statically bound) and dynamically dispatched (or dynamically bound). In C++, member functions are statically dispatched unless declared otherwise (functions declared using the keyword "static" belong to the class rather than to an individual object, and are also statically bound), whereas dynamically dispatched functions are declared using the keyword "virtual".

Access to members of an object is regulated in C++ using three keywords: private, protected and public. Private members can be accessed only by member functions associated with the class. They may not be directly accessed by outside functions. A protected designation is a bit less restrictive: it also allows the member functions of any subclass of a given class to access such members. The public keyword identifies those members that can be accessed directly by any piece of code. A typical convention in C++ is to make all data members private. Most member functions are public.

For example, consider a list that consists of integers.
The declaration for this could be:

    class IntList {
    private:
        int elem;        // element of the list
        IntList *next;   // pointer to next element in the list
    public:
        IntList(int first);   // this function (that has the same name as
                              // the class) is called the "constructor". It
                              // is called automatically when a new object
                              // of this class is created.
        ~IntList();           // this function (that has ~ and the class
                              // name) is called the "destructor". It is
                              // called automatically when an object of this
                              // class is about to be destroyed.
        void insert(int i);   // insert element i
        int getval();         // return the value of elem
        IntList *getNext();   // return the value of next
    };

We may define a subclass of IntList that uses doubly linked lists as follows:

    class IntDList : public IntList {  // this denotes that IntDList inherits
                                       // from class IntList; "public" is
                                       // needed for client code to treat an
                                       // IntDList as an IntList
    private:
        IntDList *prev;  // pointer to previous element; the pointer to the
                         // next element and the element itself are in IntList
    public:
        IntDList(int first);  // Constructors need to be redefined.
        ~IntDList();          // Destructors need not be redefined, but
                              // typically this is needed in practice. In
                              // particular, a destructor may need to free up
                              // storage allocated to pointers. Since IntDList
                              // uses an additional pointer, additional free
                              // operations may be needed, and hence the need
                              // for a new destructor.
        // Most operations, such as getval and getNext, are inherited
        // from IntList.
        void insert(int i);   // But some operations, such as insert, may have
                              // to be redefined. An insert operation now has
                              // to update two pointers, instead of one.
        IntDList *getPrev();  // Also, we need an accessor for the previous
                              // pointer (it cannot share the name "prev"
                              // with the data member).
    };

A key principle that applies to OOL is as follows: "In any operation that expects an object of type T, it is acceptable to supply an object of type T', where T' is a subtype of T". This subtype principle must be strictly adhered to.
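The subtype principle can be sketched in C++ as follows. Cell, TaggedCell and twice are hypothetical, simplified stand-ins (not the IntList classes themselves), used only to show a function written against a base type T accepting an object of a subtype T'. Note that in C++ the conversion from a derived class to its base is available to client code only under public inheritance.

```cpp
#include <cassert>

// Hypothetical base type T.
class Cell {
private:
    int elem;
public:
    explicit Cell(int first) : elem(first) {}
    int getval() const { return elem; }
};

// Hypothetical subtype T'. Public inheritance lets client code
// pass a TaggedCell wherever a Cell is expected.
class TaggedCell : public Cell {
private:
    int tag;
public:
    TaggedCell(int first, int t) : Cell(first), tag(t) {}
    int getTag() const { return tag; }
};

// Written once, against the base type only.
int twice(const Cell &c) {
    return 2 * c.getval();
}
```

Here twice can be called with either a Cell or a TaggedCell argument; the inherited getval behaves identically in both cases, so no special care is needed for this particular operation.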
The subtype principle enables OOL to support subtype polymorphism: the same piece of code can be reused with different types of objects, as long as they all belong to subtypes of a base type. For instance, the following function will work with any object whose type is a subtype of IntList.

    void q(IntList &i, int j) {
        ...
        i.insert(j);
    }

When we apply the subtype principle, we require that q work properly, regardless of whether it was called with an IntList or an IntDList argument. However, note that we said the insert operation works differently on these two types. Use of IntList::insert on an IntDList object will likely corrupt it, since the prev pointer would not have been appropriately initialized. Thus, in order for the subtype principle to be observed, it is essential that i.insert refer to IntList::insert when i is an IntList object, and to IntDList::insert when i is an IntDList object. This requires a dynamic association between the name "insert" and its implementation. This sort of dynamic binding is achieved in C++ by declaring a function to be virtual. Thus the declaration of insert in IntList should be modified as follows:

    virtual void insert(int i);   // insert element i

Note that with dynamic binding, we are getting the effect of overloading rather than parametric polymorphism. In particular, the implementation of the insert function is not shared across subtypes of IntList, but its name is shared. This enables client code to be reused regardless of the argument type, but the implementation of insert differs between IntList and IntDList for the reasons mentioned above. (To see dynamic binding as overloading, we need to eliminate the "syntactic sugar" used for calling member functions in OOL: instead of viewing the call as i.insert(...), we should think of it as a simple function insert(i,...) that explicitly takes an object as an argument.)

The key properties of OOL are:
(a) encapsulation
(b) inheritance + dynamic binding
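The difference between static and dynamic binding discussed above can be observed in a small C++ sketch. The classes Base and Derived and the functions callPlain and callDyn are hypothetical names introduced only for this illustration; the sketch assumes a C++11 compiler for the override specifier.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    // Non-virtual: the call is bound using the static type of the reference.
    std::string plain() const { return "Base"; }
    // Virtual: the call is bound using the run-time type of the object.
    virtual std::string dyn() const { return "Base"; }
};

class Derived : public Base {
public:
    std::string plain() const { return "Derived"; }         // hides Base::plain
    std::string dyn() const override { return "Derived"; }  // overrides Base::dyn
};

// Client code written against Base only, like q in the text.
std::string callPlain(const Base &b) { return b.plain(); }
std::string callDyn(const Base &b) { return b.dyn(); }
```

When a Derived object is passed to both functions, callPlain still runs Base::plain, while callDyn runs Derived::dyn. This is the situation of q and insert: without virtual, the base-class version would be applied to a subclass object.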