CSCI 431: Abstract Data Types

The Development of Data Types

1957-1960 FORTRAN

primitive data that mimics machine hardware

1960-1963 COBOL, PL/I

Extend primitive data with collections of primitive objects, records

1968-1970 Pascal, ALGOL-68

Extends functionality with new types.
Create functions that use those new types.

1972-1976 Age of encapsulation

Encapsulation and information hiding
Create data type signatures
Overloading

1980-1995 Object Orientation

Smalltalk-80, C++, ML, FORTRAN-90, Ada-95

A Motivating Example

Suppose a program is being constructed to manage a database for a school. It would be sensible to first define the major data components: a student, a staff member, a course, a list of students, a list of staff members, a list of courses. Then the common operations could be defined: enter a student, delete a student, assign a staff member to a course or remove one from it, assign a student to a course or remove one from it.

Ideally, the various data structures would be handled only as parameters passed to subprograms, so that once the subprograms are written it is no longer necessary to know the structure of the data types.

Thus we would develop a number of abstract data types. For example, a student may be added to a list, removed from a list, added to a course, printed, or assigned a mark for a course.

Once the basic code is written, ideally there should be no obvious difference between an abstract data type and one of the provided data types.

So to develop an abstract data type, a language must provide

a means to define data objects
a means to define abstract operations on those objects
a means to encapsulate the data such that it may only be manipulated by the abstract operations.

Abstract Data Type (ADT) :

1. a set of data objects
2. a set of abstract operations (functions) that operate on data of that type
3. encapsulation of the type in such a way that the user of the type cannot manipulate data objects of the type except by use of the operations defined
For example:
type student:
Operations:
SetName: name x student -> student
GetName: student -> name
SetCourse: course x student -> student
GetCourse: student -> course
GetCredits: course -> integer

In this example, Name and Course are undefined. They are defined by some other abstraction.
Note that we have no idea HOW a student object is stored, nor do we care.
All we care about is the behavior of the data according to the defined functions.
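
The student signatures above can be sketched in C++, a language these notes cite as enforcing encapsulation. This is only an illustration under assumptions: Name is taken to be a string, Course is a hypothetical class holding a title and a credit count, and the Set operations return a new student to mirror the "x student -> student" form of the signatures. None of these choices comes from the notes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for the Name and Course abstractions, which the
// notes leave to be defined elsewhere.
using Name = std::string;

class Course {
public:
    Course() = default;
    Course(std::string title, int credits)
        : title_(std::move(title)), credits_(credits) {}
    int credits() const { return credits_; }
private:
    std::string title_;
    int credits_ = 0;
};

class Student {
public:
    // SetName: name x student -> student (returns a modified copy)
    Student SetName(const Name& name) const {
        Student copy = *this;
        copy.name_ = name;
        return copy;
    }
    // GetName: student -> name
    Name GetName() const { return name_; }
    // SetCourse: course x student -> student (returns a modified copy)
    Student SetCourse(const Course& course) const {
        Student copy = *this;
        copy.course_ = course;
        return copy;
    }
    // GetCourse: student -> course
    Course GetCourse() const { return course_; }
private:
    // The representation is hidden; callers see only the operations.
    Name name_;
    Course course_;
};

// GetCredits: course -> integer
int GetCredits(const Course& course) { return course.credits(); }

// Usage sketch with made-up values.
void student_example() {
    const Course course("CSCI 431", 3);
    const Student s = Student().SetName("Ada").SetCourse(course);
    assert(s.GetName() == "Ada");
    assert(GetCredits(s.GetCourse()) == 3);
}
```

The private section could be replaced by any other representation without changing code that uses only the five operations.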

Information Hiding

Information hiding can be built into almost any language - a program design issue

C example:

typedef struct { ... } TypeA;
typedef struct { ... } TypeB;
typedef struct { ... } TypeC;
void P1(TypeA X, TypeB Y) { ... }  /* implements something for TypeA */
void P2(TypeC U, TypeA V) { ... }  /* implements something for TypeC */

Problem with this structure:

Inside P2 one can write V.i to access a component of the TypeA object directly, because nothing in the language prevents it.

The advantage is that this approach can be used in Pascal or C.

Encapsulation - a language design issue

We will look at mechanisms to enforce information hiding (Smalltalk, C++, Ada).
We will call this enforcement encapsulation. It means that the user of the abstraction
1. does not need to know the hidden information in order to use the abstraction, and
2. is not permitted to directly use or manipulate the hidden information

Encapsulated data types

Only type name and operations are visible outside of the defining object.

Example: STUDENT_RECORD is type
Externally visible:
proc SetName(STUDENT_RECORD, NAME)
func GetName(STUDENT_RECORD) : NAME
. . . other operators
Internal to module defining type:
NAME string(20);
GPA float;
ADDRESS string(50);
SCHEDULE[10] COURSE_TYPE;

Implementation Models of Abstract Data Types

"Usual" implementation:

package RationalNumber is
  type rational is record                -- User-defined type
    num: integer;
    den: integer;
  end record;
  procedure mult(x: in rational; y: in rational; z: out rational);  -- Abstract operation
end RationalNumber;

package body RationalNumber is           -- Encapsulation
  procedure mult(x: in rational; y: in rational; z: out rational) is
  begin
    z.num := x.num * y.num;
    z.den := x.den * y.den;
  end mult;
end RationalNumber;

A: rational;
A.num := 7;  -- Legal: there is no enforcement of encapsulation.

Any procedure has access to the components of type rational.
Let's look at an alternative model that enforces encapsulation.

Private Types

package RationalNumber is
  type rational is private;              -- User-defined type
  procedure mult(x: in rational; y: in rational; z: out rational);  -- Abstract operation
private
  type rational is record                -- Full definition, hidden from clients
    num: integer;
    den: integer;
  end record;
end RationalNumber;

package body RationalNumber is           -- Same as before
  procedure mult(x: in rational; y: in rational; z: out rational) is
  begin
    z.num := x.num * y.num;
    z.den := x.den * y.den;
  end mult;
end RationalNumber;

A: rational;
A.num := 7;  -- Now illegal: private blocks the use of num and den.

What is the role of "private"? Declarations in the private part cannot be used outside the package.
This solution encapsulates and hides the implementation details of rational.
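
For comparison, the same idea can be sketched in C++, where the private section of a class plays roughly the role of Ada's private part. This is a minimal illustration and not a translation of the package above: the constructor and the equality operator are additions assumed here so that the type can be created and checked, and mult returns its result instead of using an out parameter.

```cpp
#include <cassert>

class Rational {
public:
    Rational(int num, int den) : num_(num), den_(den) {}

    // Abstract operation: same arithmetic as mult in the notes
    // (the result is not reduced to lowest terms).
    friend Rational mult(const Rational& x, const Rational& y) {
        return Rational(x.num_ * y.num_, x.den_ * y.den_);
    }

    // Equality by cross-multiplication, so 2/6 and 1/3 compare equal.
    friend bool operator==(const Rational& x, const Rational& y) {
        return x.num_ * y.den_ == y.num_ * x.den_;
    }

private:
    // Hidden representation: only members and friends may touch it.
    int num_;
    int den_;
};

// Usage sketch.
void rational_example() {
    const Rational a(1, 2);
    const Rational b(2, 3);
    assert(mult(a, b) == Rational(1, 3));
    // a.num_ = 7;   // would be rejected by the compiler: num_ is private
}
```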

Encapsulation by Subprograms

Specification

Consider a function with parameters
  function do (a: in float; b: in integer) return integer
Its signature is
  do: float x integer --> integer

A procedure
  procedure do2 (a: in out float; b: in out integer)
has the signature
  do2: float x integer --> float x integer

so subprograms are very similar to the built-in operations.
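
A rough C++ rendering of the two signatures may help. The function bodies are arbitrary placeholders invented for this sketch, the name do_ is used because do is a C++ keyword, and do2_as_function is a hypothetical helper that restates the in out procedure as a mapping from a pair of inputs to a pair of outputs.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// do: float x integer --> integer   (placeholder body)
int do_(float a, int b) { return static_cast<int>(a) + b; }

// do2 with "in out" parameters, modelled here as references (placeholder body).
void do2(float& a, int& b) {
    a = a * 2.0f;
    b = b + 1;
}

// do2: float x integer --> float x integer, written as a pure mapping.
std::pair<float, int> do2_as_function(float a, int b) {
    do2(a, b);        // works on local copies of the inputs
    return {a, b};
}

// The signature of do_ is itself a type that the compiler can check.
using DoSignature = int(float, int);
static_assert(std::is_same<decltype(do_), DoSignature>::value,
              "do_ has signature float x integer --> integer");

// Usage sketch.
void signature_example() {
    assert(do_(2.5f, 3) == 5);
    const std::pair<float, int> result = do2_as_function(1.5f, 4);
    assert(result.first == 3.0f);
    assert(result.second == 5);
}
```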

Several things make subprograms hard to specify by signature alone. A subprogram may

alter or reference non-local data (i.e. global variables)
terminate in some non-typical manner for some parameter values, e.g. it crashes
be history sensitive: some local variables may retain their values between invocations.
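
Each of these three problems can be shown with a few lines of C++. The function names and the counter are invented for this sketch; the point is only that none of the behaviour noted in the comments is visible in the signatures.

```cpp
#include <cassert>
#include <stdexcept>

int call_count = 0;  // non-local data

// 1. Alters non-local data: the signature int -> int does not mention call_count.
int increment_and_count(int x) {
    ++call_count;
    return x + 1;
}

// 2. Terminates in a non-typical manner for some parameters:
//    the signature int x int -> int does not mention the exception.
int checked_divide(int a, int b) {
    if (b == 0) throw std::invalid_argument("division by zero");
    return a / b;
}

// 3. History sensitive: the static local keeps its value between calls,
//    so two calls with the same (empty) argument list give different results.
int next_ticket() {
    static int last = 0;
    return ++last;
}

// Usage sketch, assuming none of the functions has been called before.
void side_effect_example() {
    assert(increment_and_count(5) == 6);
    assert(call_count == 1);

    bool threw = false;
    try {
        checked_divide(1, 0);
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);

    assert(next_ticket() == 1);
    assert(next_ticket() == 2);
}
```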

Subprogram Invocation

When a procedure runs, it needs a certain amount of memory for housekeeping, for storing parameters, and for storing local variables. Languages that do not allow recursion may allocate memory for subprograms statically, whereas those that do allow recursion must allocate it dynamically.

The subprogram definition serves as a template from which the activations may be created.

There is a similarity between the (type definition / data object) relationship and the (procedure definition / procedure call) relationship.
A subprogram activation is a kind of data object.
Its storage consists of:

Dynamic part (activation record)
- storage for parameters - accessed by offset
- storage for results
- storage for local variables
Static part (code segment) - all invocations share same copy
- storage for literals and defined constants
- storage for executable code

The Static Part

Prologue: block of code inserted by the translator to
1. set up the activation record
2. transmit parameters
3. create linkages for nonlocal references
4. perform other housekeeping activities
Epilogue: block of code inserted by the translator to
1. return results
2. free storage for the activation record
3. restore the status of the caller

The Dynamic part

How is the activation record built?
1. part by the caller - parameters, return address (RA), dynamic link
2. part by the called subprogram - locals, static links
Contents of the activation record:
1. local variables
2. passed parameters
3. return address
4. temporary space for expression evaluation
5. return value
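
To make the idea of one record per activation concrete, here is a small C++ simulation that computes a factorial with an explicit stack of records instead of native recursion. It is a teaching sketch under assumptions, not how any particular compiler lays out its frames: the ActivationRecord struct, its resume_point field (a stand-in for the return address) and the function name are all invented here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical activation record for one call of fact(n).
struct ActivationRecord {
    int n;             // passed parameter
    int resume_point;  // stand-in for the return address:
                       // 0 = just entered, 1 = waiting for the recursive call
};

// Computes n! using an explicit stack of activation records.
// If max_depth is non-null, it receives the largest number of
// records that were live at the same time.
int factorial_explicit_stack(int n, std::size_t* max_depth = nullptr) {
    std::vector<ActivationRecord> stack;
    stack.push_back({n, 0});   // the caller creates the first activation
    int returned = 1;          // return-value slot
    std::size_t deepest = 0;

    while (!stack.empty()) {
        deepest = std::max(deepest, stack.size());
        ActivationRecord& top = stack.back();
        if (top.resume_point == 0) {
            if (top.n <= 1) {
                returned = 1;          // base case: return 1
                stack.pop_back();      // epilogue: free the record
            } else {
                top.resume_point = 1;  // remember where to continue
                const int argument = top.n - 1;
                stack.push_back({argument, 0});  // call: new activation
            }
        } else {
            returned = top.n * returned;  // combine with the callee's result
            stack.pop_back();             // epilogue: free the record
        }
    }
    if (max_depth != nullptr) *max_depth = deepest;
    return returned;
}

// Usage sketch: fact(5) needs five live records at its deepest point.
void activation_example() {
    std::size_t depth = 0;
    assert(factorial_explicit_stack(5, &depth) == 120);
    assert(depth == 5);
}
```

The code of the loop is shared by every activation, while each record on the stack holds the data of one activation; this is also why a language with recursion cannot allocate this storage statically.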

Generic Subprograms (in Ada)

It's often useful to first create a more generic version of a subprogram or package and then use that generic version to create more specific subprograms or packages. Ada's capability to do this is called a generic, and it is broadly comparable to C++'s templates.

It's probably easiest to understand this by example. First, let's write a procedure to swap two Integers:

  -- Here's the declaration (specification):
  procedure Swap(Left, Right : in out Integer);

  -- .. and here's the body:
  procedure Swap(Left, Right : in out Integer) is
    Temporary : Integer;
  begin
    Temporary := Left;
    Left := Right;
    Right := Temporary;
  end Swap;

Swap is a perfectly fine procedure, but it's too specialized. We can't use Swap to swap Floats, or Characters, or anything else. What we want is a more generic version of Swap, but one where we can substitute the type "Integer" with a more generic type. A generic version of Swap would look like this:

  -- Here's the declaration (specification):
  generic
    type Element_Type is private;
  procedure Generic_Swap(Left, Right : in out Element_Type);

  -- .. and here's the body:
  procedure Generic_Swap(Left, Right : in out Element_Type) is
    Temporary : Element_Type;
  begin
    Temporary := Left;
    Left := Right;
    Right := Temporary;
  end Generic_Swap;

In general, to create a generic version of a subprogram (or package), write the subprogram using a few generically-named types. Then precede the subprogram or package with the keyword "generic" and a list of the information you'd like to make generic. This list is called the generic formal parameters; it is like the list of parameters in a procedure declaration.

To use a generic subprogram (or package), we have to create a real subprogram (or package) from the generic version. This process is called instantiating, and the result is called an instantiation or instance. For example, here's how to create three Swap procedure instances from the generic one:

  procedure Swap is new Generic_Swap(Integer);
  procedure Swap is new Generic_Swap(Float);
  procedure Swap is new Generic_Swap(Character);

Note that when you instantiate a generic, you "pass" types in the same way that you pass parameters to an ordinary procedure call.
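
Since the notes compare Ada generics with C++ templates, here is a C++ counterpart of Generic_Swap for comparison. The name generic_swap is chosen for this sketch. One difference worth noting: C++ usually instantiates a template implicitly at the point of use, so the explicit instantiations shown are optional and are included only to mirror the three Ada instantiations.

```cpp
#include <cassert>

// Counterpart of the generic formal parameter "type Element_Type is private".
template <typename ElementType>
void generic_swap(ElementType& left, ElementType& right) {
    ElementType temporary = left;
    left = right;
    right = temporary;
}

// Explicit instantiations, roughly "procedure Swap is new Generic_Swap(Integer);"
template void generic_swap<int>(int&, int&);
template void generic_swap<float>(float&, float&);
template void generic_swap<char>(char&, char&);

// Usage sketch: the element type is deduced from the arguments.
void swap_example() {
    int a = 1, b = 2;
    generic_swap(a, b);
    assert(a == 2 && b == 1);

    char c = 'x', d = 'y';
    generic_swap(c, d);
    assert(c == 'y' && d == 'x');
}
```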