CSCI 431 Lecture Notes - Abstraction
Abstraction
- Abstraction is the representation of an entity that includes
only the attributes of significance in a particular context.
- The purpose of abstraction is to simplify the problem solving and
programming process.
- There are two fundamental kinds of abstraction in contemporary
programming: process abstraction and data
abstraction.
- Historically, data abstraction followed process abstraction; together
they give the programmer the ability to create new data types and
operations on those types.
Process abstraction
- Subprograms are process abstractions
- A subprogram is an abstract operation defined by the programmer
- Subprograms provide a way to specify a computational process while
hiding the details of how it is done
- They are the basic building blocks out of which most programs are
constructed
Encapsulation
- Dividing a program into groups of logically related subprograms and
data makes a large program more understandable; such groups are
called modules
- Each module performs a limited set of operations on a limited
amount of data
- When modules are designed so that they can be compiled separately,
efficiency improves because only the modules that change need to be
recompiled
- An encapsulation is a grouping of subprograms and the
data that they manipulate which is independently compilable
- Information hiding is enforced by encapsulation
- The idea behind information hiding is:
- As much information as possible is hidden from the user
- The user is not permitted to directly manipulate the hidden
information
- Encapsulation is important in permitting easy modification of a
program
- Another important advantage of encapsulation is that it
facilitates code reuse
- Examples of encapsulation mechanisms (although not always
complete encapsulation mechanisms):
- The COMMON block in Fortran
- Nested blocks in Algol-60 like languages
- Ada packages
- C++ classes
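The information-hiding idea above can be sketched in C, where a source file acts as the module and `static` data is invisible to other compilation units. This is only an illustration: the module name `counter.c` and every identifier in it are hypothetical and do not come from the lecture.

```c
/* counter.c: a hypothetical module; all names are illustrative. */
#include <stdbool.h>

/* Hidden data: 'static' gives these internal linkage, so code in other
   compilation units cannot name or modify them directly. */
static int count = 0;
static const int limit = 3;

/* Public operations: the only way to manipulate the hidden data. */
bool counter_increment(void)
{
    if (count >= limit)
        return false;   /* refuse to go past the limit */
    count++;
    return true;
}

int counter_value(void)
{
    return count;
}

void counter_reset(void)
{
    count = 0;
}
```

A client can call `counter_increment()` and `counter_value()`, but it cannot assign to `count`. The representation could therefore be changed later without touching client code.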
Data abstraction
- In some programming languages subprograms are not compilation
units
- Often subprograms do not permit complete encapsulation of the data
- For these reasons they are not adequate encapsulation structures
- Newer languages provide better facilities for specifying and
implementing entire Abstract Data Types (ADTs)
- An ADT is defined as having
- a set of data objects
- a set of abstract operations on those data objects
- encapsulation so that the implementation is hidden from
the user
- Examples: packages in Ada, classes in C++ and
Java
An Example
Consider implementing a stack as an ADT
- Formulate the abstraction:
- Properties of the stack data type
- a container that holds data elements of a particular type
- it has a top element but no notion of a bottom (the bottom
cannot be accessed)
- it can be empty or full
- Operations
- create a new stack
- push an element on the stack
- pop an element from the stack
- is empty
- is full
Implementation in Ada
- Encapsulation using packages
- specification package
- body package
- Information hiding using private types and limited
private types
- Are the implementation details hidden?
- Can stacks be used without knowledge of the
implementation?
The specification package:
package STACKPACK is
   -- The visible entities, or the public interface
   type STACKTYPE is limited private;
   MAX_SIZE : constant := 100;
   function EMPTY (STK : in STACKTYPE) return BOOLEAN;
   procedure PUSH (STK : in out STACKTYPE; ELEMENT : in INTEGER);
   procedure POP (STK : in out STACKTYPE);
   function TOP (STK : in STACKTYPE) return INTEGER;
   -- The part that is hidden
private
   type LIST_TYPE is array (1..MAX_SIZE) of INTEGER;
   type STACKTYPE is
      record
         LIST : LIST_TYPE;
         TOPSUB : INTEGER range 0..MAX_SIZE := 0;
      end record;
end STACKPACK;
The body package:
with TEXT_IO; use TEXT_IO;
package body STACKPACK is
   function EMPTY (STK : in STACKTYPE) return BOOLEAN is
   begin
      return STK.TOPSUB = 0;
   end EMPTY;
   procedure PUSH (STK : in out STACKTYPE; ELEMENT : in INTEGER) is
   begin
      if STK.TOPSUB >= MAX_SIZE then
         PUT_LINE("Error - Stack Overflow");
      else
         STK.TOPSUB := STK.TOPSUB + 1;
         STK.LIST(STK.TOPSUB) := ELEMENT;
      end if;
   end PUSH;
   procedure POP (STK : in out STACKTYPE) is
   begin
      if STK.TOPSUB = 0 then
         PUT_LINE("Error - Stack Underflow");
      else
         STK.TOPSUB := STK.TOPSUB - 1;
      end if;
   end POP;
   function TOP (STK : in STACKTYPE) return INTEGER is
   begin
      if STK.TOPSUB = 0 then
         PUT_LINE("Error - Stack is Empty");
         -- No value is returned on this path, so Ada raises PROGRAM_ERROR
      else
         return STK.LIST(STK.TOPSUB);
      end if;
   end TOP;
end STACKPACK;
A driver program
with STACKPACK, TEXT_IO;
use STACKPACK, TEXT_IO;
procedure USE_STACKS is
   TOPONE : INTEGER;
   STACK : STACKTYPE; -- Creates a new STACKTYPE object
begin
   PUSH(STACK, 42);
   PUSH(STACK, 17);
   TOPONE := TOP(STACK);
   POP(STACK);
   ...
end USE_STACKS;
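For comparison, here is a rough C sketch of the same stack ADT. C has no packages or private types, so this sketch assumes the common "opaque pointer" idiom: the interface exposes only an incomplete type, and the record layout is visible only to the implementation. The function names (`stack_create`, `stack_push`, and so on) and the boolean status results are choices made for this sketch, not part of the Ada example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Interface (would normally live in a header such as stack.h) ---- */
#define MAX_SIZE 100
typedef struct stack Stack;     /* incomplete type: the layout is hidden */

Stack *stack_create(void);
void   stack_destroy(Stack *stk);
bool   stack_empty(const Stack *stk);
bool   stack_full(const Stack *stk);
bool   stack_push(Stack *stk, int element);
bool   stack_pop(Stack *stk);
bool   stack_top(const Stack *stk, int *out);

/* ---- Implementation (would normally live in stack.c) ---- */
struct stack {
    int list[MAX_SIZE];
    int topsub;                 /* number of elements currently stored */
};

Stack *stack_create(void)
{
    Stack *stk = malloc(sizeof *stk);
    if (stk != NULL)
        stk->topsub = 0;
    return stk;                 /* NULL if allocation failed */
}

void stack_destroy(Stack *stk) { free(stk); }

bool stack_empty(const Stack *stk) { return stk->topsub == 0; }

bool stack_full(const Stack *stk) { return stk->topsub >= MAX_SIZE; }

bool stack_push(Stack *stk, int element)
{
    if (stack_full(stk)) {
        fprintf(stderr, "Error - Stack Overflow\n");
        return false;
    }
    stk->list[stk->topsub++] = element;
    return true;
}

bool stack_pop(Stack *stk)
{
    if (stack_empty(stk)) {
        fprintf(stderr, "Error - Stack Underflow\n");
        return false;
    }
    stk->topsub--;
    return true;
}

/* Stores the top element in *out; returns false if the stack is empty. */
bool stack_top(const Stack *stk, int *out)
{
    if (stack_empty(stk)) {
        fprintf(stderr, "Error - Stack is Empty\n");
        return false;
    }
    *out = stk->list[stk->topsub - 1];
    return true;
}
```

A client that sees only the interface can create, push, pop and inspect stacks, but it cannot reach `list` or `topsub`. That is roughly the guarantee the limited private type gives in the Ada version.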
Procedural abstraction
- Two components: specification and implementation
- A subprogram represents a mathematical function that maps each
particular set of arguments into a particular set of results.
- If a subprogram returns a single data object as a result it is
typically called a function
- A formal specification of a procedure can be formulated in terms of
an abstract model. We will discuss the specification of a procedure
using axiomatic semantics later.
- The specification of a procedure in a programming language
includes the following:
- the name of the procedure
- the signature (also called the prototype) giving the
number of parameters, their order and data types, and the number of
results, their order and data types
- the actions performed by the subprogram
- In addition, some languages include a keyword in
the declaration such as procedure or function
- Examples:
- In Pascal:
function Fn(X: real; Y:integer): real;
- In C:
void Sub(float X, int Y, float *Z, int *W);
- In Ada:
procedure Sub(X: in real; Y: in integer; Z: in out real; W: out boolean);
Parameters and parameter transmission
- Parameters offer a way of sharing data; they are an alternative to
the use of non-local environments
- The term actual-parameter refers to the calling
argument and formal-parameter refers to the local data
object belonging to the subprogram
- Correspondence between actual parameters (calling arguments) and
formal parameters can be established in either of two ways:
- based on position
- based on name as in Ada:
Sub(Y => B, X => 27);
- Most languages use positional correspondence exclusively
Methods for transmitting parameters
- The most common methods for parameter transmission are:
- call by value
- call by reference
- call by name
- call by value-result
- call by result
- call by constant value
Call by value
- The actual parameter is copied into the location identified with
the name of the formal parameter
- Any changes made to the formal parameter during the execution of
the subprogram are lost when the subprogram terminates.
- Example in Ada:
procedure p(x:in integer);
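C passes all arguments by value, so the behavior described above can be shown directly. The function names below are hypothetical and serve only as an illustration.

```c
/* The formal parameter x is a copy of the caller's argument. */
void increment_copy(int x)
{
    x = x + 1;      /* changes only the local copy */
}

/* Hypothetical caller: the actual parameter a is unaffected by the call. */
int call_by_value_demo(void)
{
    int a = 5;
    increment_copy(a);
    return a;       /* a still holds its original value */
}
```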
Call by reference
- Perhaps the most common parameter transmission method
- A pointer to the location of the data object is made
available to the subprogram. (The data object does not change positions
in memory, rather the formal parameter refers to the same memory
location as the actual parameter.)
- Changes made to the formal parameter affect the actual parameter
as well
- Example in Pascal:
procedure p(var x: integer);
- Example in C, where the caller passes a pointer explicitly:
void p(int *x);
or, in C++, using a reference parameter:
void p(int &x);
- Call by reference creates aliases which can lead to
problems
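A small C sketch of both points: changes made through the formal parameter reach the actual parameter, and aliasing can produce surprising results. C simulates call by reference by passing pointers. The function `add_twice` is a made-up example chosen to expose the aliasing problem.

```c
/* Changes made through x and y are visible to the caller. */
void swap(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Intended effect: add the original value of *n to *total twice. */
void add_twice(int *total, const int *n)
{
    *total += *n;
    *total += *n;   /* if total and n are aliases, *n has already changed */
}
```

Calling `add_twice(&t, &n)` with distinct variables adds `n` twice, as intended. Calling `add_twice(&v, &v)` quadruples `v`, because the first addition changes the value that the second one reads.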
Call by name
Call by result
- This is used to transmit only the result back from a subprogram
- The initial value of the actual parameter makes no difference and
can't be used by the subprogram
- The formal parameter is a local variable with no initial value
- When the subprogram terminates the final value of the formal
parameter is assigned as the new value of the actual parameter
- Example in Ada:
procedure p(x:out integer);
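C has no call-by-result mode, but the mechanism can be simulated by hand, which makes the copy-out step explicit. In this sketch the pointer parameter stands in for the compiler's knowledge of where the actual parameter lives; the name `get_answer` is hypothetical.

```c
/* Simulated call by result. */
void get_answer(int *result_actual)
{
    int x;                /* the "formal parameter": local, no initial value */

    x = 6 * 7;            /* the subprogram computes its result locally */

    *result_actual = x;   /* copy-out when the subprogram terminates */
}
```

Whatever value the caller's variable held before the call is never read; it is simply overwritten on return.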
Call by value-result
- Call by value-result is in effect a combination of call-by-value
and call-by-result.
- The value of the actual parameter is used to initialize the formal
parameter which acts like a local variable.
- When the subprogram terminates the value of the formal parameter is
copied to the actual parameter.
- Sometimes called "pass-by-copy"
- Example in Ada:
procedure p(x:in out integer);
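The difference between call by reference and call by value-result shows up only when actual parameters are aliased. The C sketch below simulates value-result with explicit copy-in and copy-out. The function names and the left-to-right copy-out order are assumptions of the sketch; a language definition may specify a different order or leave it unspecified.

```c
/* Call by reference: both formals refer to the caller's storage. */
void bump_ref(int *x, int *y)
{
    *x = *x + 1;
    *y = *y + 10;
}

/* Simulated call by value-result: copy in, work on locals, copy out. */
void bump_value_result(int *x_actual, int *y_actual)
{
    int x = *x_actual;   /* copy-in */
    int y = *y_actual;

    x = x + 1;
    y = y + 10;

    *x_actual = x;       /* copy-out, left to right (an assumption) */
    *y_actual = y;
}
```

With two distinct variables both versions give the same results. If the same variable `a`, initially 0, is passed for both parameters, `bump_ref` leaves 11 in `a`, while `bump_value_result` leaves 10 because the second copy-out overwrites the first.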
Call by constant value
- In this case, no change in the formal parameter is allowed during
execution of the subprogram; the formal parameter acts as a local
constant
- The actual parameter establishes the starting value of the formal
parameter
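In C a similar effect can be approximated by declaring a by-value parameter `const`. The function `scale` below is a hypothetical example.

```c
/* 'factor' is initialized from the actual parameter and then acts as a
   local constant inside the function. */
int scale(const int factor, int value)
{
    /* factor = 2;    <- a conforming compiler rejects this assignment */
    return factor * value;
}
```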
Question:
What happens when you pass an expression by reference,
such as in the call sub(&(a+b), &b); ?
Subprograms as parameters
- In many languages, a subprogram may be transmitted as an actual
parameter
- Example in Pascal:
procedure sub(x: integer; function R(y,z: integer): integer);
- The procedure sub may be called with a function name as its second
argument, e.g.: sub(27, fun1);
- Within sub the function fun1 may be invoked using the formal
parameter:
R(2,4);
- There are two major problems associated with subprogram
parameters:
- static type checking
- free variables (variables with no local binding)
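A rough C analogue of the Pascal example uses a function pointer as the formal parameter. As an assumption of this sketch, `sub` returns a value so that the effect of the call can be observed, and `fun1` and `fun2` are given arbitrary bodies.

```c
/* Two functions whose signatures match the formal parameter R. */
int fun1(int y, int z) { return y + z; }
int fun2(int y, int z) { return y * z; }

/* The formal parameter R is a pointer to a function taking two ints and
   returning an int. Because the full signature is declared, the compiler
   can statically check both the call R(2, 4) and each argument passed
   for R. */
int sub(int x, int (*R)(int y, int z))
{
    return x + R(2, 4);
}
```

Here `sub(27, fun1)` invokes `fun1(2, 4)` through `R`, and `sub(27, fun2)` invokes `fun2(2, 4)`.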
Free variables in subprograms passed as parameters
- Suppose that a subprogram fun that contains a nonlocal
reference is passed as a parameter from a calling program P to a called
program Q.
- What environment should be used to establish the value of the
non-local variable when fun is called?
- The rule: A non-local reference should mean the same thing during
execution of the subprogram passed as a parameter as it would if
the subprogram were invoked at the point where it appears as an actual
parameter
- To implement this rule the static pointer for the subprogram is
part of the information transmitted with a subprogram parameter
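Standard C has no nested subprograms, so the rule above cannot be shown directly. The idea of transmitting an environment pointer together with the code can be simulated, though. In this sketch the struct `closure` and the explicit `env` argument are hypothetical stand-ins for what a compiler for a block-structured language would generate.

```c
/* Environment of P: holds the variable that fun refers to non-locally. */
struct p_env {
    int scale;
};

/* A subprogram parameter: a code pointer plus a pointer to the
   environment in which the subprogram was passed. The env field plays
   the role of the static pointer. */
struct closure {
    int (*code)(const struct p_env *env, int arg);
    const struct p_env *env;
};

/* fun: its reference to scale is non-local and is resolved through env. */
static int fun(const struct p_env *env, int arg)
{
    return env->scale * arg;
}

/* Q has its own variable named scale; fun must not see this one. */
int Q(struct closure f)
{
    int scale = 1000;           /* unrelated local with the same name */
    (void)scale;
    return f.code(f.env, 5);    /* fun runs in P's environment */
}

/* P creates the environment and passes fun together with it. */
int P(void)
{
    struct p_env env = { 3 };
    struct closure f = { fun, &env };
    return Q(f);
}
```

When `Q` calls `fun` through the closure, the reference to `scale` resolves to P's value 3, not to Q's local 1000, which is the behavior the rule requires.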
Implementation Issues
- Remember that with call-by-value, values are passed to the
subprogram, and, with call-by-reference, addresses are passed to the
subprogram.
- In general, parameter transmission is handled using the run-time
stack as discussed during the last class.
- While most modern architectures use a run-time stack similar to
that discussed in class, there are variations, particularly in RISC
architectures.
- Below we will look at the details of three current architectures:
- An Intel PC running Windows 95
- DEC alpha workstation running Digital UNIX
- Sun SPARC workstation running Solaris
Example program
#include <stdio.h>
int avg8(int a, int b, int c, int d,
         int e, int f, int g, int h)
{
    return((a+b+c+d+e+f+g+h+4)/8) ;
}
int avg3(int a, int b, int c)
{
    return((a+b+c+2)/3) ;
}
int main(void)
{
    int i, v[8] ;
    for (i=0; i < 8; ++i)
        scanf("%d", &v[i]) ;
    printf("%d %d\n",
           avg8(v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7]),
           avg3(v[0], v[1], v[2])) ;
    return 0 ;
}
Calling conventions in Windows 95 on an Intel PC
- The Intel architecture has a small register set, so the
calling convention is a classical stack-based approach.
Arguments are pushed onto the stack in right-to-left
order, so the first argument ends up on top of the stack.
- In the call of avg3, the variables v[0], v[1], v[2] are moved
into registers (with the mov instruction) and are then
pushed onto the stack by the calling procedure.
Notice the order of placement -- v[2], v[1], and finally v[0].
mov eax, DWORD PTR _v$[ebp+8] ;; load v[2]
push eax
mov ecx, DWORD PTR _v$[ebp+4] ;; load v[1]
push ecx
mov edx, DWORD PTR _v$[ebp] ;; load v[0]
push edx
call _avg3
- In the called procedure, parameters are read from the
stack and loaded into registers as needed. The
called procedure returns four-byte results in register EAX.
- On return, the calling procedure is responsible for
popping the stack.
- Microsoft's Visual C++ compiler also supports a
"fastcall" convention in which up to two parameters
are passed in registers.
Calling conventions in Digital Unix on Alpha
- The Alpha chip has a large register set.
- Six registers (16 to 21) are used to pass the first
six arguments to the routine. In those rare
instances in which more than six arguments are
passed, the additional arguments are stored
on the stack. Values are returned from
the call in register 0.
- The code for calling avg8 looks something like:
ldl $16, 48($sp) ;; load v[0]
ldl $17, 52($sp) ;; load v[1]
ldl $18, 56($sp) ;; load v[2]
ldl $19, 60($sp) ;; load v[3]
ldl $20, 64($sp) ;; load v[4]
ldl $21, 68($sp) ;; load v[5]
ldl $7, 72($sp) ;; load v[6]
stq $7, ($sp) ;; store on stack
ldl $8, 76($sp) ;; load v[7]
stq $8, 8($sp) ;; store on stack
Calling conventions in Solaris on Sun SPARC
- Like the Alpha, the RISC-based SPARC passes up to
six parameters in registers and places other
parameters on the stack. But, one unusual feature
of the SPARC architecture is its register windows.
- At any one time a procedure on the SPARC sees 32 registers: 8 global
registers and 24 registers in the current register window.
These 24 are organized in three groups of eight.
- Group 1 -- 8 to 15 -- out registers
- Group 2 -- 16 to 23 -- local registers
- Group 3 -- 24 to 31 -- in registers
- The calling procedure loads its first six parameters into the
"out" register set. [Incidentally, the two other "out" registers are
used for the stack pointer and return address.]
- During the process of making the call, the register window is
moved. The "out" registers of the calling procedure become the "in"
registers of the called procedure. The called procedure also gets a
new set of "local" registers and "out" registers (which it uses in its
own calls).
- On procedure return, the window move is reversed and the caller
has its old set of registers -- except that the first "out" register
will contain the value returned by the called procedure.
- Thus a call will be preceded by code storing into registers
%o0, %o1, etc. and procedures generally start with code retrieving
values from %i0, %i1, etc.
ld [%fp-36],%l0 ;; load v[0]
ld [%fp-32],%l1 ;; load v[1]
ld [%fp-28],%l2 ;; load v[2]
ld [%fp-24],%l3 ;; load v[3]
ld [%fp-20],%l5 ;; load v[4]
ld [%fp-16],%l6 ;; load v[5]
ld [%fp-12],%l7 ;; load v[6]
ld [%fp-8],%l4 ;; load v[7]
mov %l0,%o0 ;; move local register to out register
mov %l1,%o1 ;; move local register to out register
mov %l2,%o2 ;; move local register to out register
mov %l3,%o3 ;; move local register to out register
mov %l5,%o4 ;; move local register to out register
mov %l6,%o5 ;; move local register to out register
st %l7,[%sp+92] ;; store on stack
st %l4,[%sp+96] ;; store on stack
call avg8