CSE 428: Lecture 10
Abstract Data Types
All modern high-level languages provide mechanisms for data abstraction,
i.e. mechanisms for enriching the language with
new types and operations on them.
"Abstraction" here means that it is possible to define
new types and operations in such a way that the user
(a programmer using the new types) might use them
just as he would use the other (primitive) data types of
the language.
It is beneficial to separate the concepts
of "specification" and
"implementation" of an ADT.
- The specification is the abstract description
of the type and the behaviour of the operations.
It should be implementation-independent
and even language-independent. It should give the user
all the information necessary to use the ADT, but no more.
- The implementation, on the other hand, is the
concrete representation of the elements of the
new type in terms of existing types, and the definition
of the operations as functions or procedures.
The implementation
should of course satisfy the specification, i.e. the behaviour
of the operations, their types, etc.
should be those prescribed by the specification.
The user should use the ADT only in the ways allowed
by the specification.
In this way, even if the implementation changes,
his programs would not need to be modified.
Furthermore, he can use the properties of
the specification (if provided) to reason about
the correctness of the programs.
Some languages, like Standard Pascal and C, allow
the definition of new types, but do not provide any mechanism
for data protection. Namely, there is no way to "shelter" the implementation,
i.e. to prevent the user from manipulating the elements of the ADT via the
operations allowed on the underlying representation. Such direct access, of course, violates
the principle of the ADT and nullifies the advantages mentioned above.
More modern languages, like Modula 2 and C++,
have introduced mechanisms for hiding the implementation
and making it externally inaccessible
(see the book by Sethi, Ch. 6).
We illustrate the concept of ADT by showing the specification and
implementation of the type "simple list of integers"
in a Pascal-like language.
Specification
In an abstract sense, a list is a sequence of "nodes",
where each node contains a piece of information (an integer in this case).
"Simple" here means that the only operations of the ADT are:
- the creation of the empty list, i.e. the list with 0 nodes (emptylist),
- the test whether a list is empty (is_empty),
- the addition of a new node in front of a list (cons),
- the access to the information in the first node (head),
- the list obtained by removing the first node (tail).
We now specify more precisely the type of each operation
(interface). We use the
following notation:
- f: () -> T means that f has 0 arguments
and result of type T.
- f: (T1) -> T2 means that f has 1 argument of type T1 and
result of type T2.
- f: (T1,T2) -> T means that f has 2 arguments, of type T1 and T2
respectively, and result of type T.
- ... etc.
Having in mind this notation, we define the type of the operations on
simple lists as follows:
- emptylist: () -> list
- is_empty: (list) -> boolean
- cons: (integer,list) -> list
- head: (list) -> integer
- tail: (list) -> list
The above is the "abstract definition" of simple lists,
i.e. the specification of the ADT.
Note that we don't say anything about the implementation here.
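As an illustration, the same interface can be written down as signatures only. The following is a minimal sketch in Python rather than in the Pascal-like notation of these notes; the names SimpleListOps and L are assumptions introduced here, and the type variable L stands for the abstract type "list", whose representation is deliberately left open.

```python
from typing import Protocol, TypeVar

# L stands for the abstract type "list"; nothing is said about how it is represented.
L = TypeVar("L")


class SimpleListOps(Protocol[L]):
    """Signatures only: one method per operation of the interface."""

    def emptylist(self) -> L: ...

    def is_empty(self, lst: L) -> bool: ...

    def cons(self, x: int, lst: L) -> L: ...

    def head(self, lst: L) -> int: ...

    def tail(self, lst: L) -> L: ...
```

Any object offering these five operations with these types fits the interface, whatever concrete representation it uses.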
We might add to this specification the property that
lists are non-circular structures, i.e. should contain
no loop.
Actually, real specifications of ADTs should
give the specification of the algebra,
i.e. the semantics of the operations, in a more detailed and formal way
than our abstract description.
Furthermore, a good specification should include
any property that might be relevant for the programmer.
However, we won't go into detail about formal methods for
ADT specifications, since that is beyond the scope
of this course.
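To give an idea of what a specification of the algebra might look like, here are the equations usually given for lists. They are not part of these notes and are only a sketch: x ranges over integers, L over lists, and leaving head and tail of the empty list unspecified is an assumption made here.

```latex
\begin{align*}
\mathit{is\_empty}(\mathit{emptylist}) &= \mathit{true} \\
\mathit{is\_empty}(\mathit{cons}(x, L)) &= \mathit{false} \\
\mathit{head}(\mathit{cons}(x, L)) &= x \\
\mathit{tail}(\mathit{cons}(x, L)) &= L
\end{align*}
```

Equations of this kind are what allows the user to reason about programs without looking at the implementation.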
Implementation
As mentioned above,
the implementation of an ADT consists of two parts:
- concrete representation of the elements of the new type
in terms of existing types,
- definition of the operations as functions or procedures.
Representation of lists
One possible approach to the concrete representation of lists
is by using records and pointers.
In this approach, we need
to maintain in a node not only the information,
but also the pointer to the next node in the sequence.
We will call the concrete counterpart of nodes "elements"
to avoid confusion.
Thus an element will be a record, with a field "info" of type
integer, and a field "next" of type pointer to element.
A list will then be just a pointer to the first element of the sequence.
The definition of the type list, in a Pascal-like language, would
then be:
type list = ^element;
     element = record
                 info : integer;
                 next : list
               end;
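For comparison, the same representation can be sketched in Python, under the assumption that an object reference plays the role of the pointer and None plays the role of nil. The names Element, IntList and example are illustrative and do not come from the notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    info: int                    # the integer stored in this element
    next: Optional["Element"]    # the following element, or None at the end


# A list is a reference to its first element; None stands in for nil.
IntList = Optional[Element]

# The sequence 1, 2 built directly from records.
example: IntList = Element(1, Element(2, None))
```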
Definition of the operations
function emptylist : list;
begin
  emptylist := nil
end;

function is_empty(L:list): boolean;
begin
  is_empty := (L = nil)
end;

function cons(x:integer; L:list): list;
var aux : list;
begin
  new(aux);           { allocate a fresh element }
  aux^.info := x;
  aux^.next := L;     { L itself is shared, not copied }
  cons := aux
end;

function head(L:list): integer;
begin
  if L = nil
  then head := 0      { error: head of the empty list }
  else head := L^.info
end;

function tail(L:list): list;
begin
  if L = nil
  then tail := nil    { error: tail of the empty list }
  else begin
    tail := L^.next;
    { We might want to eliminate the following dispose(L), because it }
    { causes side effects on other lists sharing L: they are left     }
    { with a dangling pointer.                                        }
    dispose(L)
  end
end;
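The five operations can also be sketched in Python. This is a transliteration under stated assumptions, not the implementation of the notes: None plays the role of nil, the error cases raise ValueError instead of returning a placeholder value, and there is no counterpart of dispose because Python reclaims unused elements automatically. The names Element, IntList and xs are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    info: int
    next: Optional["Element"]


IntList = Optional[Element]  # None plays the role of nil


def emptylist() -> IntList:
    return None


def is_empty(lst: IntList) -> bool:
    return lst is None


def cons(x: int, lst: IntList) -> IntList:
    # Allocate a new element in front of lst; lst itself is shared, not copied.
    return Element(x, lst)


def head(lst: IntList) -> int:
    if lst is None:
        raise ValueError("head of the empty list")  # assumption: signal the error
    return lst.info


def tail(lst: IntList) -> IntList:
    if lst is None:
        raise ValueError("tail of the empty list")  # assumption: signal the error
    return lst.next  # no dispose: lists sharing this element stay valid


xs = cons(1, cons(2, emptylist()))
```

Since tail does not free anything in this sketch, the sharing problem mentioned in the comments above does not arise.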
Note that the property that
lists are non-circular structures
is satisfied by this implementation:
it is not possible to create
circular lists by using only
the functions above.
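The reason is that cons always allocates a fresh element and never modifies the next field of an existing one. The following Python sketch illustrates the point under the assumption that None plays the role of nil: a list built with cons alone has no loop, while a direct assignment to a field, which bypasses the interface, can create one. The helper has_cycle is hypothetical and is not an operation of the ADT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    info: int
    next: Optional["Element"]


def cons(x: int, lst: Optional[Element]) -> Optional[Element]:
    # A fresh element is created; existing elements are never modified.
    return Element(x, lst)


def has_cycle(lst: Optional[Element]) -> bool:
    """Hypothetical helper: detect a loop with a slow and a fast reference."""
    slow = fast = lst
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


built = cons(1, cons(2, None))   # uses only the interface
broken = cons(1, cons(2, None))
broken.next.next = broken        # bypasses the interface and closes a loop
```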
Using the ADT "simple list"
A correct use of the ADT should use only the
operations of the interface, i.e. emptylist, is_empty, cons, head and tail.
For instance, consider below two possible definitions of the append
function: the first respects this principle, the second doesn't.
Of course, if the language offered a mechanism to protect the
ADT, the second could not even
be written.
"Good" append
function append(L1,L2:list): list;
begin
  if is_empty(L1)
  then append := L2
  else append := cons(head(L1),append(tail(L1),L2))
end;
"Bad" append
function append(L1,L2:list): list;
begin
  if L1 = nil
  then append := L2
  else append := cons(L1^.info,append(L1^.next, L2))
end;
Note that the second definition is not implementation-independent:
if we change the implementation of lists, then the second definition
will not be valid anymore. The first one, on the contrary, will
still be valid, provided of course
that the new implementation respects the specification.
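The point can be made concrete with a small Python sketch, under some assumptions that are not in the notes: the operations are grouped in a class passed to append as the extra parameter ops, RecordLists mirrors the record-and-pointer representation, and TupleLists is a hypothetical second implementation based on Python tuples. The helpers build and contents are also illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    info: int
    next: Optional["Element"]


class RecordLists:
    """Lists as linked records; None plays the role of nil."""

    @staticmethod
    def emptylist():
        return None

    @staticmethod
    def is_empty(lst):
        return lst is None

    @staticmethod
    def cons(x, lst):
        return Element(x, lst)

    @staticmethod
    def head(lst):
        return lst.info

    @staticmethod
    def tail(lst):
        return lst.next


class TupleLists:
    """Hypothetical second implementation: lists as Python tuples."""

    @staticmethod
    def emptylist():
        return ()

    @staticmethod
    def is_empty(lst):
        return len(lst) == 0

    @staticmethod
    def cons(x, lst):
        return (x,) + lst

    @staticmethod
    def head(lst):
        return lst[0]

    @staticmethod
    def tail(lst):
        return lst[1:]


def append(ops, l1, l2):
    """The "good" append: it touches lists only through the five operations."""
    if ops.is_empty(l1):
        return l2
    return ops.cons(ops.head(l1), append(ops, ops.tail(l1), l2))


def build(ops, values):
    """Illustrative helper: build a list from Python values using cons only."""
    lst = ops.emptylist()
    for v in reversed(values):
        lst = ops.cons(v, lst)
    return lst


def contents(ops, lst):
    """Illustrative helper: read a list out through the interface."""
    out = []
    while not ops.is_empty(lst):
        out.append(ops.head(lst))
        lst = ops.tail(lst)
    return out
```

The same append works unchanged with both implementations. A version written with lst.info and lst.next, in the style of the "bad" append, would fail on TupleLists, since tuples have no such fields.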