Lecture Notes 3 – Data abstraction

CSS 501 – Data Structures and Object-Oriented Programming

Reading for this lecture: Carrano, Chapter 1

To be covered in this lecture:

Data abstraction
Templates

Abstraction and information hiding

In using a problem solving approach to software engineering, the solution to a problem usually consists of both an algorithm and a collection of data. The algorithm is the step-by-step procedure used to solve the problem. The algorithm operates on the collection data, which we will discuss shortly.

In designing a modular solution (algorithm) to your computing problems, each module should be independent of the method that the other modules use to solve their part of the problem. You should be able to change the code in a module without affecting any of the other modules, as long as the interface remains the same. This isolates the parts of the problem from each other, so that you only need to worry about a small part of the overall problem at any time. The separation of the purpose of a module from its implementation is called functional abstraction. This principle is useful for writing individual functions, even if they are only a small part of a module. If you know the input and output, pre-conditions and post-conditions of the function, then you don’t need to know how it works to use it as part of a specific module or larger piece of software. This means that you need to put thought into the interface to each module (and each function) prior to implementing it, so that you don’t later discover that your interface is insufficient. If you find you need to change the interface, then any other module that interacts with the one you are changing must be fixed.

Similar to functional abstraction, we will use data abstraction. An abstract data type or ADT is a collection of data and a set of operations on the data (for example, a list of records) that does not specify how the data is stored or how the operations accomplish their functions. An ADT is implemented within some programming language (for example, a C++ class), but the implementation is not part of the ADT. The ADT can be used in software design without knowing how it will be implemented. ADTs and algorithms are the abstract tools that you use for designing software, while classes, data structures and other code represent the implementation of the ADTs and algorithms. Examples of abstract data types are lists, stacks, queues, and trees and we will be discussing these in some detail later in the class.

The use of abstraction means that you must write specifications for modules and abstract data types that describe the outside view (often called the public view). These specifications describe how other modules can interact with you, but don’t say anything about the implementation. Abstraction allows you to remove the details from public view. This is called information hiding. This principle says that you should not only keep the details private with a module, but also prevent other modules from affecting any of these hidden details. Information hiding is important to use in conjunction with abstraction. If other modules have access to the details of your module, they can become dependent on them or modify the data. This eliminates the benefits of abstraction, since the details of your module can no longer be changed without causing necessary changes in the other modules that use these details.

As mentioned above, abstract data types encapsulate both data and operations on the data in a way that provides a public interface, but are independent of any particular implementation. (ADTs are defined during software design. They are later implemented using data structures.) The interface is given in the specifications and can be thought of as a kind of contract that says: If you use the ADT according to the specified interface, it will perform the operations given in the specifications. Note that the code itself should contain the specification of this interface in terms of the preconditions and postconditions for each function. Abstraction of this sort (both data abstraction and functional abstraction) is the key to modularity, since it allows various modules, ADTs, and other functions to be written independently, without having to worry about the interaction with other design components. Note that an ADT is a like class of objects. There can be many different instances of the type. Each instance shares some properties, such as types of data and operations on the data, but they each represent different data.

C++ concepts

See the C++ notes linked on the course web site:

http://courses.washington.edu/css342/zander/css332/

In particular, the following pages:

The Rational class -- a better version -- header file

· const in parameter, const as a function

· return reference for += operator vs return a new object for + operator

· "friends" with iostream,

o What does "friends" mean?

o notice the syntax

o how does java solve this problem?

The Rational class -- a better version -- .cpp

· The "this" _pointer_

· Definition of ">>" operators, notice you are defining functions for someone else!

Using Rational objects -- better version -- sample driver

· Creating copies from existing ones

· Type casting, auto attempts by the compiler

· "-=" the declared operator that is never implemented. Notice this is ok, as long as you don't attempt to call the function!

Templates

In many cases, we would like to write a function so that it would apply to many different classes of objects. Examples include min(a, b) , swap(a, b) or sort(array, n). Here is a simple example of min:

template <typename Comparable>

Comparable min(Comparable a, Comparable b) {

if (a < b) return a;

else return b;

}

template <typename Comparable>

const Comparable &min(const Comparable &a, const Comparable &b) {

if (a < b) return a;

else return b;

}

See the Complex number class on the course website for a template class example:

Complex.h

Complex.cpp

ComplexDriver.cpp

Note that templates are difficult to handle for compilers, since they are compiled on demand. The declaration and definition need to be compiled at the same time. This does not mean that they need to be in the same file. Using a #include to include the implementation file at the end of the header file is one option.