We began the rational-number implementation in
section 2.1.1 by implementing the rational-number
operations `add_rat`, `sub_rat`, and so on in terms of three
unspecified
functions: `make_rat`, `numer`, and `denom`.
At that point, we could think of the operations as being defined in
terms of data objects—numerators, denominators, and rational
numbers—whose behavior was specified by the latter three
functions.

But exactly what is meant by *data*? It is not enough to say
whatever is implemented by the given selectors and constructors.

Clearly, not every arbitrary set of three
functions
can serve as an
appropriate basis for the rational-number implementation. We need to
guarantee that,
if we construct a rational number `x` from a pair
of integers `n` and `d`, then extracting the `numer` and the
`denom` of `x` and dividing them should yield the same result
as dividing `n` by `d`.
In other words, `make_rat`,
`numer`, and `denom` must satisfy the condition that, for any
integer `n` and any non-zero integer `d`, if `x` is
`make_rat(n,d)`, then

In fact, this is the only condition `make_rat`, `numer`, and
`denom` must fulfill in order to form a suitable basis for a
rational-number representation. In general, we can think of data as
defined by some collection of selectors and constructors, together
with specified conditions that these
functions
must fulfill in order
to be a valid representation.[1]

This point of view can serve to define not only high-level

data
objects, such as rational numbers, but lower-level objects as well.
Consider the notion of a pair, which we used in order to define our
rational numbers. We never actually said what a pair was, only that
the language supplied
functions
`pair`, `head`, and `tail`
for operating on pairs. But the only thing we need to know about
these three operations
is that if we glue two objects together using
`pair` we can retrieve the objects using `head` and `tail`.
That is, the operations satisfy the condition that, for any objects
`x` and `y`, if `z` is `pair(x, y)` then `head(z)`
is `x` and `tail(x)` is `y`. Indeed, we mentioned that
these three
functions
are included as primitives in our language.
However, any triple of
functions
that satisfies the above condition
can be used as the basis for implementing pairs. This point is
illustrated strikingly by the fact that we could implement `pair`,
`head`, and `tail` without using any data structures at all but
only using
functions. Here are the definitions:The function
`Error` is similar to the function
`error` but takes a second argument,
which is converted to a string and then concatenated with the first argument.

function pair(x, y) { return m => m === 0 ? x : m === 1 ? y : Error("Argument not 0 or 1 in pair", m); } function head(z) { return z(0); } function tail(z) { return z(1); }

This use of functions corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a valid way to represent pairs is to verify that these functions satisfy the condition given above.

The subtle point to notice is that the value returned by `pair(x, y)` is a
function—namely the internally defined
function
`dispatch`, which takes one argument and returns either `x` or `y` depending on whether the argument is 0 or 1. Correspondingly, `head(z)` is defined to apply `z` to 0. Hence, if `z` is the
function
formed by `pair(x, y)`, then `z` applied to 0 will
yield `x`. Thus, we have shown that `head(pair(x, y))` yields
`x`, as desired. Similarly, `tail(pair(x, y))` applies the
function
returned by `pair(x, y)` to 1, which returns `y`.
Therefore, this
functional
implementation of pairs is a valid
implementation, and if we access pairs using only `pair`, `head`, and `tail` we cannot distinguish this implementation from one
that uses real

data structures.

The point of exhibiting the
functional
representation of pairs is not
that our language works this way
(we will be using arrays to represent pairs)
but that it could
work this way. The
functional
representation, although obscure, is a
perfectly adequate way to represent pairs, since it fulfills the only
conditions that pairs need to fulfill. This example also demonstrates
that the ability to manipulate
functions
as objects automatically
provides the ability to represent compound data. This may seem a
curiosity now, but
functional
representations of data will play a
central role in our programming repertoire. This style of programming
is often called
*message passing*, and we will be using it as a
basic tool in chapter 3 when we address the issues of modeling and
simulation.

function pair(x, y) { return m => m(x, y); } function head(z) { return z((p, q) => p); }

function tail(z) { return z((p,q) => q); }

function pair(a, b) { return fast_expt(2, a) * fast_expt(3, b); } function head(p) { return p % 2 === 0 ? head(p / 2) + 1 : 0; } function tail(p) { return p % 3 === 0 ? tail(p/3) + 1 : 0; }

const zero = f => x => x; function add_1(n) { return f => x => f(n(f)(x)); }

function one() { return f => x => f(x); } function two() { return f => x => f(f(x)); } function plus(n, m) { return f => x => n(f)(m(f))(x); }

[1] Surprisingly, this idea is very
difficult to formulate rigorously. There are two approaches to giving
such a formulation. One, pioneered by
C. A. R. Hoare (1972), is known
as the method of
*abstract models*. It formalizes the
Thatcher, Wagner, and Wright 1978 ), and by
Guttag at Toronto (see Guttag 1977 ),
is called
*algebraic specification*. It regards the

functions plus conditionsspecification as outlined in the rational-number example above. Note that the condition on the rational-number representation was stated in terms of facts about integers (equality and division). In general, abstract models define new kinds of data objects in terms of previously defined types of data objects. Assertions about data objects can therefore be checked by reducing them to assertions about previously defined data objects. Another approach, introduced by Zilles at MIT, by Goguen, Thatcher, Wagner, and Wright at IBM (see

functionsas elements of an abstract algebraic system whose behavior is specified by axioms that correspond to our

conditions,and uses the techniques of abstract algebra to check assertions about data objects. Both methods are surveyed in the paper by Liskov and Zilles (1975).

2.1.3
What Is Meant by Data?