We began the rational-number implementation in
section 2.1.1 by implementing the
rational-number operations
`add_rat`,
`sub_rat`,
and so on in terms of three unspecified
functions:
`make_rat`,
`numer`, and
`denom`. At that point, we could think of the
operations as being defined in terms of data objects—numerators,
denominators, and rational numbers—whose behavior was specified
by the latter three
functions.

But exactly what is meant by *data*? It is not enough to say
whatever is implemented by the given selectors and
constructors.

Clearly, not every arbitrary set of three
functions
can serve as an appropriate basis for the rational-number
implementation. We need to guarantee that,
if we construct a rational number `x` from a
pair of integers `n` and
`d`, then extracting the
`numer` and the
`denom` of `x` and
dividing them should yield the same result as dividing
`n` by `d`. In
other words,
`make_rat`,
`numer`, and
`denom` must satisfy the condition that, for
any integer `n` and any non-zero integer
`d`, if `x` is
`make_rat(n,d)`,
then
\[
\frac{\texttt{numer}(\texttt{x})}{\texttt{denom}(\texttt{x})}
=
\frac{\texttt{n}}{\texttt{d}}
\]

In fact, this is the only condition
`make_rat`,
`numer`, and
`denom` must fulfill in order to form a
suitable basis for a rational-number representation. In general, we can
think of data as defined by some collection of selectors and
constructors, together with specified conditions that these
functions
must fulfill in order to be a valid
representation.[1]

This point of view can serve to define not only
high-level

data objects, such as rational numbers, but
lower-level objects as well.
Consider the notion of a pair, which we used in order to define our
rational numbers. We never actually said what a pair was, only that
the language supplied
functions
`pair`,
`head`,
and
`tail`
for operating on pairs. But the only thing we need to know about these
three operations
is that if we glue two objects together using
`pair`
we can retrieve the objects using
`head`
and
`tail`.
That is, the operations satisfy the condition that, for any objects
`x` and `y`, if
`z` is
`pair(x, y)`
then
`head(z)`
is `x` and
`tail(x)`
is `y`. Indeed, we mentioned that these three
functions
are included as primitives in our language. However, any triple of
functions
that satisfies the above condition can be used as the basis for
implementing pairs. This point is illustrated strikingly by the fact
that we could implement
`pair`,
`head`,
and
`tail`
without using any data structures at all but only using
functions.
Here are the definitions:

function pair(x, y) { return m => m === 0 ? x : m === 1 ? y : error(m, "Argument not 0 or 1 in pair"); } function head(z) { return z(0); } function tail(z) { return z(1); }

This use of functions corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a valid way to represent pairs is to verify that these functions satisfy the condition given above.

The subtle point to notice is that the value returned by
`pair(x, y)`
is a
function—namely
the internally defined
function
`dispatch`, which takes one argument and returns
either `x` or `y`
depending on whether the argument is 0 or 1. Correspondingly,
`head(z)`
is defined to apply `z` to 0. Hence, if
`z` is the
function
formed by
`pair(x, y)`,
then `z` applied to 0 will yield
`x`. Thus, we have shown that
`head(pair(x, y))`
yields `x`, as desired. Similarly,
`tail(pair(x, y))`
applies the
function
returned by
`pair(x, y)`
to 1, which returns `y`.
Therefore, this
functional
implementation of pairs is a valid
implementation, and if we access pairs using only
`pair`,
`head`,
and
`tail`
we cannot distinguish this implementation from one that uses
real

data structures.

The point of exhibiting the
functional
representation of pairs is not that our language works this way
(we will be using arrays to represent pairs)
but that it could work this way. The
functional
representation, although obscure, is a perfectly adequate way to represent
pairs, since it fulfills the only conditions that pairs need to fulfill.
This example also demonstrates that the ability to manipulate
functions
as objects automatically provides the ability to represent compound data.
This may seem a curiosity now, but
functional
representations of data will play a central role in our programming
repertoire. This style of programming is often called
*message passing*, and we will be using it as a basic tool in
chapter 3 when we address the issues of modeling and simulation.

function pair(x, y) { return m => m(x, y); } function head(z) { return z((p, q) => p); }

function tail(z) { return z((p,q) => q); }

function pair(a, b) { return fast_expt(2, a) * fast_expt(3, b); } function head(p) { return p % 2 === 0 ? head(p / 2) + 1 : 0; } function tail(p) { return p % 3 === 0 ? tail(p/3) + 1 : 0; }

const zero = f => x => x; function add_1(n) { return f => x => f(n(f)(x)); }

const one = f => x => f(x); const two = f => x => f(f(x)); function plus(n, m) { return f => x => n(f)(m(f)(x)); } // testing const three = plus(one, two); function church_to_number(c) { return c(n => n + 1)(0); } church_to_number(three);

[1]
Surprisingly, this idea is very difficult to
formulate rigorously. There are two approaches to giving such a
formulation. One, pioneered by
C. A. R. Hoare (1972), is known as the method of
*abstract models*. It formalizes the
Thatcher, Wagner, and Wright
1978 ), and by
Guttag at Toronto (see Guttag 1977 ),
is called
*algebraic specification*. It regards the

functions plus conditionsspecification as outlined in the rational-number example above. Note that the condition on the rational-number representation was stated in terms of facts about integers (equality and division). In general, abstract models define new kinds of data objects in terms of previously defined types of data objects. Assertions about data objects can therefore be checked by reducing them to assertions about previously defined data objects. Another approach, introduced by Zilles at MIT, by Goguen, Thatcher, Wagner, and Wright at IBM (see

functionsas elements of an abstract algebraic system whose behavior is specified by axioms that correspond to our

conditions,and uses the techniques of abstract algebra to check assertions about data objects. Both methods are surveyed in the paper by Liskov and Zilles (1975).

2.1.3 What Is Meant by Data?