In section 4.4.4 we will present an
implementation of the query interpreter as a collection of
In this section we give an overview that explains the general
structure of the system independent of low-level implementation
details. After describing the implementation of the interpreter, we
will be in a position to understand some of its limitations and some
of the subtle ways in which the query language's logical operations
differ from the operations of mathematical logic.
It should be apparent that the query evaluator must perform some kind
of search in order to match queries against facts and rules in the
data base. One way to do this would be to implement the query system
as a nondeterministic program, using the amb evaluator of
section 4.3 (see
exercise 4.69). Another possibility is to manage
the search with the aid of streams. Our implementation follows this
The query system is organized around two central operations called
pattern matching and unification. We first describe
pattern matching and explain how this operation, together with the
organization of information in terms of streams of frames, enables us
to implement both simple and compound queries. We next discuss
unification, a generalization of pattern matching needed to implement
rules. Finally, we show how the entire query interpreter fits
together through a
that classifies expressions in a manner
analogous to the way
classifies expressions for the
interpreter described in section 4.1.
A pattern matcher is a program that tests whether some datum
fits a specified pattern. For example, the data list list(list("a", "b"), "c", list("a", "b")) matches the pattern list(x, "c", x) with the pattern variable
x bound to list("a", "b"). The same data list matches the pattern
list(x, y, z) with x and z both bound to list("a", "b")
and y bound to "c". It also matches the pattern list(list(x, y), "c", list(x, y)) with x bound to "a" and y bound
to "b". However, it does not match the pattern list(x, "a", y),
since that pattern specifies a list whose second element is
the string "a".
The pattern matcher used by the query system takes as inputs a
pattern, a datum, and a frame that specifies bindings for
various pattern variables. It checks whether the datum matches the
pattern in a way that is consistent with the bindings already in the
frame. If so, it returns the given frame augmented by any bindings
that may have been determined by the match. Otherwise, it indicates
that the match has failed.
For example, using the pattern list(x, y, z) to match list("a", "b", "c")
given an empty frame will return a frame specifying that x is
bound to "a" and y is bound to "b". Trying the match
with the same pattern, the same datum, and a frame specifying that
y is bound to "a" will fail. Trying the match with the
same pattern, the same datum, and a frame in which y is bound
to b and x is unbound will return the given frame
augmented by a binding of x to "a".
The pattern matcher is all the mechanism that is needed to process
simple queries that don't involve rules. For instance, to process the
job(x, list("computer", "programmer"));
we scan through all assertions in the data base and select those that
match the pattern with respect to an initially empty frame. For each
match we find, we use the frame returned by the match to instantiate
the pattern with a value for x.
Streams of frames
The testing of patterns against frames is organized through the use of
streams. Given a single frame, the matching process runs through the
data-base entries one by one. For each data-base entry, the matcher
generates either a special symbol indicating that the match has failed
or an extension to the frame. The results for all the data-base
entries are collected into a stream, which is passed through a filter
to weed out the failures. The result is a stream of all the frames
that extend the given frame via a match to some assertion in the data
In our system, a query takes an input stream of frames and performs
the above matching operation for every frame in the stream, as
indicated in Figure 4.4. That is, for each frame in
the input stream, the query generates a new stream consisting of all
extensions to that frame by matches to assertions in the data base.
All these streams are then combined to form one huge stream, which
contains all possible extensions of every frame in the input stream.
This stream is the output of the query.
To answer a simple query, we use the query with an input stream
consisting of a single empty frame. The resulting output stream
contains all extensions to the empty frame (that is, all answers to
our query). This stream of frames is then used to generate a stream
of copies of the original query pattern with the variables
instantiated by the values in each frame, and this is the stream that
is finally printed.
The real elegance of the stream-of-frames implementation is evident
when we deal with compound queries. The processing of compound
queries makes use of the ability of our matcher to demand that a match
be consistent with a specified frame. For example, to handle the and of two queries, such as
This produces a stream of frames, each of which contains a binding for
x. Then for each frame in the stream we find all entries that
in a way that is consistent with the given binding for x. Each
such match will produce a frame containing bindings for x and
person. The and of two queries can be viewed as a series
combination of the two component queries, as shown in
figure 4.5. The frames that pass through the first
query filter are filtered and further extended by the second query.
Figure 4.6 shows the analogous method for computing the
or of two queries as a parallel combination of the two component
queries. The input stream of frames is extended separately by each
query. The two resulting streams are then merged to produce the final
Even from this high-level description, it is apparent that the
processing of compound queries can be slow.
For example, since a query may produce more than one output frame for
each input frame, and each query in an and gets its input frames
from the previous query, an and query could, in the worst case,
have to perform a number of matches that is exponential in the number
of queries (see
Though systems for handling only simple queries are quite practical,
dealing with complex queries is extremely
From the stream-of-frames viewpoint, the not of some query acts
as a filter that removes all frames for which the query can be
satisfied. For instance, given the pattern
not(job(x, list("computer", "programmer")))
we attempt, for each frame in the input stream, to produce extension
frames that satisfy job(x, list("computer", "programmer")). We remove
from the input stream all frames for which such extensions exist. The
result is a stream consisting of only those frames in which the
binding for x does not satisfy job(x, list("computer", "programmer")). For example, in processing the query
the first clause will generate frames with bindings for x and
y. The not clause will then filter
these by removing all frames in which the binding for x
satisfies the restriction that x is a computer
on frame streams. We use each frame in the stream to instantiate any
from the input stream all frames for which the predicate fails.
In order to handle rules in the query language, we must be able to
find the rules whose conclusions match a given query pattern. Rule
conclusions are like assertions except that they can contain
variables, so we will need a generalization of pattern
matching—called unification—in which both the pattern
and the datum may contain variables.
A unifier takes two patterns, each containing constants and variables,
and determines whether it is possible to assign values to the
variables that will make the two patterns equal. If so, it returns a
frame containing these bindings. For example, unifying list(x, "a", y) and list(y, z, "a") will specify a frame in which x, y, and z must all be bound to "a". On the other
hand, unifying list(x, y, "a") and list(x, "b", y) will fail, because
there is no value for y that can make the two patterns equal.
(For the second elements of the patterns to be equal, y would
have to be "b"; however, for the third elements to be equal, y would have to be "a".) The unifier used in the query system,
like the pattern matcher, takes a frame as input and performs
unifications that are consistent with this frame.
The unification algorithm is the most technically difficult part of
the query system. With complex patterns, performing unification may
seem to require deduction. To unify list(x, x) and list(list("a", y, "c"), list("a", "b", z)), for example, the algorithm must infer that x should
be list("a", "b", "c"), y should be "b", and z should
be "c". We may think of this process as solving a set of
equations among the pattern components. In general, these are
simultaneous equations, which may require substantial manipulation to
solve. For example, unifying list(x, x) and list(list("a", y, "c"), list("a", "b", z)) may be thought of as specifying the
x $=$ list("a", y, "c")
x $=$ list("a", "b", z)
These equations imply that
list("a", y, "c") $=$ list("a", "b", z)
which in turn implies that
"a" $=$ "a", y $=$ "b", "c" $=$ z,
and hence that
x $=$ list("a", "b", "c")
In a successful pattern match, all pattern variables become bound, and
the values to which they are bound contain only constants. This is
also true of all the examples of unification we have seen so far. In
general, however, a successful unification may not completely
determine the variable values; some variables may remain unbound and
others may be bound to values that contain variables.
Consider the unification of list(x, "a") and list(list("b", y), z). We
can deduce that x $=$ list("b", y) and "a" $=$ z, but we cannot
further solve for x or y. The unification doesn't fail,
since it is certainly possible to make the two patterns equal by
assigning values to x and y. Since this match in no way
restricts the values y can take on, no binding for y is
put into the result frame. The match does, however, restrict the
value of x. Whatever value y has, x must be list("b", y). A binding of x to the pattern list("b", y) is thus
put into the frame. If a value for y is later determined and
added to the frame (by a pattern match or unification that is required
to be consistent with this frame), the previously bound x will
refer to this value.
Unification is the key to the component of the query system that makes
inferences from rules. To see how this is accomplished, consider
processing a query that involves applying a rule, such as
lives_near(x, list("Hacker", "Alyssa", "P"));
To process this query, we first use the ordinary pattern-match
described above to see if there are any assertions in the
data base that match this pattern. (There will not be any in this
case, since our data base includes no direct assertions about who
lives near whom.) The next step is to attempt to unify the query
pattern with the conclusion of each rule. We find that the pattern
unifies with the conclusion of the rule
resulting in a frame specifying that person_2 is bound
to list("Hacker", "Alyssa", "P") and that x should be bound to (have
the same value as) person_1. Now, relative to this frame, we
evaluate the compound query given by the body of the rule. Successful
matches will extend this frame by providing a binding for person_1, and consequently a value for x, which we can use to
instantiate the original query pattern.
In general, the query evaluator uses the following method to apply a
rule when trying to establish a query pattern in a frame that
specifies bindings for some of the pattern variables:
Unify the query with the conclusion of the rule to form, if
successful, an extension of the original frame.
Relative to the extended frame, evaluate the query formed by
the body of the rule.
Notice how similar this is to the method for applying a
function's parameters to its arguments to form a
frame that extends the original
Relative to the extended environment, evaluate the expression
formed by the body of the
The similarity between the two evaluators should come as no surprise.
rule definitions are the means of abstraction in the query language.
In each case, we unwind the abstraction by creating appropriate
bindings and evaluating the rule or
body relative to these.
We saw earlier in this section how to evaluate simple queries in the
absence of rules. Now that we have seen how to apply rules, we can
describe how to evaluate simple queries by using both rules and
Given the query pattern and a stream of frames, we produce, for each
frame in the input stream, two streams:
a stream of extended frames obtained by matching the pattern
against all assertions in the data base (using the pattern matcher),
a stream of extended frames obtained by applying all
possible rules (using the unifier).
Appending these two streams produces a stream that consists of all the
ways that the given pattern can be satisfied consistent with the
original frame. These streams (one for each frame in the input
stream) are now all combined to form one large stream, which therefore
consists of all the ways that any of the frames in the original input
stream can be extended to produce a match with the given pattern.
The query evaluator and the driver loop
Despite the complexity of the underlying matching operations, the
system is organized much like an evaluator for any language. The
that coordinates the matching operations is called
qeval, and it plays a role analogous to that of the evaluate
of frames. Its output is a stream of frames, corresponding to
successful matches to the query pattern, that extend some frame in the
input stream, as indicated in Figure 4.4. Like
evaluate, qeval classifies the different types of expressions
(queries) and dispatches to an appropriate
for each. There
for each special form (and, or, not,
and one for simple queries.
The driver loop, which is analogous to the
for the other evaluators in this chapter, reads queries from the
terminal. For each query, it calls qeval with the query and a
stream that consists of a single empty frame. This will produce the
stream of all possible matches (all possible extensions to the empty
frame). For each frame in the resulting stream, it instantiates the
original query using the values of the variables found in the frame.
This stream of instantiated queries is then
The driver also checks for the special command assert, which
signals that the input is not a query but rather an assertion or rule
to be added to the data base. For instance,
Because matching is generally very expensive, we would
like to avoid applying the full matcher to every element of the data
base. This is usually arranged by breaking up the process into a
fast, coarse match and the final match. The coarse match filters the
data base to produce a small set of candidates for the final match.
With care, we can arrange our data base so that some of the work of
coarse matching can be done when the data base is constructed rather
then when we want to select the candidates. This is called
indexing the data base. There is a vast technology built around
data-base-indexing schemes. Our implementation, described in
section 4.4.4, contains a
simple-minded form of such an optimization.
But this kind of exponential explosion is not common in and
queries because the added conditions tend to reduce rather than expand
the number of frames produced.
There is a large literature on data-base-management
systems that is concerned with how to handle complex queries
There is a subtle difference between this filter
implementation of not and the usual meaning of not in
mathematical logic. See section 4.4.3.
In one-sided pattern matching, all the equations that
contain pattern variables are explicit and already solved for the
unknown (the pattern variable).
Another way to think of unification is that it generates the most
general pattern that is a specialization of the two input patterns.
That is, the unification of list(x, "a") and list(list("b", y), z) is list(list("b", y), "a"), and the unification of list(x, "a", y) and list(y, z, "a"), discussed above, is list("a", "a", "a").
For our implementation, it is more convenient to think of the result
of unification as a frame rather than a pattern.
Since unification is a
generalization of matching, we could simplify the system by using the
unifier to produce both streams. Treating the easy case with the
simple matcher, however, illustrates how matching (as opposed to
full-blown unification) can be useful in its own right.
The reason we use streams (rather than lists) of frames is that the
recursive application of rules can generate
infinite numbers of values that satisfy a query. The delayed
evaluation embodied in streams is crucial here: The system will print
responses one by one as they are generated, regardless of whether
there are a finite or infinite number of responses.