LISPy things you can do in 64K bytes of core

Logic Programming

No discussion of LISP would be complete without an implementation of PROLOG. Naturally, an interpreter that can be fit in less than 7K nodes of LISP memory cannot be very complex. The one discussed here is probably as small and basic as a PROLOG interpreter can get. It is based on a tiny PROLOG written by Ken Kahn in 1983 [7]. It originally ran in MACLISP on a PDP-10. The size of the interpreter is really surprising: less than 100 lines of code, about 2.5K bytes, and it allocates less than 900 nodes of memory.

PROLOG is interesting, because it performs logical inference and backtracking, i.e. it will derive facts that were not explicitly given to it and it will systematically find all solutions to a given problem.

The following programs will be presented as databases of "clauses" (facts and rules) and all expressions submitted to PROLOG will be assumed to be queries. For example,

(quote ((man socrates)))

would be a "fact" that states that Socrates is a man, and

(quote ((mortal (// x)) (man (// x))))

would be a "rule" stating that "X is mortal, if X is a man". A rule is a sequence of "terms", where the first one is the "head" of the rule and and the subsequent ones, called "goals", form the "body" of the rule. A rule basically says, in order to prove the head, prove all goals of the body. Using PROLOG syntax, the above program would look like this:

mortal(X) :- man(X).

A "fact" is just a rule with an empty body: nothing needs to be proven for the rule to be true.

Given a database containing the above rules, queries can be submitted to PROLOG. (Below, Themistoclea is added to the database, who is mortal, but not a man.)

* (load 'src//prolog)

* (setq db '(((man socrates))
             ((mortal themistoclea))
             ((mortal (// x)) (man (// x)))))

* (prolog db '(man socrates))
                                   ; mind the gap!

* (prolog db '(man aristotle))
nil                                ; no blank line here!

* (prolog db '(mortal (// who)))

who = themistoclea

who = socrates

* _

The first query, (man socrates), asks the system, "is Socrates a man?" and PROLOG answers "yes" by printing a blank line. Yes, the interface is minimalistic! The next query asks if Aristotle is a man, to which the system responds "no" (by not printing a blank line). Of course, Aristotle is a man, but this cannot be inferred from the rules in the database, so the answer is no.

The final query asks "who is mortal?", where the expression (// who) denotes the variable "who". The query delivers the answers "who = themistoclea" and "who = socrates". The fact that Themistoclea is mortal is contained in the database, but the fact that Socrates is mortal was inferred from the fact that he is a man and the rule stating that men are mortal.

The basic functionality of the PROLOG system is contained in four functions of about 30 lines of code in total. The UNIFY function attempts to unify two terms. Unification is a little bit like pattern matching: it compares two terms, which are LISP lists, and binds variables contained in one term to the corresponding counterparts in the other term. UNIFY is written in LISP, so it can be explored interactively:

* (unify '(foo bar baz)
         '(foo bar baz) '((bottom)))

* (unify '(foo bar baz)
         '(foo baz wrong) '((bottom)))

* (unify '((// foo)  bar)
         '(foo-value bar) '((bottom)))
(((/ foo) foo-value) (bottom))

* (unify '((// foo) (// foo))
         '(first    second)   '((bottom)))

The third argument of UNIFY is an environment in which variables are bound to values. ((Bottom)) is the empty environment. When the two terms of UNIFY match, it returns a new environment, and otherwise it returns NIL. In the third example above, unification succeeds and binds FOO to FOO-VALUE. The last example first binds FOO to FIRST and then attempts to bind FOO to SECOND — but FOO is already bound to FIRST, so this is a contradiction and unification fails.

Variables can bind to variables, too, making those variables the same (or, more correctly, making them co-refer to the same value):

* (unify '((// foo))
         '((// bar)) '((bottom)))
(((/ foo) (/ bar)) (bottom))

* (unify '((// foo) (// bar))
         '(first    second)
	 '(((// foo) (// bar)) (bottom)))

Here the second unification fails, because it binds FOO and BAR to different values, but the environment says that FOO and BAR co-refer, i.e. they must refer to the same value.

The TRY function attempts to prove a clause (GOALS) in an environment (E) given a database (DB). The variable RULES points to a clause in the database. N is for renaming variables, but explaining this at this point would be a major detour.

What TRY basically does is this: unify the head of the first rule R1 in the database with the head of the given clause (GOALS). When unification fails, try the next rule. When unification succeeds, giving a new environment NE, try to prove the concatenation of the bodies of R1 and GOALS in NE. When the proof does not succeed, try the next rule in the database. When running out of rules, return NIL.

The PROVE function just passes its variables to TRY, but in addition it prints the current "frame" when an empty clause is passed to it. A "frame" is a list of all instantiations (bindings) of the variables contained in a query. An empty clause means that there is nothing to prove, so the values bound to the variables form a valid solution.

The PROLOG function merely passes the given database and query (GOALS) to PROVE and provides some initial values, such as an empty environment.

Here are some more queries, this time using the genealogy database from the original source code:

* ; who is the father of Ken?
  (prolog db '(father (// who) ken))
who = jack

* ; who is the grandchild of Cele?
  (prolog db '(grandparent cele (// who)))

who = ken

who = karen

* ; who is parent to whom?
  (prolog db '(parent (// parent) (// child)))
child = ken
parent = el
child = jack
parent = cele
child = ken
parent = jack
child = karen
parent = jack

The last query is particularly interesting, because each frame printed as a solution contains multiple variables, i.e. the PROLOG system prints all valid parent/child combinations that can be inferred from the clauses in the database.

A minor modification allows the tiny PROLOG system to operate on lists, thereby supporting clauses like this:

; append([], L, L).
; append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

(setq db
  '(((append nil (// l) (// l)))
    ((append ((// x) . (// xs)) (// ys) ((// x) . (// zs)))
     (append (// xs) (// ys) (// zs)))))

The APPEND goal appends the lists X and Y, instantiating Z with the result. Being a PROLOG goal, though, there are even more interesting things you can do with APPEND:

; what is '(a b c) concatenated to '(d e f)?
* (prolog db '(append (a b c) (d e f) (// list)))

list = (a b c d e f)

; what prepended to '(d e f) gives '(a b c d e f)?
* (prolog db '(append (// what) (d e f) (a b c d e f)))

what = (a b c)

; what appended to what gives '(a b c)?
(prolog db '(append (// a) (// b) (a b c)))
b = (a b c)
a = nil
b = (b c)
a = (a)
b = (c)
a = (a b)
b = nil
a = (a b c)

So a basic PROLOG system can be implemented in 7K nodes of LISP memory with lots of leeway. What about LISP itself?

contact  |  privacy