LISPy things you can do in 64K bytes of core

Meta-Circular Evaluation

In the proceedings of the HOPL II conference of 1978, John McCarthy presented a beautiful "Micro-Manual for LISP" that included a completely self-contained interpreter for LISP I, written in LISP I [8]. A Kilo LISP variant of McCarthy's meta-circular EVAL function has a length of 66 lines and allocates about 443 nodes of storage.

Note that the first LISP implementations used a binding form called LABEL (not LABELS) of the form

(label <name> <lambda form>)

that would return a recursive lambda function, i.e. a lambda form that was bound to the given name. The above LABEL form would be equivalent to the Kilo LISP form

(labels ((<name> <lambda form>)) <name>)

Hence it is easily implemented using a simple macro:

(setq label   
    (lambda (x f)  
      @(labels ((,x ,f)) ,x))))

Given this macro, the fully self-contained EVAL function can be used to interpret LISP programs that are limited to the following forms and special forms:

In addition, the following built-in functions may be used:


The EVAL function itself is what is called a tree walker these days: it traverses a program in the form of an S-expression using CAR, CDR, and EQ. It implements the primitive functions of the language it implements by delegating their applications to its own primitive functions. E.g., the program

(cons (quote a) (quote b))

would be evaluated in an environment E by first evaluating the arguments and then applying cons to their values:

(cons (eval (cadr x) e)
      (eval (cadr (cdr x)) e)))

Note that it is the code implementing CONS that evaluates the arguments of CONS. Hence special forms can be implemented by simply not evaluating arguments, or by evaluating them conditionally, like in the implementation of COND:

(label evcon
  (lambda (c e)
    (cond ((eval (caar c) e)
            (eval (cadar c) e))
          (t (evcon (cdr c) e)))))

Here is a simple program that can be interpreted by EVAL. It appends the lists (A B C) and (D E F) using an APPEND function written in the subset of LISP described above:

(setq prog
  (quote            ; note the quote!
    ((label append
            (lambda (a b)
              (cond ((eq a nil) b)
                    (t (cons (car a)
                             (append (cdr a)
     (quote (a b c))
     (quote (d e f)))))

Note that the program is quoted! The variable PROG is bound to an S-expression representing the application of the LABEL expression named APPEND to the lists (A B C) and (D E F). Interpretation of the program is performed by EVAL:

* (load 'src//eval)

* (eval prog '((t t)))
(a b c d e f)

* _

The second argument of EVAL is the environment (in the shape of an association list) in which the first argument will be evaluated. The code uses a clever trick to implement the generic truth values T and NIL: T is an ordinary variable that is passed to EVAL in its environment. NIL is what ASSOC returns when a variable cannot be found in an environment and there is no variable named NIL.

What is particularly interesting about EVAL is, of course, that the implementation of EVAL itself is limited to the subset of LISP that it implements. It is a meta-circular interpreter; it can evaluate its own code.

To interpret EVAL, the source code of EVAL is needed, which is the S-expression that the LISP reader (and compiler, if there is one) will turn into into an executable program. So the code of EVAL is just one QUOTE away – It is bound to the symbol EVSRC. The only difference between EVSRC and EVAL is that EVSRC is quoted.

On a LISP system that has a built-in EVAL, that built-in EVAL could be used to compile EVSRC, i.e.

(setq eval2 (eval evsrc))

but Kilo LISP does not have a built-in EVAL, so there have to be two versions of the meta-circular EVAL function: a quoted one (EVSRC) to be interpreted by EVAL and an unquoted one (EVAL) to be interpreted by Kilo LISP.

The expression

(eval @(,evsrc ',prog '((t t))) '((t t)))

evaluates EVAL evaluating EVAL evaluating the program PROG. The @ character is an abbreviation for quasiquotation, just like the backquote in most other LISPs.

There are five levels of interpretation here:

  1. the inner EVAL interpreting PROG
  2. the outer EVAL interpreting the inner EVAL
  3. Kilo LISP interpreting the outer EVAL
  4. the CPU interpreting the Kilo LISP machine code
  5. the laws of nature interpreting the CPU

The difference between four and five layers of evaluation is significant (measured on a 4.77MHz IBM PC for dramatic effect):

* (eval prog '((t t)))
(a b c d e f)            ; 8 seconds

* (eval @(,evsrc ',prog '((t t))) '((t t)))
(a b c d e f)            ; 14 minutes, 20 seconds

* _

So what about EVAL3 — one more level of interpretation?

* (eval @(,evsrc '(,evsrc ',prog '((t t))) '((t t))) '((t t)))

; time passes

; more time passes

Alright, let's try this on a 750MHz supercomputer:

* (eval @(,evsrc '(,evsrc ',prog '((t t))) '((t t))) '((t t)))
(a b c d e f)         ; some 25 minutes later

* _

So while EVAL3 still evaluates, it is certainly not a fun thing to try, even on modern hardware.

When run times grow like this, the problem is often memory rather than clock speed. When the amount of available LISP nodes goes toward zero, the fraction of time spent in the garbage collector will tend toward infinity (but, of course, the system will run out of nodes before infinity is reached). In the worst case, there would be one free cons cell that gets allocated over and over again, so that each CONS operation would trigger a garbage collection.

Indeed, maxing out the Kilo LISP node space (32760 nodes) reduces the time to evaluate EVAL3 from 25 minutes to 10 minutes, but this is no longer a LISPy thing you can do in 64K bytes of core!

contact  |  privacy