README - Book Version

This source file is part of the SubC compiler, which is described in the book
Practical Compiler Construction. You might prefer to download the compiler
source code. It is in the public domain.

        SubC Compiler, Version 2022-05-02
        By Nils M Holm, 2011--2022

        Placed in the public domain.
        Where the concept of the public domain does not apply,
        distributed under the Creative Commons Zero (CC0) license
        (see file CC0).


        SubC is a compiler for a (mostly) strict and sane subset of
        C as described in "The C Programming Language", 2nd Ed.
        The language is also known informally as "ANSI C" or "C89".

        The compiler is described in great detail in the book
        "Practical Compiler Construction" (2nd Ed.), which can be
        purchased at See

        This archive contains the SubC compiler as discussed in the
        second edition of the book. It contains many fixes that have
        been collected since the publication of the first edition.
        See the file "Fixes" for details.

        The SubC compiler can compile itself. Unlike many other small C
        compilers, it does not bend the rules, though. Its code passes
        "gcc -Wall -pedantic" with few or no warnings (depending on
        the gcc version).

        The compiler generates code for GAS-386, the GNU assembler
        for the 386 processor. Its runtime environment is designed
        to run on FreeBSD systems, but it should be easy to port to
        other unixish 32-bit systems. Even porting it to a 64-bit
        platform should not be too hard.

        SubC is fast and simple. Its output is typically small (due
        to a non-bloated library), but not very runtime efficient,
        because it employs none of the code synthesis or optimization
        strategies explained in the book.


        The root directory of this archive contains the source code for
        the SubC compiler itself.

        lib/    contains the source code for the SubC library (libscc)
                and the runtime startup module (crt0).

        include/  contains the header files of the SubC library.

        book/   contains source code for the code generators and
                optimizers discussed in the later chapters of the book.
                It also contains a YACCable version of the formal
                grammar that is used to describe SubC in the book.

        tests/  contains various programs for testing the correctness
                of SubC.


        (From Practical Compiler Construction)

        o  The following keywords are not recognized:
           auto, const, double, float, goto, long, register, short,
           signed, struct, typedef, union, unsigned, volatile.

        o  There are only two data types: the signed int and the
           unsigned char; there are also void pointers, and there
           is limited support for int(*)() (pointers to functions
           of type int).

        o  No more than two levels of indirection are supported, and
           arrays are limited to one dimension, i.e. valid declarators
           are limited to x, x[], *x, *x[], **x (and (*x)()).

        o  K&R-style function declarations (with parameter
           declarations between the parameter list and function body)
           are not accepted.

        o  There are no ``register'', ``volatile'', or ``const''
           variables. No register allocation takes place, so all
           variables are implicitly ``volatile''.

        o  There is no typedef.

        o  There are no unsigned integers and no long integers.

        o  There are no struct or union data types.

        o  Only int, char and array of int and char types can be
           initialized in their declarations; pointers can be
           initialized with 0 (but not with NULL).

        o  The enum statement does not define a new data type, but
           merely a set of constants.

        o  Local arrays cannot have initializers.

        o  There are no local extern qualifiers or enum statements.

        o  Local declarations are limited to the beginnings of function
           bodies (they do not work in other compound statements).

        o  There are no static prototypes.

        o  Arguments of prototypes must be named.

        o  There is no goto.

        o  There are no parameterized macros.

        o  The #error, #if, #line, and #pragma
           preprocessor commands are not recognized.

        o  The preprocessor does not recognize the # and ## operators.

        o  There may not be any blanks between the # that introduces
           a preprocessor command and the subsequent command (e.g.:
           "# define" would not be recognized as a valid command).

        o  Comments in preprocessor commands will cause trouble in
           many cases and should be avoided.

        o  Preprocessor commands cannot span multiple lines, i.e. a
           "\" at the end of a line will not continue a preprocessor
           command.

        o  The sizeof operator is limited to types and single
           identifiers; the operator requires parentheses.

        o  The address of an array must be specified as "&array[0]"
           instead of "&array" (but just "array" also works).

        o  Subscripting an integer with a pointer (e.g. 1["foo"]) is
           not supported.

        o  Function pointers are limited to one single type, int(*)(),
           and they have no argument types.

        o  There is no assert() due to the lack of parameterized macros.

        o  The atexit() mechanism is limited to one function (this may
           even be covered by TCPL2).

        o  Environments of setjmp() have to be defined as
           int[_JMPBUF_SIZ]; instead of jmp_buf due to the lack of
           typedef.

        o  FILE is an alias of int due to the lack of typedef and struct.

        o  The signal() function is missing. When included by linking
           against the host's C library, it returns int due to the lack
           of a more sophisticated type system; the return value must be
           cast to int(*)().

        o  Most of the time-related functions are missing due to the lack
           of structs; in particular: asctime(), gmtime(), localtime(),
           mktime(), and strftime().

        o  The clock() function is missing, because CLOCKS_PER_SEC
           varies among systems.

        o  The ctime() function ignores the time zone.
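
        As an illustration of the restrictions above, here is a small
        sketch written to stay inside the listed subset: int and char
        only, one-dimensional arrays, named prototype arguments, local
        declarations at the start of function bodies, a function in
        place of a parameterized macro, and sizeof with parentheses.
        All names (NITEMS, items, max2, sum, largest, count_items) are
        made up for this example. It was written against the list, not
        checked against SubC itself, so treat it as an assumption that
        SubC accepts every detail (for instance the enum constant as an
        array size and the "(void)" parameter list).

```c
/* enum only introduces constants here, not a new data type. */
enum { NITEMS = 5 };

/* Global arrays of int may carry an initializer. */
int items[NITEMS] = { 3, 1, 4, 1, 5 };

/* ANSI-style prototypes; every argument is named. */
int max2(int a, int b);
int sum(int *v, int n);
int largest(int *v, int n);
int count_items(void);

/* A plain function stands in for a macro such as MAX(a,b). */
int max2(int a, int b) {
	return a > b ? a : b;
}

/* Add up the first n elements of v. */
int sum(int *v, int n) {
	int	i, total;	/* locals only at the top of the body */

	total = 0;
	for (i = 0; i < n; i++)
		total = total + v[i];
	return total;
}

/* Return the largest of the first n elements of v (n >= 1). */
int largest(int *v, int n) {
	int	i, m;

	m = v[0];
	for (i = 1; i < n; i++)
		m = max2(m, v[i]);
	return m;
}

/* sizeof is applied to a single identifier and to a type, */
/* in both cases with parentheses. */
int count_items(void) {
	return sizeof(items) / sizeof(int);
}
```

        Passing "items" (or "&items[0]", but not "&items") together
        with count_items() to sum() adds up all five elements.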


        On a FreeBSD system just type "make".

        Without "make" the compiler can be bootstrapped by running:

        cc -o scc0 *.c

        To compile and package the runtime library:

        ./scc0 -c lib/*.c
        ar -rc lib/libscc.a lib/*.o
        ranlib lib/libscc.a

        To compile the startup module:

        as -o lib/crt0.o lib/crt0.s


        To test the compiler either run "make test" or perform the
        following steps:

        ./scc0 -o scc1 *.c
        ./scc1 -o scc *.c
        cmp scc1 scc

        There should not be any differences between the scc1 and scc
        executables.
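
        The "cmp" step is a byte-for-byte comparison of the two
        generations of the compiler. Where cmp is not at hand, a check
        along the following lines could be built with the host's C
        compiler. This is a sketch only: same_stream() and same_file()
        are hypothetical names, and the code uses standard C streams
        (FILE *), so it is meant for the host compiler rather than for
        SubC.

```c
#include <stdio.h>

/* Return 1 if both streams yield the same bytes up to EOF, else 0. */
int same_stream(FILE *a, FILE *b) {
	int	ca, cb;

	do {
		ca = getc(a);
		cb = getc(b);
		if (ca != cb)
			return 0;	/* differing byte or differing length */
	} while (ca != EOF);
	return 1;
}

/* Return 1 if the files are identical, 0 if they differ, */
/* and -1 if either file cannot be opened. */
int same_file(char *path_a, char *path_b) {
	FILE	*a, *b;
	int	r;

	a = fopen(path_a, "rb");
	if (a == NULL)
		return -1;
	b = fopen(path_b, "rb");
	if (b == NULL) {
		fclose(a);
		return -1;
	}
	r = same_stream(a, b);
	fclose(a);
	fclose(b);
	return r;
}
```

        A result of 1 from same_file("scc1", "scc") would correspond to
        cmp reporting no differences.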

        There is a simple test suite in the tests/ directory. To run
        it, type "make test-all". Alternatively, individual tests can
        be compiled and run separately, e.g.:

        ./scc tests/ptest.c && ./a.out && rm a.out


        If you want to install the SubC compiler on your system, you
        will have to change the SCCDIR variable, which points to the
        base directory containing the SubC headers and runtime library.
        SCCDIR defaults to "." and can be overridden on the command
        line when compiling the compiler:

        ./scc1 -o scc -D 'SCCDIR="INSTALLDIR"' *.c

        (where INSTALLDIR is where the compiler will be installed.)
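
        For readers unfamiliar with this kind of override, the sketch
        below shows one common way a compile-time default such as
        SCCDIR can be set up so that a -D option replaces it. It is an
        assumption that the SubC sources use this exact pattern, and
        scc_path() is a hypothetical helper invented for the example.

```c
#include <string.h>

/* Fall back to "." unless the command line supplied a value, */
/* e.g. with -D 'SCCDIR="INSTALLDIR"'. */
#ifndef SCCDIR
#define SCCDIR "."
#endif

/* Hypothetical helper: store SCCDIR, a "/" and sub in buf. */
/* Returns 0 on success or -1 if buf (of the given size) is too small. */
int scc_path(char *buf, int size, char *sub) {
	int	need;

	/* directory + "/" + sub + terminating NUL */
	need = (int) strlen(SCCDIR) + (int) strlen(sub) + 2;
	if (need > size)
		return -1;
	strcpy(buf, SCCDIR);
	strcat(buf, "/");
	strcat(buf, sub);
	return 0;
}
```

        With the default in effect, scc_path(buf, 64, "include") would
        leave "./include" in buf; with the -D option shown above, the
        same call would produce a path below INSTALLDIR.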

        You can place the 'scc' executable wherever you want. The
        headers go to INSTALLDIR/include, the library 'lib/libscc.a'
        and the startup module 'lib/crt0.o' go to INSTALLDIR/lib.

        To test the installation just re-compile the compiler:

        rm scc && scc -o scc *.c


        Send feedback, suggestions, etc to:

        n m h @ t 3 x . o r g

        See for current ways through my
        spam filter.
