[back]

[Cute Anecdote] Bootstrapping: self-compiling compilers

Notice that every compiler involves 3 programming languages:

It is sensible for the object language and implementation language to be the same, although typically the object language is a machine language or assembly language, and the implementation language is a higher-level language compiled into the same object language. Usually, the compiler and its object code are intended to run on the same machine, but exceptions, called cross-compilers, exist.

Obviously, there is no value in

since implementations of both the object and implementation languages must exist before the source language can be compiled. Obvious, but false. The first case, source and object languages the same, is called a program transformer, or in some cases an optimizer. It is usually intended to convert a given program to a more efficient program in the same language, but it might improve the program along other dimensions than run-time efficiency (for example, there is some experimental success in automatically converting interpreters, which are easier to write, to compilers, which are more efficient for many purposes). The most common optimizers operate on low-level code, such as assembly language, machine language, or a specially designed intermediate code. They are usually viewed as part of a larger compiler.

The use of compilers with the same source and implementation language is quite common and important as an implementation tool. Suppose that we have designed a new programming language, called Z. We might first write an easy but inefficient compiler or interpreter for Z, called EZZ, in whatever language Y is already implemented for us. EZZ might implement only a portion of the language Z. Then, we write an excellent compiler for Z, called XLZ in Z itself (using only the portion of Z that our first inefficient implementation can process). Now, we compile or interpret XLZ using EZZ, and execute XLZ on itself. The result is high-quality compiled code for XLZ. This trick is widely used, and XLZ in this case is called a "bootstrapping compiler," or a "self-compiling compiler." (Exercise: can we gain anything by recompiling XLZ using the higher-quality output of the first compilation? Ability to answer this question quickly is a good indicator that you understand the idea of bootstrapping.)

What was gained by the bizarre trick of bootstrapping?


Michael J. O'Donnell <odonnell@cs.uchicago.edu>
Last modified: Mon Jan 2 14:37:01 1995