How to Write a Compiler in an Afternoon

The tools for compiler construction have gotten so good that building a JIT or native code compiler for a custom language can be written in just a few hours.  I’ve implemented an example compiler in about 220 lines of Haskell. It has a fair amount of commentary so it shouldn’t be too hard to follow.

The toy language that I implement is pretty basic. It only has very basic arithmetic, only integer types, comparison, if – then – else, function calls and function definitions, but I think it’s a good demonstration of what is doable with today’s tools.

Unfortunately, this took me longer than a afternoon to finish. I discovered that, while the Haskell LLVM bindings for JIT compilation of code segments that are specified in actual Haskell code are quite good, they do not provide a good interface for code generation from an AST. I had to make some changes to the bindings to expose the needed functionality. However, I think that with a little more improvement in the LLVM bindings, it shouldn’t be any trouble to write a simple compiler in an afternoon.

There are LLVM bindings for many other languages including Python and Ruby. So it’s possible to build a similar compiler in the language of your choice. One advantage of doing this in a compiled language, such as Haskell, is that I can statically link in the JIT compiler and can run scripts in our toy language at native speeds without needing the LLVM toolset installed. (However, the compiler executable is 18M, but you get what you pay for)

One particularly interesting use for such a compiler would be for generating and running JIT code in Python or Ruby. That way you could have the performance of a compiled language when you needed it, but the convenience of a scripting language for everything else.

 

 

Conversation
  • Dax Fohl says:

    Awesome!

  • […] parts of working with C# in the past year.LambdasA familiar concept to many of us who use Ruby, Haskell, or other languages that encourage functional programming paradigms, lambda functions are available […]

  • Comments are closed.