- write a bootstrap scheme interpreter in C, which can be used to
- interpret a scheme compiler, which can be used to
- self-compile, then
- write a Common Lisp interpreter/compiler in scheme, which can be used to
- write a Common Lisp compiler in Common Lisp, that
- self compiles
I am starting with scheme because I'm more familiar with it. The ultimate goal is for me to learn both. I am definitely not going for complete implementations, but rather a reasonable subset of them (for CL at least, it would be daunting).
It so happens that before I got started, I came across Peter Michaux' Scheme from Scratch posts. He is keeping his code as light as possible, in order to make it easier to read. You might want to head over there if you want to see code not obscured by implementation details. Also, other people chimed in with their own implementations.
My code is on github. GPL licensed.
So far, the bootstrap interpreter is able to run (very) simple scheme programs. In the following days, I will be adding more primitives and syntactic constructs, as well as an emacs interface (of course!).
It currently doesn't do garbage collection. I am now not sure if it will ever need one, I'd like it but I don't have an idea right now how can I cleanly extract the lisp objects from the C stack. We'll see.
This blog is for documenting my progress and blunders, so here's the first blunder:
Scheme is required to be properly tail recursive. To simplify, this means that a recursive procedure will call itself (or another mutually recursive procedure) only via a tail call - essentially, a jump to its beginning. This is what makes tail recursive procedures identical to iterative looping constructs; no stack is used for the recursive call, therefore it's fundamental that it's done right.
Fortunatelly, the Revised 5 Report on the Algorithmic Language Scheme (R5RS) nicely enumerates the possible tail contexts (section 3.5).
Most are easy to do, it's tail-sequences that got me into trouble. These are sequences of expressions which are in a tail position with respect to some context. I was fixated on the idea that I must not call eval from within eval (in C, this always - baring optimizations - creates a new stack frame) for the individual sequence expressions.
So fixated that I implemented my own stack inside eval, so I can tail call eval for them, without realising that I'm duplicating what I would otherwise get for free from the function call mechanism in C.
Actually, it suffices if only the last expression of the sequence is a tail call to eval.
This silliness was introduced by this commit. It will be removed.
One more note about the presence of apply inside the eval body, from r5rs: "the first argument passed to apply [...] must be called via a tail call".
Oh yes, the bootstrap interpreter is called minime. That's what the compiler calls it :-)
Niciun comentariu:
Trimiteți un comentariu