The problem

There are two primary issues to make a language metacircular

  • Correctness: That is essentially, to avoiding infinite recursion by breaking the cycle. Or ensure the GC reclaims more garbages than it generates when trying to reclaim it. It's not trivial but if you are careful, it's not really hard either.
  • Efficiency: Now, that's another story. To my knowledge, the only systems that achieve metacircularity with good efficiency (that is, similar to C) are implemented in a subset of the language that essentially maps to C. There is nothing (in theory) that stops a high level language to achieve that level of performance using high level semantics. Of course, writing efficient low level code in such a language is still going to be hard. You will have to have as much understanding of the system as you do in C. But it can be much more pleasant. I believe that what makes this hard in practice is the lack of good compilers for HLLs (see Java compilers, who are highly optimized for the C subset of the Java language, and hardly optimize high level code: there is no unboxing for arithmetic in the current production JVMs).

Some (obvious?) reflections

Even with all the glory of adaptive dynamic feedback profiling just-in-time recompilation, code will not magically achieve the performance of C. After all, we can write inefficient code even in C (Fortran compilers are far better to optimize some forms of computations thanks to better aliasing semantics). While some very fancy runtime system can sometimes achieve some heroic feats, declaring our intent to the compiler seems to be an easier way to write highly stable, production quality code with reliable performance. However there is a fine line between giving the compiler well defined semantics, and writing portable assembly (aka writing C code).

There is not much intent we can express in C, and while a high level language provides the adequate framework for a more civil cooperation with the compiler, the full semantics of an open dynamically typed system introduce uncertainties that cripple low level optimization. The obvious 'solution' is to write code in a language subset that has C semantics. This doesn't solve the problem, but avoids it. A more promising approach is to introduce self-imposed limits that the compiler can exploit.

Sealing looks like a promising way to guide the compiler into writing optimal code without having to write it ourselves. However, it imposes severe restrictions on code reuse, and affects generic high level code, where limits are a nuisance that dynamic languages are engineered to avoid. Dynamic Sealing might provide a more powerful mechanism that allows fine-grained control of the semantic limitations we want to apply on some specific code.

This revision created on Wed, 22 Oct 2008 09:48:08 by prunedtree