Factor/Metacircularity

The problem

There are two primary issues to make a language metacircular

  • Correctness: That is essentially, to avoiding infinite recursion by breaking the cycle. Or ensure the GC reclaims more garbages than it generates when trying to reclaim it. It's not trivial but if you are careful, it's not really hard either.
  • Efficiency: Now, that's another story. To my knowledge, the only systems that achieve metacircularity with good efficiency (that is, similar to C) are implemented in a subset of their language that essentially maps to C. There is nothing (in theory) that stops a language to achieve that level of performance using high level semantics. Of course, writing efficient low level code in such a language is still going to be hard. You will need to understand the system as well as you do in C. You might also need to understand the semantics of your language and how the compiler optimizes them. But it can be a much more pleasant experience. I believe that what makes this hard in practice is the lack of good compilers for HLLs (see Java compilers: they are highly optimized for the C subset of the Java language, and hardly optimize high level code: there is no unboxing for arithmetic in the current production JVMs).

Some (obvious?) reflections

Even with all the magic of adaptive dynamic feedback profiling just-in-time recompilation, code will not reliably achieve the performance of C. After all, we can write inefficient code even in C. While a very fancy runtime system will sometimes achieve some heroic feats, declaring our intent to the compiler seems to be an easier way to write stable, production quality code with reliable performance. However there is a fine line between giving the compiler well defined semantics, and writing portable assembly (aka writing C code).

There is not much intent we can express in C, and while a high level language provides the adequate framework for a more civil cooperation with the compiler, the full semantics of an open dynamically typed system introduce uncertainties that cripple low level optimization. The obvious 'solution' is to write code in a language subset that has C semantics. This doesn't solve the problem, but avoids it. A more promising approach is to introduce self-imposed limits that the compiler can exploit (see Fortran: the compilers are very good at optimizing some forms of computations thanks to better aliasing control)

Sealing looks like a promising way to guide the compiler into writing optimal code without having to write it ourselves. However, it imposes severe restrictions on code reuse, and affects generic high level code, where limits are a nuisance that dynamic languages are engineered to avoid. Dynamic sealing might provide a more powerful mechanism that allows fine-grained control of the semantic limitations we want to apply on some specific code.

This revision created on Wed, 22 Oct 2008 20:16:31 by mnestic (/Factor/Dynamic sealing not working)