Factor/Dynamic sealing

This page is about sealing with dynamic extent. One of the primary intended uses is for Metacircularity, but any situation where the programmer wants to close the polymorphic semantics of code in order to achieve very high, stable performance would benefit from this approach.

Concepts

Sealing

This is a specific kind of restriction to the language. The name is borrowed from Dylan. While my original concept was first designed without knowing the Dylan approach (that I don't know in deep details) I'm now trying to gather any wisdom that might come Dylan. If you think you can help, please contact me on the Concatenative IRC channel (prunedtree).

In essence, sealing means to close some subsystem from further changes. This allows the compiler to do full code analysis and remove any overhead from polymorphism in the limit. As sealed code is meant to have high, stable performamce, it is expected that the compiler will very aggressively optimize it, taking lots of time and producing lots of code if needed. That cost is bounded for several reasons: Because we seal a core subset of the language that is highly optimizable, and because sealed code has, by definition, far less dependencies, it seldom needs to be recompiled.

What is important to understand is first that sealing is a very light restriction: We close the system, but we keep all high level features of the language (that make it so much better than C, even for high level code: managed memory, quotations and polymorphism, among others). Also, sealing offers us an important guarantee for complex, low level code: Only code within the sealed core system will have an effect on our code, which prevents spurious bugs (or worse, severe random performance regressions) to appear because of random user-level code.

So what do we loose ? If only the sealed core system is running... nothing. What we loose is the possibility to extend the sealed code with arbitrary user code (which is what we desire in this case). Note that all the sealed core system can be used with arbitrary extensions when called from user code. This is desirable, and this is why we use dynamic sealing, not static sealing

Dynamic

Dynamic has the meaning of dynamic scope here, not 'dynamic language'. Dynamic scope simply means the scope is in sync with the runtime stack, as if the dynamic state was passed as an argument to every call. The dynamic state here is the 'access control' object. That object is used for subjective dispatch in order to enforce sealing.

Subjective Dispatch

Dynamic sealing is implemented as a very stable (in the sense that the subject doesn't change often) form of subjective dispatch. Subjective dispatch is a simple extension to multiple dispatch were we dispatch on an additional dynamic variable, the subject. because the subject has dynamic extent, you can consider that this is equivalent to adding a 'subject' argument to all functions in the system, and dispatching on it's class for all generic functions.

Usage and 'why it works'

Within the subjective dispatch semantics, we simply need two subjects for dynamic sealing: 'sealed' and 'unrestricted'. (Note that the system might have other subjects, like 'sandboxed' to run untrusted code safely. These share the same subject dispatch architecture).

'unrestricted' is the default subject. It gives access to all classes in the system on every generic function call, with all the expressiveness, extensibility, ambiguity, and performance overhead that this implies.

'sealed' is the sealed subject. We could have several sealed subjects, but It's doubtful that it would help the compiler and would increase complexity (for the programmer) as it would make it more difficult to keep track of things. The sealed core would contain all primitives, lots of basic data structures, and a large amount of useful library code for low level, high performance code (no XML parser... you get the idea).

The primary goal of sealing is to empower the compiler to gives us worst case guarantees, like producing code that doesn't allocate: for 'unrestricted' code this is impossible as soon as you have a single generic function call on a heap value. Indeed, the user could choose to add a method to that generic function that uses allocation, and then create some value on heap that will use this function. Because statically proving we cannot encounter that specific heap value in the code we compile would be a global dataflow analysis problem (that we cannot realistically hope to solve in the general case), we take another approach: that method isn't visible in sealed code, so the compiler can safely ignore it. should we encounter the special value on heap, we will simply get a no-method exception. Thus, because we can control the methods that implement the generic functions that are visible in sealed code, we can design them such that the compiler can successfully prove things, by analysis of all methods if needed (we have a finite, usually small amount of them).

An important detail is that because the sealing is dynamic, it is not the classes, nor the generic functions, that are sealed. This means that if you do [ [ rest ] [ first ] bi [ + ] reduce ] call, you might need to allocate (maybe you will need bignums). In fact, you don't even know what the elements are: for some (crazy) reason, you might have chosen to use + to concatenate strings ! There are no bignums in the 'sealed' core. the + operation will only dispatch on a finite set of core values: fixnum, int32, int64, fp32, fp64, and fp32 and fp64 versions of complex, for instance. so if you do [ [ rest ] [ first ] bi [ + ] reduce ] sealed, it will never allocate. If int64 overflows, it will throw (we don't use modular arithmetic here, so crash on overflow is desired). If the array contains strings, it will throw. And if the array contains an heterogenous mix of sealed values and the computation doesn't overflow, then the result will be strictly identical to unsealed core (and not much faster - that's not the goal here). It is indeed highly desirable that the semantics of sealed code and unrestricted code do not diverge (sealed code throws if you ask impossible things. silent failure like modular arithmetic is not very useful)

This revision created on Wed, 22 Oct 2008 08:41:31 by prunedtree