Factor/FAQ/Implementation

Is Factor compiled or interpreted?

If, by this, you actually mean "Does Factor produce standalone executables," the answer is yes; see Deployment.

Factor source code is compiled to native machine code before it can be executed. There are two compilers built into Factor:

  • a minimal, non-optimizing JIT compiler (written in C++)
  • an optimizing compiler (written in Factor)

The compilers are never invoked directly; loading source files, or running code constructed at runtime, will invoke the appropriate compilers automatically. Most code will get compiled with the optimizing compiler, except for code which constructs new code at runtime.

See The implementation.

Why isn't Factor fully self-hosted?

Factor is primarily a high-level language for application programming, and cannot be used to implement a garbage collector, because the language is itself garbage collected, with some constructs performing implicit allocation in non-obvious ways. Features such as inline caching would also be difficult to implement in a language with Factor's high-level semantics. So making Factor self-hosted would actually mean rewriting the virtual machine in a new, low-level DSL, based around a subset of Factor but with direct memory access added and all runtime type information eliminated. A VM-level DSL would not be interactively debuggable or replaceable, and none of Factor's high-level tools or abstractions would be present. This DSL would not offer any real advantages over C, C++, or some other systems-level language.

Why aren't the C++ components of Factor implemented in my favorite other systems language?

GCC optimizes well, and is available on all platforms that Factor targets. GDB complements GCC well for low-level debugging. The abstractions offered by C++ are adequate for what needs to be represented in the VM; for example, see http://factor-language.blogspot.com/2009/05/factor-vm-ported-to-c.html for a discussion on the use of smart pointers for flagging local variables which need to be traced by the GC. Other languages such as D may offer minor advantages and clean up certain algorithms in the Factor VM, but the hassle of introducing such a dependency on another language outweighs the benefits, if any.
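To make the smart-pointer idea concrete, here is a minimal sketch of how an RAII handle can flag a local variable as a GC root. It is not the Factor VM's actual code: the names Object, RootStack, GcRoot, and collect are hypothetical, and the "collection" is a toy that only copies rooted objects into a fresh vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap object; not Factor's real object layout.
struct Object {
    int value;
};

// Hypothetical collector state: the addresses of local pointers that the
// GC must trace, and may rewrite if it moves the objects they refer to.
struct RootStack {
    std::vector<Object**> roots;
};

// RAII handle: registers the address of its pointer on construction and
// unregisters it on destruction, so a local cannot be forgotten by the GC.
class GcRoot {
public:
    GcRoot(Object* obj, RootStack& stack) : ptr_(obj), stack_(stack) {
        stack_.roots.push_back(&ptr_);
    }

    ~GcRoot() {
        // Roots are scoped locals, so they are released in LIFO order.
        assert(!stack_.roots.empty() && stack_.roots.back() == &ptr_);
        stack_.roots.pop_back();
    }

    // Copying would register the wrong address, so forbid it.
    GcRoot(const GcRoot&) = delete;
    GcRoot& operator=(const GcRoot&) = delete;

    Object* get() const { return ptr_; }
    Object* operator->() const { return ptr_; }

private:
    Object* ptr_;
    RootStack& stack_;
};

// Toy moving collection: copy every rooted object into to_space and rewrite
// the rooted pointer to the new address. A real collector would also follow
// references between objects and use forwarding pointers to preserve sharing.
inline void collect(RootStack& stack, std::vector<Object>& to_space) {
    to_space.clear();
    to_space.reserve(stack.roots.size());  // no reallocation while copying
    for (Object** slot : stack.roots) {
        to_space.push_back(**slot);
        *slot = &to_space.back();
    }
}
```

The point of the design is that a plain Object* local would be left dangling after a moving collection, while a GcRoot local is updated in place and cleans up after itself when it goes out of scope.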

Why not Factor for the Java Virtual Machine or .NET?

Originally, Factor was written in Java, and there was a compiler to JVM bytecode. In 2004, the Java parts of Factor were rewritten in a mix of C and Factor. The new "CFactor" implementation eventually replaced the earlier "JFactor". The runtime system was subsequently rewritten from C to C++ in 2009. Issues with the Java implementation that led to it being phased out are discussed in Java Factor.

Why doesn't Factor use LLVM?

While LLVM is a fine backend for a compiler implementation, and implements several optimizations which are more advanced than Factor's, there are several reasons why Factor's code generator is a custom one, written in Factor. Having a compiler written entirely in Factor is a good test for the language. Several new idioms and abstractions came up as a result of compiler development. If part of the code generator instead called out to C++, there would be an impedance mismatch and a need to maintain extensive Factor-LLVM bindings. To duplicate some of the low-level optimizations that Factor performs, new LLVM passes might need to be written. Developing new LLVM optimization passes in Factor would entail a lot of additional work to marshal and unmarshal LLVM's data types to and from Factor objects, and the resulting Factor code would be less than idiomatic due to issues such as manual memory management. Writing them in C++ would increase the size and complexity of the Factor VM.

One final issue is related to deployment. LLVM is a large external dependency, and one of Factor's goals is to have the entire UI development environment run without any external libraries, beyond those shipped with the OS. This ensures that Factor is easy to build and install.

If LLVM were investigated in the future as a target backend for Factor's compiler, it would primarily replace the code in the compiler.cfg vocabulary. The stack-checker and compiler.tree vocabularies, which comprise the compiler frontend and high-level optimizer, contain language-specific optimizations that are beyond the scope of what LLVM is designed to do, and the code there would remain in some form, possibly with a conversion step from Factor's high-level IR into LLVM bitcode.

This revision created on Sun, 13 Jan 2013 02:19:52 by FGrose (typo)