Concatenative language/Multiple return values

Stack languages make multiple return values easier to work with than applicative languages do. A typical approach in an applicative language is to package the values into a tuple. This works in stack languages too, and is useful if you want to treat the multiple values as a single unit (for example, a 3D point with three components); often, however, it is easier to just push two or three items on the stack and be done with it.

Most words in Factor still return no values or a single value, but multiple return values are very handy when they come up. A good motivating example is hashtable lookup. The at word takes a key and a hashtable, and returns the associated value, or f if there is none. This is similar to hashtable lookup in Java and Python, and suffers from the same problem: there is no way to differentiate between a key that is absent from the hashtable and a key whose value is f. Factor offers another word, at*, which returns two results: the value, and a boolean indicating whether the key is present. So we get f f if the key is not in the hashtable, and f t if the key is set to f. Because the boolean is at the top of the stack, you can branch on it directly:

at* [ ... handle value ... ] [ ... no value ... ] if

Indeed, the at word is defined in terms of at*:

: at ( key assoc -- value ) at* drop ;
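To make the at* pattern concrete, here is a minimal sketch. The word describe-key and the sample hashtable are hypothetical, introduced only for illustration; at* itself comes from the assocs vocabulary.

```factor
USING: assocs io kernel ;

! Hypothetical helper: tell apart "absent" from "present with value f".
! at* leaves ( value ? ) on the stack; the outer if consumes the flag,
! and each branch then deals with the value underneath it.
: describe-key ( key assoc -- str )
    at* [
        [ "present with a true value" ] [ "present with value f" ] if
    ] [
        drop "absent"
    ] if ;

H{ { "a" 1 } { "b" f } }
"a" over describe-key print   ! present with a true value
"b" over describe-key print   ! present with value f
"c" swap describe-key print   ! absent
```

With plain at, the last two lookups would both yield f and could not be told apart.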

This is just the beginning, though. Haskell's pattern matching or Common Lisp's multiple-value support can achieve the same effect with a little more verbosity (you have to name the two output values even if you only use them once).

A more interesting case comes up if you have a conditional where each branch outputs two values, which are then consumed by another word:

[ foo ] [ bar ] if +

Here foo and bar each output two values, and + consumes two values. As with null-safe object member access, this type of pattern can be expressed directly in a stack language, but requires contortions in an applicative language: each branch has to return a tuple, which must then be unpacked before the two values can be passed on.
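A runnable version of this conditional might look like the following sketch. The definitions of foo and bar, and the wrapper word sum-either, are hypothetical stand-ins; the only requirement is that both branches have the same stack effect.

```factor
USING: kernel math prettyprint ;

! Hypothetical stand-ins: each word pushes two numbers.
: foo ( -- a b ) 1 2 ;
: bar ( -- a b ) 10 20 ;

! Both branches leave two values, which + then consumes.
: sum-either ( ? -- n ) [ foo ] [ bar ] if + ;

t sum-either .   ! 3
f sum-either .   ! 30
```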

Perhaps the most unusual application of multiple return values is the concept of "modifiers". Suppose you have a family of words, do-this, do-that, and do-it, which all take the same input parameters, X, Y and Z. You can write a new word, frob, which takes X, Y and Z as input, modifies them, and outputs the modified X, Y and Z. Because its outputs line up with the inputs of the other three words, frob can be placed in front of any of them. Now you have seven operations:

do-this
do-that
do-it
frob
frob do-this
frob do-that
frob do-it

But you only defined four words.
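As a small sketch of the idea, assume the three inputs are numbers. The bodies below are hypothetical; only the names do-this, do-that and frob come from the text, and do-it is omitted for brevity.

```factor
USING: kernel math prettyprint ;

! Hypothetical family of words sharing the inputs ( x y z ).
: do-this ( x y z -- n ) + + ;
: do-that ( x y z -- n ) * * ;

! Hypothetical modifier: same three inputs in, three modified values out.
! tri@ applies the quotation to each of the top three stack items.
: frob ( x y z -- x' y' z' ) [ 1 + ] tri@ ;

1 2 3 do-this .        ! 6
1 2 3 frob do-this .   ! 9
1 2 3 do-that .        ! 6
1 2 3 frob do-that .   ! 24
```

No glue code is needed to combine frob with the other words; juxtaposition is enough.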

Here is a concrete example. Factor has two words for extracting subsequences, subseq and <slice>. Both take the same parameters: a start index, an end index, and a sequence. They differ in that subseq copies the elements into a new sequence, whereas <slice> creates a "virtual" sequence that is simply a view of the underlying sequence: mutating the underlying sequence also affects the view, the view does not need additional memory, and so on.

Very often, though, you want a subsequence from the start or end of a sequence. For that, there are two helper words, (head) and (tail), which take a sequence and an index, and output a start index, an end index, and the same sequence. The (head) word always outputs a start index of 0, and the (tail) word always outputs an end index equal to the length of the sequence.

Another common requirement is to count from the end of the sequence rather than from the start: instead of the first 5 elements, you want the last 5. For this, there is a modifier word, from-end, which takes a sequence and an index and replaces the index with one measured from the end. Using subseq, <slice>, (head), (tail) and from-end, the Factor library defines 8 high-level words:

: head ( seq n -- subseq ) (head) subseq ;
: tail ( seq n -- subseq ) (tail) subseq ;
: head* ( seq n -- subseq ) from-end head ;
: tail* ( seq n -- subseq ) from-end tail ;
: head-slice ( seq n -- slice ) (head) <slice> ; inline
: tail-slice ( seq n -- slice ) (tail) <slice> ; inline
: head-slice* ( seq n -- slice ) from-end head-slice ; inline
: tail-slice* ( seq n -- slice ) from-end tail-slice ; inline
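The following sketch shows how the four copying variants behave on a sample array (the array itself is just illustrative data). The slice variants select the same elements but return a view rather than a copy.

```factor
USING: prettyprint sequences ;

{ 1 2 3 4 5 6 } 2 head .    ! { 1 2 }
{ 1 2 3 4 5 6 } 2 tail .    ! { 3 4 5 6 }

! With from-end applied, the index 2 is counted from the end.
{ 1 2 3 4 5 6 } 2 head* .   ! { 1 2 3 4 }
{ 1 2 3 4 5 6 } 2 tail* .   ! { 5 6 }
```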

There is another modifier, short. It handles the case where you want the first, say, 5 elements of a sequence, but if the sequence has fewer than 5 elements you want the whole sequence rather than an out-of-bounds error. It clamps the index to the length of the sequence, and can be placed in front of any of the words above:

: short ( seq n -- seq n' ) over length min ; inline

So by combining the above 8 words with short, we get 16 possible operations with very little code, built from only 6 "real" words: subseq, <slice>, (head), (tail), from-end and short.
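A brief sketch of short in use, with illustrative sample arrays. Without short, asking for the first 5 elements of a 3-element array would signal an out-of-bounds error.

```factor
USING: prettyprint sequences ;

! Shorter than 5 elements: the index is clamped to the length, 3.
{ 1 2 3 } 5 short head .           ! { 1 2 3 }

! Long enough: short leaves the index unchanged.
{ 1 2 3 4 5 6 7 } 5 short head .   ! { 1 2 3 4 5 }
```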

This revision created on Sun, 4 Oct 2009 15:14:42 by DK (frob is as well an operation)