Exotic Data Structures
Or maybe not so exotic...
The last one is starting to sound exotic though.

1. Compact dynamic array (compactarrays)

Based on the deque variant of [1]. This is simply an array of O(sqrt N) subarrays, each of size O(sqrt N). Two lists of subarrays are maintained, small and big (big subarrays are twice the size of small ones), along with the head/tail indexes and the position of the boundary between the two lists. Conceptually, the virtual array is the concatenation of all small subarrays followed by all big subarrays, indexed between head and tail.

All operations are straightforward. To maintain the global invariants, small subarrays are sometimes merged into big subarrays, or big subarrays are split into small ones (this happens at the boundary between the two lists). When one of the two lists becomes empty, the lists are swapped (big subarrays become small and vice versa).

Variant: Compact integer arrays

Has the same time complexity bounds, but takes N lg M + o(N) bits of space in the worst case, where M is the biggest integer stored in the array. This is implemented by growing subarrays dynamically when an update overflows. It makes updates O(sqrt N) worst case (amortized if O(N) numbers take O(lg M) bits).

Asymptotic complexity:
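As a rough illustration of the two-level layout from section 1, here is a minimal Python sketch. It is a simplification and not the structure from [1]: it only supports push/pop at the tail (no head index), uses a single block size at a time, and merges or splits all blocks at once instead of incrementally at the small/big boundary, so resizing is amortized rather than worst case. The names BlockedArray, block_size, _merge_pairs and _split_halves are made up for this example.

```python
class BlockedArray:
    """Simplified two-level array: about sqrt(N) blocks of about sqrt(N) slots."""

    def __init__(self):
        self.block_size = 2   # capacity of every block (always a power of two)
        self.blocks = []      # list of blocks; all are full except maybe the last
        self.length = 0

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return self.blocks[i // self.block_size][i % self.block_size]

    def __setitem__(self, i, value):
        if not 0 <= i < self.length:
            raise IndexError(i)
        self.blocks[i // self.block_size][i % self.block_size] = value

    def append(self, value):
        if self.length == len(self.blocks) * self.block_size:
            # Every block is full. If there are too many blocks for this
            # block size, merge them pairwise into blocks twice as big.
            if len(self.blocks) >= 2 * self.block_size:
                self._merge_pairs()
            self.blocks.append([])
        self.blocks[-1].append(value)
        self.length += 1

    def pop(self):
        if self.length == 0:
            raise IndexError("pop from empty array")
        value = self.blocks[-1].pop()
        self.length -= 1
        if not self.blocks[-1]:
            self.blocks.pop()
        # Too few blocks for this block size: split each block in two.
        if self.block_size > 2 and len(self.blocks) * 8 <= self.block_size:
            self._split_halves()
        return value

    def _merge_pairs(self):
        # Only called when all blocks are full and their count is even.
        b = self.blocks
        self.blocks = [b[k] + b[k + 1] for k in range(0, len(b), 2)]
        self.block_size *= 2

    def _split_halves(self):
        half = self.block_size // 2
        new_blocks = []
        for block in self.blocks:
            new_blocks.append(block[:half])
            if len(block) > half:
                new_blocks.append(block[half:])
        self.blocks = new_blocks
        self.block_size = half
```

Indexing costs one division and two array accesses. At any time only the last block is partially filled, so the slack is one block plus the block index, both of size O(sqrt N) in this sketch.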
2. Monolithic and Succinct lists (monolithiclists, succinctlists)

Monolithic lists are lists stored without any managed pointers, in a flat array. Each node in the list is identified by a number in [1..N]. 0 is the head and 1 the tail. Insertion allocates nodes, removal reclaims them. Because the links are XOR-coded, two consecutive nodes are needed to form a cursor. Given a cursor, you can query the list for the next or previous element.

To store values at each node, you can simply use node identifiers to index a [1..N] array. This makes cache behavior even worse, but it doesn't matter much, as indexing lists is slow anyway.

Note: unrolled lists are isomorphic to lists of arrays. They offer no improvement.

Asymptotic complexity:
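The XOR-coded links and the two-node cursor described in section 2 can be sketched in Python as follows. This is an illustration under assumptions, not the original implementation: ids 0 and 1 are taken to be reserved sentinels (head and tail), the list is made circular through them, and the names XorList, link, value, free and push_back are invented here.

```python
class XorList:
    """Doubly linked list in flat arrays; link[i] = prev(i) XOR next(i)."""

    HEAD, TAIL = 0, 1   # assumed sentinel ids

    def __init__(self):
        # Empty list: HEAD and TAIL are each other's prev and next,
        # so both links are x XOR x = 0.
        self.link = [0, 0]
        self.value = [None, None]   # values are indexed by node id
        self.free = []              # reclaimed node ids

    def _alloc(self):
        self.link.append(0)
        self.value.append(None)
        return len(self.link) - 1

    def next(self, cursor):
        """A cursor is a pair (a, b) of consecutive nodes; step forward."""
        a, b = cursor
        return (b, self.link[b] ^ a)

    def prev(self, cursor):
        a, b = cursor
        return (self.link[a] ^ b, a)

    def insert(self, cursor, value):
        """Insert a new node between a and b; return its id."""
        a, b = cursor
        n = self.free.pop() if self.free else self._alloc()
        self.value[n] = value
        self.link[n] = a ^ b
        self.link[a] ^= b ^ n   # a's neighbour b becomes n
        self.link[b] ^= a ^ n   # b's neighbour a becomes n
        return n

    def remove(self, cursor):
        """Remove b from cursor (a, b); return the cursor (a, next of b)."""
        a, b = cursor
        if b in (self.HEAD, self.TAIL):
            raise ValueError("cannot remove a sentinel")
        c = self.link[b] ^ a
        self.link[a] ^= b ^ c
        self.link[c] ^= b ^ a
        self.value[b] = None
        self.free.append(b)
        return (a, c)

    def push_back(self, value):
        last = self.link[self.TAIL] ^ self.HEAD   # prev(TAIL); next(TAIL) is HEAD
        return self.insert((last, self.TAIL), value)

    def __iter__(self):
        cursor = self.next((self.TAIL, self.HEAD))
        while cursor[1] != self.TAIL:
            yield self.value[cursor[1]]
            cursor = self.next(cursor)
```

A single node id is not enough to navigate: the link only yields a neighbour once the other neighbour is known, which is why every operation takes a pair of consecutive nodes.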
A succinct data structure has space usage close to the information-theoretic lower bound. In essence, a list is isomorphic to a Hamiltonian cycle in a complete graph of N nodes. There are O(N!) such cycles. Thus at least lg(N!) + o(N) bits are needed to encode a list. Since lg(N!) = N lg N - O(N), this gives us N lg N as the leading term of the lower bound. Succinct lists achieve this leading term (tightly) and can still operate efficiently. To achieve this, succinct lists are simply monolithic lists stored in the previously defined compact integer arrays.

Asymptotic complexity:
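For reference, the size of lg(N!) can be worked out with Stirling's approximation (a standard result, not something specific to this page):

```latex
\lg(N!) \;=\; \sum_{k=1}^{N} \lg k \;=\; N \lg N \;-\; N \lg e \;+\; O(\lg N)
```

So N lg N is the dominant term, and the remaining terms are at most linear in N. Storing N node identifiers of lg N bits each matches that dominant term.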
3. SplitHashtables (splithashtables)

A variation of cuckoo hashing that doesn't move anything around. Also inspired by [2]. In essence, this structure is a pair of arrays of lists of key/value pairs. Lookup is done by splitting the hash of the key in two, using the two halves to index the two arrays, and then searching each list for the key. The twist is that insertion always goes into the shorter list. Surprisingly, that gives us an information-theoretic advantage that reduces the expected worst-case list length significantly. This works so well that it's a waste of time to make any effort to move elements around (as in cuckoo hashing). Also, the structure can operate efficiently under load.

Of course, to get good performance, some tweaks can make the difference.

For space usage: Single-element lists are inlined into the arrays (there are two words to accommodate a key/value pair). The lists themselves are stored as arrays (with the usual amortized growth). This ensures that the data structure stays efficient in all usage cases: sparse entries do not waste too much space, and long lists converge to optimal density.

For speed: The most critical operation for an unordered set is to answer queries. This is divided into five stages in split hashtables:
Insertions and removals are performed with a variant of query that returns the key/value pair by reference. Removals query for the key and replace the key/value pair with a ((free)) key/value pair (compaction is done only on demand). Insertions query for a ((free)) key and store the key/value pair over it. If the query fails, the appropriate list is copied into a larger array padded with ((free)) key/value pairs.

A load of 50

Asymptotic complexity:
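The core idea of section 3 (two candidate lists per key, insert into the shorter one, never move anything) can be sketched in Python as below. This is a bare-bones illustration, not the tuned structure described above: it does not inline single-element lists, it deletes entries eagerly instead of overwriting them with ((free)) markers, and it never resizes. The class name SplitHashTable, the bits parameter and the multiplicative mixing constant are assumptions made for this example.

```python
class SplitHashTable:
    """Two bucket arrays; every key has one candidate bucket in each."""

    def __init__(self, bits=4):
        self.bits = bits   # each array has 2**bits buckets (bits <= 32)
        self.tables = ([[] for _ in range(1 << bits)],
                       [[] for _ in range(1 << bits)])

    def _buckets(self, key):
        # Mix the hash into 64 bits (assumed mixing step), then split the
        # top 2*bits bits into two halves, one index per array.
        h = (hash(key) * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
        mask = (1 << self.bits) - 1
        first = h >> (64 - self.bits)
        second = (h >> (64 - 2 * self.bits)) & mask
        return self.tables[0][first], self.tables[1][second]

    def get(self, key, default=None):
        for bucket in self._buckets(key):
            for k, v in bucket:
                if k == key:
                    return v
        return default

    def put(self, key, value):
        a, b = self._buckets(key)
        for bucket in (a, b):
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)   # overwrite existing entry
                    return
        # The twist: a new key always goes into the shorter of its two lists.
        shorter = a if len(a) <= len(b) else b
        shorter.append((key, value))

    def remove(self, key):
        for bucket in self._buckets(key):
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = bucket[-1]   # fill the hole with the last entry
                    bucket.pop()
                    return True
        return False
```

Because a key may sit in either of its two lists, lookups and removals must check both; that is the price paid for never relocating entries.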
4. B+ trees, Adaptive Packed memory arrays

5. van Emde Boas trees, y-fast trees, constant time approximate priority queues

References:

[1] A. Brodnik, S. Carlsson, E. D. Demaine, J. I. Munro, and R. Sedgewick. Resizable arrays in optimal time and space. Technical Report CS-99-09, U. Waterloo, 1999. http://www.cs.uwaterloo.ca/~imunro/resize.ps
[2] M. Mitzenmacher. The power of two choices in randomized load balancing. IEEE Transactions on Parallel and Distributed Systems, Volume 12, Issue 10, Oct 2001. http://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.pdf

This revision created on Mon, 8 Dec 2008 12:35:24 by prunedtree

All content is © 2008-2010 by its respective authors. By adding content to this wiki, you agree to release it under the BSD license.