Performance of Fractal-Tree Databases

Insertion bottlenecks lie at the heart of database and file-system innovations, best practices, and system workarounds. Most databases and file systems are based on B-tree data structures, and so suffer from the performance cliffs and unpredictable run times of B-trees. In this talk, we introduce the Fractal Tree data structure and explain how it works and how it provides dramatically improved performance both in theory and in practice. Although our company, Tokutek, sells a transaction-safe fractal-tree storage engine for MySQL, this talk is primarily about the underlying technology.

From a theoretical perspective, if B is the block-transfer size, the B-tree performs O(log_B N) block transfers per insert in the worst case. In contrast, the Fractal Tree structure performs O((log_B N)/B) block transfers per insert, which translates to run-time improvements of two orders of magnitude.
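As a rough illustration of the two bounds above, the following sketch evaluates them with all constant factors dropped. The values of N and B are hypothetical and chosen only for the arithmetic; B is counted here in records per block, which is an assumption of this sketch. Because constants are ignored, the resulting ratio shows only how the bounds scale, not a measured speedup.

```python
import math

def btree_transfers_per_insert(n, b):
    """B-tree bound from the text, O(log_B N), with constants dropped."""
    return math.log(n, b)

def fractal_transfers_per_insert(n, b):
    """Fractal Tree bound from the text, O((log_B N)/B), with constants dropped."""
    return math.log(n, b) / b

# Hypothetical parameters, for illustration only.
n = 10**9   # number of records in the index
b = 1024    # records per block transfer

bt = btree_transfers_per_insert(n, b)
ft = fractal_transfers_per_insert(n, b)
ratio = bt / ft  # with constants dropped, the two bounds differ by a factor of B

print(f"B-tree bound:       {bt:.3f} transfers per insert")
print(f"Fractal Tree bound: {ft:.6f} transfers per insert")
print(f"Ratio of bounds:    {ratio:.0f}")
```

In this simplified form the ratio of the two bounds is always B, whatever N is, which is why the gap widens as blocks hold more records.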
To relate that theory to practice, we present an algorithmic model for B-tree performance bottlenecks. We explain how the bottlenecks affect best practice and how database designers typically modify B-trees to try to mitigate the bottlenecks. Then we show how Fractal Tree structures can attain faster insertion rates, intuitively by transforming disk-seek bottlenecks into disk-bandwidth bottlenecks.
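The intuition of a seek bottleneck versus a bandwidth bottleneck can be sketched with a back-of-envelope model. This is not the model presented in the talk: it simply assumes that a seek-bound structure pays about one disk seek per random insert, while a bandwidth-bound structure writes each row sequentially a fixed number of times as it moves through the structure. Every number below (seek time, bandwidth, row size, rewrite count) is a hypothetical placeholder, not a measurement.

```python
def seek_bound_inserts_per_sec(seek_time_s, seeks_per_insert=1.0):
    """Insert rate when each insert costs a fixed number of disk seeks."""
    return 1.0 / (seek_time_s * seeks_per_insert)

def bandwidth_bound_inserts_per_sec(bandwidth_bytes_per_s, row_bytes, rewrites_per_row):
    """Insert rate when each row is written sequentially `rewrites_per_row` times."""
    return bandwidth_bytes_per_s / (row_bytes * rewrites_per_row)

# Hypothetical device and workload parameters, for illustration only.
SEEK_TIME_S = 0.010          # assumed average seek time of a rotating disk
BANDWIDTH_BYTES_PER_S = 100e6  # assumed sequential write bandwidth
ROW_BYTES = 100              # assumed row size
REWRITES_PER_ROW = 10        # assumed number of times each row is rewritten

seek_rate = seek_bound_inserts_per_sec(SEEK_TIME_S)
bandwidth_rate = bandwidth_bound_inserts_per_sec(
    BANDWIDTH_BYTES_PER_S, ROW_BYTES, REWRITES_PER_ROW
)

print(f"Seek-bound rate:      {seek_rate:,.0f} inserts/s")
print(f"Bandwidth-bound rate: {bandwidth_rate:,.0f} inserts/s")
```

Under these assumptions the seek-bound rate depends only on seek time, whereas the bandwidth-bound rate improves with faster sequential I/O or smaller rows, which is the trade the sentence above describes.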
We conclude with performance results showing how a Fractal Tree storage engine can maintain rich indexes more efficiently than B-trees. Surprisingly, Fractal Tree structures seem to maintain their order-of-magnitude competitive advantage over B-trees on SSDs as well as traditional rotating media.