On Dec 23, 2011, at 3:26 AM, CGS wrote:
An idea I got from the CouchDB documentation: use floating point numbers
instead of integers. That is, instead of 1, 2, 3, 4..., you can use 0.1,
0.2, 0.3, 0.4... or 1.0, 2.0, 3.0, 4.0... That can help you to insert a
new node (say, 0.15 or 1.5) in between any two given nodes (say, in
between 0.1 and 0.2 or in between 1.0 and 2.0) without any other
modification.
This really only postpones the problem, though, since floatingpoint has finite precision.
With standard doubles you’ll get about 50 levels of insertion before you end up with two
numbers that you can’t fit another number in between. (You’ll get fewer levels if you
start out with a large number of nodes because there will be less room left over for fractions.)
“50 levels” may sound like a lot, but it isn’t really; think of first creating two nodes,
then repeatedly inserting a new node after the first one, which is a likelysounding scenario.
After inserting 5060 nodes that way you’re stuck.
Another issue is that you’ll get roundoff errors when the numbers are converted to/from
decimal, i.e. when fetching documents and during replication. And that’s at the mercy of
the JSON library being used, how many significant digits its implementor decided to use for
floating point. It would be a lot safer to do the equivalent thing with integers — i.e.
number the initial rows 0x100000000, 0x200000000…
Lena Herrmann’s thesis "Implementation of a distributed application using the documentoriented
database CouchDB” goes into the problem of tree representations in some detail.
—Jens
