incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jo-Erlend Schinstad <joerlend.schins...@gmail.com>
Subject Re: Modeling a tree in couchdb.
Date Tue, 03 Jan 2012 15:16:44 GMT
Hello again. Thanks for your suggestion.

Den 03. jan. 2012 00:38, skrev Kevin R. Coombes:
> Your last few posts have confused me as to the actual requirements.  
> Suppose I have the following tree, where the numbers on the nodes are 
> arbitrary IDs that i have assigned.
>                      [1]
>                       |
>                  -----------
>                  |         |
>                 [2]       [3]
>                  |         |
>     -----------------   ---------
>     |        |      |   |       |
>    [4]      [X]    [5] [6]     [7]
>

But if I want to insert a branch between 2 and 3 and still keep it 
ordered? Wouldn't I then have to give a new path to 3, 6 and 7? One 
solution that has been suggested is to use decimal values, so that I 
could do 2+3/2 to get 2.5 and use that as its path. Then, if I wanted to 
add a node between 2.5 and 3, I would do 2.5+3/2=2,75 and the tree would 
look like this:

                                 [1]
                                  |
                         --------------------------------
                         |        |            |              |
                        [2]    [2,5]      [2,75]      [3]

The paths for 2 and 3, including all their child nodes would stay the 
same. If there are more siblings on that level, then they would also be 
intact with no change. However, this would be limited because there's a 
finite number of decimals. And there is another problem. If I now want 
to move 2.75 before 2.5, then I would do 2+2,5/2=2.25. The first level 
below the root would now be [2, 2.25, 2.5, 3], but since 2.75 is now 
called 2.25, I would also need to update the paths of all nodes under 
it. This would also mean that the tree would be in an inconsistent state 
between updating 2.75 itself and the updates to its child nodes. If 
there is a large number of elements, then this can take a considerable 
amount of time.

So, the current idea (that I got from Lena Herrman) is that each element 
only points to its direct parent and to its next sibling. That way, any 
operation would only require two writes, regardless of the size of the tree:

                                 [1]
                                   |
                         -----------------------------
                         |                                 |
                       [2]                              [3]
                         |                                 |
                 ---------------                ------------
                 |              |                  |             |
                [4]          [5]               [6]          [7]

(Yes, same tree, just repeated for your convenience:))

In this case, the JSON would look something like:
{"_id" : "uuid1", "label" : "1", "parent" : null, "next_sibling" : null, 
"root_node" : true }
{"_id" : "uuid2", "label" : "2", "parent" : "uuid1", "next_sibling" : 
"uuid3", "first_child" : true }
{"_id" : "uuid3", "label" : "3", "parent" : "uuid1", "next_sibling" : null }
{"_id" : "uuid4", "label" : "4", "parent" : "uuid2", "next_sibling" : 
"uuid5", "first_child" :true }
{"_id" : "uuid5", "label" : "5", "parent" : "uuid2", "next_sibling" : null }
{"_id" : "uuid6", "label" : "6", "parent" : "uuid3", "next_sibling" : 
"uuid7", "first_child" : true }
{"_id" : "uuid7", "label" : "7", "parent" : "uuid3", "next_sibling" : null }

So now, if I wanted to move 2 to 7, then I would do

{"_id" : "uuid2", "label" : "2", "parent" : "uuid7", "next_sibling" : 
null, "first_child" : true }
{"_id" : "uuid3", "label" : "3", "parent" : "uuid1", "next_sibling": 
null, "first_child": true }

If, instead, I wanted to move 5 before 4, then I would do

{"_id" : "uuid5", "label" : "5", "parent" : "uuid2", "next_sibling" : 
"uuid4", "first_child" : true }
{"_id" : "uuid4", "label" : "4", "parent" : "uuid2", "next_sibling": 
null, "first_child": false }

No other writes would be necessary, no matter what you do, or how large 
the changes are. There are also no problems with decimals. You can 
reorder the tree how many times you want. That is a huge improvement, 
but it would still mean that the tree is broken between the first write 
and the second. I think I can do this by adding a field that signifies 
that the first is the beginning of an operation and that the second 
write finishes it.

The problem is how to get the tree in an ordered state from the 
database. I currently sort by root_node, then parent, then first_child. 
This means I can always get the starting points quickly, but I'll have 
to rebuild the tree on the client. It's not really a big problem, but if 
I use the tree in one Python desktop app and one JavaScript browser app, 
then I'll have to do that work twice. I would much rather do it on the 
server so I could just loop through on the client. That would also 
reduce the amount of work that is necessary.

It feels like there should be a better way to do it, but I can't really 
get it to the front of my mind. :)


Jo-Erlend

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message