incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tim Black <...@alwaysreformed.com>
Subject Re: Tree like structures in CouchDB
Date Sat, 23 Feb 2013 18:44:09 GMT
I'd love to see the various key/parent ID structure/query strategy
proposals that have been mentioned in this thread summarized on a wiki,
with a chart of which requirements mentioned below each fulfills.  That
would be awesome.

Tim

On 02/22/2013 11:02 PM, Michael Heuberger wrote:
> Thanks Jim.
>
> Ok, here are more details / requirements for my needs:
>
>  * Very important is to get the parents and their parents above, up to
>    about 200 levels I'd say.
>  * On the other hand, for parents it's not that important to know their
>    children.
>  * Yes, lazy loading please. There is no need for dynamic tree control.
>  * I don't need to find specific subtrees.
>
> That's all. Any suggestions?
>
> Michael
>
> On 23/02/13 17:42, Jim Klo wrote:
>> Michael,
>>
>> It's hard to gauge what's best - as it depends on your use case...
>> Possibly provide more details?
>>
>> Q: What are your navigation requirements? Drill up/down, jump to
>> other trees, etc.
>>
>> Q: Are you building or in need of a dynamic tree control? Do you want
>> to lazy load the elements?
>>
>> Q: Do you need to find specific subtrees? I've built some tree
>> visualizations with hundreds of thousands of nodes - the reality is
>> you can't actually physically display anything human readable with
>> even a few thousand tree nodes.
>>
>> If you can start defining how you need to use the data - the best
>> model for storage and retrieval will weed itself out.
>>
>> FWIW, I'd like to understand Wendall's solution, as I don't
>> understand how to recreate the tree from his description. Sorry
>> Wendall! :(. I swear I've seen you describe this on list before, but
>> couldn't find it in the archives.
>>
>> I think both potentially are reasonable approaches, and both have
>> pros and cons for usage.
>>
>> - Jim
>>
>> On Feb 22, 2013, at 4:29 PM, "Michael Heuberger"
>> <michael.heuberger@binarykitchen.com> wrote:
>>
>>> Sorry guys, but now you left me a bit confused.
>>>
>>> Again, what's the recommendation to implement tree structures for
>>> messages in CouchDB?
>>>
>>> Cheers
>>> Michael
>>>
>>> On 23/02/13 11:42, Jim Klo wrote:
>>>>> I wanted to give my feedback about what I've learned in this area.
>>>>>
>>>>> First, I don't use the doc _id at all for sorting docs. It solves
>>>>> one single use-case, but fails if you have others, so instead, I
>>>>> do this:
>>>>>
>>>>> Every doc, whether the parent or child has identifying
>>>>> information. So a child might contain the parent id, thread id,
>>>>> etc. Parent doesn't need to know about it's children so it doesn't
>>>>> matter, as those can be pulled in a single view query.
>>>>>
>>>>> Say I want to do something as originally stated, I'd create a view
>>>>> where I emit([parent_id, next_level_id, next_level_id], null) with
>>>>> default values for the latter nested levels being 0 by default.
>>>>> When I query the view, I get back a result set that would look
>>>>> like the following.
>>>>> [
>>>>> {"id":"0f1e244b14452a884f3dfa5b4086f793","key":[1, 0,
>>>>> 0],"value":null}, <- parent
>>>>> {"id":"27f4c6bb9bcaad331e68f80629bffa6e","key":[1, 1,
>>>>> 0],"value":null}, <- first level
>>>>> {"id":"46c17a23254c2dcce0860b4c398e0009","key":[1, 1,
>>>>> 1],"value":null}, <- first item in first level
>>>>> {"id":"95903e4c2e2cbb5e2dfbc934adf6095f","key":[1, 1,
>>>>> 2],"value":null} <- second item in first level
>>>>> ]
>>>> you would still need to track ancestry in most cases,… the second
>>>> solution makes that possible… also your example only works for a
>>>> single 'giant' tree, unless I'm missing something… and not a
>>>> forest. I'm also not seeing  how you would get all the nodes
>>>> without having to execute a query for every node on the tree -
>>>> which is pretty inefficient IMHO
>>>>
>>>> also as others have noted - keeping track of an independent serial,
>>>> for the sake of just ordering the tree, with concurrency would be a
>>>> real challenge; which is why I use serial ID's.
>>>>
>>>>
>>>>> To pull the entire thread based on the parent query is simply
>>>>> startkey=[1,0,0]&endkey=[1,{}]
>>>> then is your parent_id, really a root_id?  Then I'm really confused
>>>> how you would use this with trees at all…  I'm not sure how you
>>>> model as I'd get duplicates from which I could never use to
>>>> reconstruct the tree:
>>>>
>>>> - A                    A root                [A, 0, 0]
>>>>     - B                1st child of A            [A, 1, 1]
>>>>         - C            1st child of B            [A, 2, 1] ???
>>>>         - D            2nd child of B        [A, 2, 2] ???
>>>>     - E                2nd child of A        [A, 1, 2]
>>>>         - F            1st child of E            [A, 2, 1] ???
>>>>             -G        1st child of F            [A, 3, 1]
>>>>         - H            2nd child of E        [A, 2, 2] ???
>>>>    
>>>>> The advantage of this approach is simply that say I want to
>>>>> display a list of all posts by user for a specific thread, I can
>>>>> create a view where I emit([parent_id, user_id, comment_id], null)
>>>> you could do this with either approach, it's not really an advantage.
>>>>
>>>>> This gives the ability to pull a specific comment for a user based
>>>>> on user_id and thread_id, or an entire list of comments based on
>>>>> user_id. These sorts of indexes are very cheap and flexible. You
>>>>> never have to mess with creating your own custom id system. Of
>>>>> course, the tradeoff is that you have to do your own conflict
>>>>> resolution for async operations with thread ids if you want them
>>>>> to increment. Better solution here is to use both timestamp and
>>>>> user_id for the actual comment to ensure it is unique and still
>>>>> sorts well.
>>>> again serial id's solve that (not the default UUID's couchdb
>>>> issues, AFAIK they are not incremental, however I could be wrong),
>>>> is there a reason you want to avoid having a smart id?
>>>>
>>>> FWIW: this all seems like deja vu
>>>> http://markmail.org/search/list:org.apache.couchdb.user+modelling+a+tree+in+couchdb+date:201112-201201+
>>> -- 
>>>
>>> Binary Kitchen
>>> Michael Heuberger
>>> 4c Dunbar Road
>>> Mt Eden
>>> Auckland 1024
>>> (New Zealand)
>>>
>>> Mobile (text only) ...  +64 21 261 89 81
>>> Email ................  michael@binarykitchen.com
>>> Website ..............  http://www.binarykitchen.com
>>>
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message