Return-Path: X-Original-To: apmail-couchdb-user-archive@www.apache.org Delivered-To: apmail-couchdb-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id ACF3AB604 for ; Mon, 2 Jan 2012 23:38:59 +0000 (UTC) Received: (qmail 93452 invoked by uid 500); 2 Jan 2012 23:38:58 -0000 Delivered-To: apmail-couchdb-user-archive@couchdb.apache.org Received: (qmail 93415 invoked by uid 500); 2 Jan 2012 23:38:58 -0000 Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@couchdb.apache.org Delivered-To: mailing list user@couchdb.apache.org Received: (qmail 93406 invoked by uid 99); 2 Jan 2012 23:38:58 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 02 Jan 2012 23:38:58 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of kevin.r.coombes@gmail.com designates 209.85.160.180 as permitted sender) Received: from [209.85.160.180] (HELO mail-gy0-f180.google.com) (209.85.160.180) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 02 Jan 2012 23:38:48 +0000 Received: by ghrr20 with SMTP id r20so6476123ghr.11 for ; Mon, 02 Jan 2012 15:38:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=message-id:date:from:organization:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type; bh=AL4V60HSFR/hYVE0kcuuhU5Y8/ynMn1Jx0DiVdztxt0=; b=S2vzK3cfuCFdyYqsobvrKbHCWbJ8bnh9C+VH3umIu/Wuo7IV2vNJYb3sBYyZPBVwLH FO8lQPunt2m4uIdDERjAggoG6otAechA6EqEu4aN9T/zvpcleH3tebZXueFtH3T5vr/L IiYdeh9o/uwaRBoZkszIkHW5+KCU5AeNACLnY= Received: by 10.236.175.199 with SMTP id z47mr18762664yhl.93.1325547507149; Mon, 02 Jan 2012 15:38:27 -0800 (PST) Received: from [10.105.35.136] ([143.111.22.28]) by mx.google.com with ESMTPS id o9sm71305444yhk.20.2012.01.02.15.38.25 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 02 Jan 2012 15:38:26 -0800 (PST) Message-ID: <4F023FF1.3070708@gmail.com> Date: Mon, 02 Jan 2012 17:38:25 -0600 From: "Kevin R. Coombes" Organization: UT M.D. Anderson Cancer Center User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: user@couchdb.apache.org CC: Jo-Erlend Schinstad Subject: Re: Modeling a tree in couchdb. References: <4EF45C4A.5080902@gmail.com> <4F0164FE.7050207@gmail.com> <4F01E3A1.2080402@gmail.com> <4F021A1D.4010502@gmail.com> In-Reply-To: <4F021A1D.4010502@gmail.com> Content-Type: multipart/alternative; boundary="------------060006080009070902030407" X-Virus-Checked: Checked by ClamAV on apache.org --------------060006080009070902030407 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Your last few posts have confused me as to the actual requirements. Suppose I have the following tree, where the numbers on the nodes are arbitrary IDs that i have assigned. [1] | ----------- | | [2] [3] | | ----------------- --------- | | | | | [4] [X] [5] [6] [7] Assume for the moment that the node labeled [X] does not yet exist. Then this tree would have something like the following JSON/Couch representation: { '_id': 1, 'ancestors': [] } { '_id': 2, 'ancestors': [1] } { '_id': 3, 'ancestors': [1] } { '_id': 4, 'ancestors': [2, 1] } { '_id': 5, 'ancestors': [2, 1] } { '_id': 6, 'ancestors': [3, 1] } { '_id': 7, 'ancestors': [3, 1] } Each _id has been assigned manually rather than automatically. The list of 'ancestors' for each item simply gives the ids that you trace through to march back up the tree from that node to the root. If you want to add a new node at position X, then you decide which id to give it and simply create a new node that looks like { '_id': 'X', 'ancestors': [2, 1] } Nothing else needs to change. The hard part would only come if you wanted to, for example, insert a new node [Y] above node [3] in the tree. In that case, you would have to update all of the nodes in that branch, producing new items: { '_id': 'Y', 'ancestors': [1] } { '_id': 3, 'ancestors': ['Y', 1] } { '_id': 6, 'ancestors': [3, 'Y', 1] } { '_id': 7, 'ancestors': [3, 'Y', 1] } Assuming fairly balanced branching, inserting a node above a branch at level N, you would expect to have to updated 1/2^N of the nodes, which should usually be a fairly small fraction. Your answers suggest that you _also _want to keep track of the number of siblings of each node in some sequential order, which is a different requirement than keeping track of the hierarchical tree structure. You can achieve this other goal by adding another item to the representation of each node. The simplest idea would be to add a 'children' property that lists (in implicit order) all of the children of a node. In that case, for example, node [2] would originally be written as { '_id': 2, 'ancestors': [1], 'children': [4, 5] } and would need to be updated to { '_id': 2, 'ancestors': [1], 'children;: [4, 'X', 5'] } when you inserted the node [X]. Nothing else would have to change. Of course, you might also want a complete list of all nodes in the order in which they were added to the tree. You could accomplish that goal by adding yet another property that allowed you to keep the items ordered, independent of the tree structure. Other than replacing the root node by something above it, I cannot see a reason to ever have to update all of the nodes. What am I missing? Kevin On 1/2/2012 2:57 PM, Jo-Erlend Schinstad wrote: > Den 02. jan. 2012 18:04, skrev Kevin R. Coombes: >> Take this suggestion with a grain of salt, since I haven't actually >> tried it and am just making things up.... >> >> If your structure is really a tree, then the location of every node >> is characterized by a unique path back to the root node, You could >> save the entire path in each object as a list: >> [parent, grandparent, great-grandparent, ..., rootnode] >> One view would simply return this entire list for each object. >> A second view would just give back the parent node. >> Either view can be used (with appropriate logic in the client) to >> reconstruct the entire tree. However, you could also easily create >> auxiliary views (e.g., grandparent) depending on your needs. >> >> This organization makes querying the tree relatively easy. However, >> it will have _terrible _performance if you do a lot of surgery on the >> tree, lopping off branches and reattaching them in different places. >> > > That was my original thought. I wanted to do something like key [0, > 5, 4,2] which would mean the sixth child of the first top-level node, > and the fifth child of that node and the third of it. > > This would work well, and I would get the tree in one go, organized > and nice. The problem is that the tree must be reorganizable. In that > model, if moved one node, then I would have to update all the > following documents. If there's a million rows in the tree, then I > would need something like 999.990+ http requests... :) > > Further, one of primary goals is replication. It could never work. The > internet janitor would hit me in the back of the head with a wrench. > However, for something like a family tree or a blog, where something > will forever be a response to something else, that would work nicely. > > So what I'm doing now, is that I'm just retrieving the data from the > database and organising it on the client. It's not a comfortable > solution, I think. It's not elegant, but if it's the only possible > solution, then it doesn't really matter if it's elegant or not. :) > > In other words, I'm still very much looking for a good way to model a > flexible tree in a couchdb. Any suggestions are highly welcome. > > Jo-Erlend Schinstad --------------060006080009070902030407--