From user-return-4343-apmail-couchdb-user-archive=couchdb.apache.org@couchdb.apache.org Wed Apr 08 16:41:53 2009 Return-Path: Delivered-To: apmail-couchdb-user-archive@www.apache.org Received: (qmail 42990 invoked from network); 8 Apr 2009 16:41:53 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 8 Apr 2009 16:41:53 -0000 Received: (qmail 29019 invoked by uid 500); 8 Apr 2009 16:41:52 -0000 Delivered-To: apmail-couchdb-user-archive@couchdb.apache.org Received: (qmail 28949 invoked by uid 500); 8 Apr 2009 16:41:52 -0000 Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@couchdb.apache.org Delivered-To: mailing list user@couchdb.apache.org Received: (qmail 28939 invoked by uid 99); 8 Apr 2009 16:41:52 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Apr 2009 16:41:52 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=10.0 tests=SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [66.220.1.235] (HELO mail.proven-corporation.com) (66.220.1.235) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Apr 2009 16:41:42 +0000 Received: from [192.168.3.11] (ppp-58-8-65-214.revip2.asianet.co.th [58.8.65.214]) by mail.proven-corporation.com (Postfix) with ESMTP id B1C452C6195 for ; Wed, 8 Apr 2009 23:41:19 +0700 (ICT) Message-ID: <49DCD3AB.6060200@proven-corporation.com> Date: Wed, 08 Apr 2009 23:41:15 +0700 From: Jason Smith User-Agent: Thunderbird 2.0.0.21 (X11/20090318) MIME-Version: 1.0 To: user@couchdb.apache.org Subject: Re: Query design questions References: <35c58c40904080904o59df67a9n79655b2ee7a41f4d@mail.gmail.com> In-Reply-To: <35c58c40904080904o59df67a9n79655b2ee7a41f4d@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Hi, Chris. For me personally, even using the word "query" is inaccurate. I think of it is having one input where I place my data, and several outputs of my data (views I define) that list my data in a 1-D array which I can take slices of. Chris Kilmer wrote: > I have some design questions. First about query design and then about > document design. > > 1. Query design: We have a set of existing documents, Project, User, > and Membership. Business constraints: A user is associated with many > projects through membership. A user can also be marked as a project > owner through the membership. A project can also retrieve a list of > it's owners. Given the constraints, some sample documents are: > > User > {'id' : 'xxx'} > {'id' : 'yyy'} > > Project > {'id' : 'aaa'} > {'id' : 'ccc'} > > Membership > {'user_id' : 'xxx', 'project' : 'aaa', 'is_owner' : true} > {'user_id' : 'yyy', 'project' : 'aaa', 'is_owner' : false} > {'user_id' : 'xxx', 'project' : 'bbb', 'is_owner' : true} > {'user_id' : 'xxx', 'project' : 'ccc', 'is_owner' : true} > > What I'm wondering, is: > > 1. What is the easiest way to retrieve all the projects where a user > is the(an) owner? (our current implementation uses two queries) > 2. What is the easiest way to retrieve all the owners for a particular project? > > > On another front, I realize the way the documents above are laid out > is 'very relational'. I would love some ideas about on modeling the > relationships (should I even be using that word?) in a more document > centric manner. There is a relevant article about relations in App Engine which you should read if you haven't. http://code.google.com/appengine/articles/modeling.html Many-to-many is halfway down. In short, besides the "relationship" way, you can keep a list of "foreign keys" right inside your document. (Sorry for mixing the terminology.) CouchDB maps make it easier to massage your data into nice clean lists of data structures that tell you exactly what you need and quickly. In your specific example, what if you had a map that does something like this (not tested) in a projects-by-owner" view? if(doc.type == 'User') emit([doc.id, null], null); else if(doc.type == 'Membership' && doc.is_owner) emit([doc.user_id, doc.project], doc.project); The sort key is a 2-item list. You want "all X about some user" which means all your users will have to clump together in the view output since you can only take an array slice in CouchDB. That is why the key's first item is always a user ID. With all users clumped together, the second item determines sort order per user, and null guarantees that the user document will come first. Beneath that will be all project IDs that the user owns. So if you query with a startkey=[whoever.id, null] and endkey=[whoever.id, {}], then discard the first item in the response, you will have a list of project IDs that the user owns. The advantage of getting rid of the membership documents and going with the list-of-keys method is that you have both the project data and the user data available to you at the same time. So you could loop through your project's owners list and emit similar data as above, but instead of just a project ID, you could emit actual project data so that you wouldn't have to make another request to get followup data. Hope that helps! Your mileage may vary. Some restrictions apply. See site for details. -- Jason Smith Proven Corporation Bangkok, Thailand http://www.proven-corporation.com