From: Jason Smith
Date: Tue, 14 Apr 2009 20:44:08 +0700
To: user@couchdb.apache.org
Subject: Re: Entity Relationships in CouchDB
Message-ID: <49E49328.7030203@proven-corporation.com>

Wout Mertens wrote:
>> 3.
>> Couch is different from App Engine WRT (at least) transactions and
>> indexing. Most people can use App Engine somewhat quickly because
>> there is still a facility for transactions involving several objects.
>> (It's not a RDBMs but you can still do it.) Whereas with CouchDB, the
>> unit of atomicity is the document.
>
> Do you think the document should touch on that? In these simple
> examples, there's nothing to transact :-/. Maybe there should be a
> recipe book for transaction avoidance? (like using a running sum on bank
> account transfers)

I actually don't think this document needs to dwell on transactions or
performance, because it is a clear enough introduction to what is probably
the #1 FAQ for Couch users: "How do I JOIN?"

How much of the App Engine document translates to Couch? Well, with
lists, App Engine indexes not only all elements in a list, but all
elements *between* lists, which adds up very fast[1], and there is an
index size cap, which is why they vaguely discourage lists of keys.
CouchDB has no such limitation, which is why I would prefer the "list of
keys" method with Couch until you can explicitly rule it out due to
performance or conflict issues.

Specifically, I am not sure if the following statement from the page
still holds with CouchDB:

"You would use this method when there is potentially a large number of
contacts in a group and a large number of groups. In that case embedding
a list of keys could result in huge documents so we have to resort to
writing out the list in many documents."

This sentence derives from that vague Google discouragement due to their
indexing limitations. I don't think "huge" documents (thousands of IDs
is still only a few KB) are a big worry. Perhaps write conflicts are
more important. What if the document said the following instead?

"You would use this method if you modify the key list frequently (i.e.
if you get more conflicts than is acceptable), or if the key list is so
large that transferring the document is unacceptably slow. Relationship
documents enable frequent changes with less chance of conflict; however,
you can access neither the contact nor group information in one request.
You must re-request those specific documents by ID, keeping in mind that
they may change or be deleted in the interim."

(The above paragraph is first-drafty, and I hesitate to add it to the wiki
without input from others, but I think it identifies the CouchDB
considerations more clearly.)

[1]: http://groups.google.com/group/google-appengine/browse_thread/thread/d5f4dcb7d00ed4c6?pli=1

--
Jason Smith
Proven Corporation
Bangkok, Thailand
http://www.proven-corporation.com
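
P.S. To make the two options concrete, here is a rough Python sketch of the "list of keys" method next to the relationship-document method. It uses plain dicts in memory rather than a real CouchDB client, and every field name and ID in it ("group_ids", "membership", "rel:1", and so on) is my own illustrative assumption, not something taken from the wiki page. In CouchDB itself the scan in groups_from_relationships would presumably be a view lookup.

```python
# Hypothetical documents; all field names and IDs are illustrative assumptions.

# Option 1: a list of keys embedded in the contact document.
contact_with_keys = {
    "_id": "contact:jason",
    "type": "contact",
    "name": "Jason",
    "group_ids": ["group:friends", "group:work"],
}

# Option 2: one small relationship document per (contact, group) pair.
relationship_docs = [
    {"_id": "rel:1", "type": "membership",
     "contact_id": "contact:jason", "group_id": "group:friends"},
    {"_id": "rel:2", "type": "membership",
     "contact_id": "contact:jason", "group_id": "group:work"},
]


def groups_from_key_list(contact):
    """Option 1 read: the group IDs come out of the one contact document."""
    return list(contact.get("group_ids", []))


def groups_from_relationships(docs, contact_id):
    """Option 2 read: collect group IDs from the relationship documents."""
    return [d["group_id"] for d in docs
            if d.get("type") == "membership" and d["contact_id"] == contact_id]


def add_group_key_list(contact, group_id):
    """Option 1 write: produces a new version of the whole contact document,
    so two concurrent writers touch the same document and may conflict."""
    updated = dict(contact)
    updated["group_ids"] = contact.get("group_ids", []) + [group_id]
    return updated


def add_group_relationship(docs, rel_id, contact_id, group_id):
    """Option 2 write: adds a brand-new document; existing ones are untouched."""
    new_doc = {"_id": rel_id, "type": "membership",
               "contact_id": contact_id, "group_id": group_id}
    return docs + [new_doc]
```

In both cases you only get group IDs back; the group documents themselves still have to be fetched by ID afterwards.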