Date: Fri, 6 Nov 2009 07:02:23 +0100
Subject: Re: How do you handle multiple document groups?
From: Robert Campbell
To: user@couchdb.apache.org

> Can I ask what the advantage of this is? Is this for replication? I like
> having typed databases; it seems like that will be an easy way to solve
> scaling problems.

For me it's just about grouping. According to CouchDB: The Definitive
Guide, a CouchDB database is "...a bucket that holds 'related data'."
So in my example Blog application, I see two groups of related data:

1) all the data for a particular domain (posts, comments, etc.)
2) all the data which is semantically similar (all posts grouped
   together, all comments grouped together, all users grouped together,
   etc.)

My problem is that CouchDB really only allows one level of grouping.
This means I either have to simulate multiple groups by using some
database naming convention, or I have to just pick one group (as Jan
suggested, domain) and leave documents of all different semantic types
just piled in together (post docs, comment docs, user docs all over
the place). Of course this isn't as bad as it sounds, because I can
make a View to sort them all out.

As to Adam's question, I haven't really thought about
replication/scaling yet; I'm just a couple of chapters into the book I
quoted and I'm still playing with the 0.8.0 version sitting on my
Ubuntu (why isn't there a 0.9.0 deb?).
I'm still not sure which of the two options I'll select:
myapp_com/posts, myapp_com/comments databases or just myapp_com
database with all types lying within + Views to sort them out. Either
way it should be fun :-)

On Thu, Nov 5, 2009 at 8:12 PM, Adam Wolff wrote:
> Can I ask what the advantage of this is? Is this for replication? I like
> having typed databases; it seems like that will be an easy way to solve
> scaling problems.
>
> A
>
> On Thu, Nov 5, 2009 at 3:03 AM, Jan Lehnardt wrote:
>
>> On 5 Nov 2009, at 11:47, Robert Campbell wrote:
>>
>>> Okay, I _do_ like that CouchDB lets me use "/" in a database name, so
>>> I can hopefully do "http://xxx/myapp_com/users" which feels better
>>> thanks to the / delimiter.
>>>
>>> For something like "myapp.com" domain and "users" database. I'll just
>>> have to replace all periods with underscores or something.
>>>
>>
>> I wouldn't create databases per type.
>>
>> say user a has the domains foo.com and bar.com
>>
>> for that user the databases /a/foo_com/ and /a/bar_com
>> are created. In these databases, all documents for the
>> respective domain live. If you need additional info for
>> the user that owns the domain that is not specific to
>> the domain, I'd go with putting all user-specific info
>> in each of the user's databases. This is duplication,
>> but it doesn't really hurt, except maybe for users
>> with hundreds and thousands of domains. In which
>> case he/she probably pays you enough money to
>> solve it :)
>>
>> Cheers
>> Jan
>> --
>>
>>> On Thu, Nov 5, 2009 at 11:19 AM, Jan Lehnardt wrote:
>>>
>>>> On 5 Nov 2009, at 11:02, Robert Campbell wrote:
>>>>
>>>>> First, let me say that I posted this question to StackOverflow here:
>>>>> http://stackoverflow.com/questions/1674662/nested-databases-in-couchdb
>>>>>
>>>>> Here is what I mean by multiple document groups:
>>>>> Assume you are building a blog engine which services hundreds of
>>>>> domains.
>>>>> Within each domain, you would have similar groups of
>>>>> documents: users, posts, comments, etc. How would you structure this
>>>>> in CouchDB?
>>>>>
>>>>> One way would be to create a User database which contains users from
>>>>> all domains. The user documents would have a "domain" field which
>>>>> denotes which domain this user is valid on. Likewise, you would have a
>>>>> single Post database, with each post document having a domain field
>>>>> and so on. I don't like this solution because 1) you will have lots of
>>>>> data duplication, where every single document has to denote the domain
>>>>> it's connected to. 2) It seems like it could be a security problem.
>>>>> One vulnerability in your view functions could accidentally return one
>>>>> domain's user set to another, etc. If we change the blog engine into
>>>>> an enterprise document management/workflow engine, you could have
>>>>> serious problems exposing one document to a competitor.
>>>>>
>>>>> Another way you could do it is by bringing the domain group up into
>>>>> the database level. This means you'd have "MyApp.com-Users",
>>>>> "MyApp.com-Posts", "MyApp.com-Comments", "AnotherApp.net-Users", etc.
>>>>> This helps reduce the data duplication and maybe the security issue a
>>>>> bit, because now your documents don't need to specify a "domain" field
>>>>> everywhere. Your app would follow the simple naming convention to
>>>>> select the proper database. The disadvantage of this is that it feels
>>>>> like a hack. What I really want is a MyApp database which contains
>>>>> User, Post, and Comment sub-databases (for example) which then contain
>>>>> the documents for that top-level group (domain) and lower-level group
>>>>> (posts).
>>>>>
>>>>> How do you guys address this problem?
>>>>>
>>>>
>>>> Give each user/domain combo a separate database. Lots of databases are
>>>> no problem.
>>>>
>>>> Cheers
>>>> Jan
>>>> --
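
For anyone reading the thread later, here is a minimal Python sketch of the two options discussed above. It is an illustration only and nobody in the thread posted it. The helper names (db_name, db_url, group_by_type), the "type" field on documents, and the localhost:5984 server address are all assumptions made for the example. One detail worth checking against your CouchDB version: a "/" inside a database name generally has to be sent percent-encoded (%2F) in the request URL, so the sketch encodes it rather than using the bare slash.

```python
import re
from collections import defaultdict
from typing import Optional
from urllib.parse import quote


def db_name(domain: str, group: Optional[str] = None) -> str:
    """Build a name like 'myapp_com' or 'myapp_com/posts' (hypothetical helper)."""
    # Replace periods with underscores, as suggested in the thread.
    name = domain.lower().replace(".", "_")
    if group:
        name = f"{name}/{group.lower()}"
    # Assumed rule: names start with a lowercase letter and use only
    # a-z, 0-9 and the characters _ $ ( ) + - /
    if not re.fullmatch(r"[a-z][a-z0-9_$()+/-]*", name):
        raise ValueError(f"not a usable database name: {name!r}")
    return name


def db_url(server: str, name: str) -> str:
    """Build the request URL, percent-encoding the '/' in the database name."""
    return f"{server.rstrip('/')}/{quote(name, safe='')}"


def group_by_type(docs):
    """Client-side stand-in for a view keyed on an assumed 'type' field."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc.get("type", "unknown")].append(doc)
    return dict(groups)


if __name__ == "__main__":
    # Option 1: one database per domain and type.
    print(db_url("http://localhost:5984", db_name("MyApp.com", "posts")))
    # Option 2: one database per domain, documents sorted out by type.
    docs = [{"type": "post", "title": "Hello"}, {"type": "comment", "text": "Hi"}]
    print(sorted(group_by_type(docs)))
```

With option 2 the grouping would normally live in a CouchDB view rather than in client code; group_by_type only shows the shape of the result such a view would give you.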