couchdb-user mailing list archives

From Robert Campbell <rrc...@gmail.com>
Subject Re: How do you handle multiple document groups?
Date Fri, 06 Nov 2009 06:02:23 GMT
> Can I ask what the advantage of this is? Is this for replication? I like
> having typed databases; it seems like that will be an easy way to solve
> scaling problems.

For me it's just about grouping. According to CouchDB: The Definitive
Guide, a CouchDB database is "...a bucket that holds 'related data'."

So in my example Blog application, I see two groups of related data:
1) all the data for a particular domain (posts, comments, etc)
2) all the data which is semantically similar (all posts grouped
together, all comments grouped together, all users grouped together,
etc)

My problem is that CouchDB really only allows 1 level of grouping.
This means I either have to simulate multiple groups by using some
database naming convention or I have to just pick one group (as Jan
suggested, domain) and leave documents of all different semantic types
just piled in together (post docs, comment docs, user docs all over
the place). Of course this isn't as bad as it sounds, because I can
make a View to sort them all out.
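
To make the "View to sort them all out" idea concrete, here is a minimal sketch. It assumes every document carries a "type" field ("post", "comment", "user"), which is a convention of this example and not something CouchDB requires. The design document name "blog" and view name "by_type" are hypothetical. The emulate_by_type helper is only a local stand-in that shows the rows such a view would produce, sorted by key; it does not talk to a server.

```python
# Hypothetical design document. The "type" field is an assumed convention;
# the map function is JavaScript stored as a string, as CouchDB views are.
BY_TYPE_DESIGN_DOC = {
    "_id": "_design/blog",
    "views": {
        "by_type": {
            "map": "function(doc) { if (doc.type) { emit(doc.type, null); } }"
        }
    },
}


def emulate_by_type(docs):
    """Local stand-in for the view: one row per typed doc, sorted by key."""
    rows = [
        {"id": doc["_id"], "key": doc["type"], "value": None}
        for doc in docs
        if doc.get("type")
    ]
    return sorted(rows, key=lambda row: (row["key"], row["id"]))


# Mixed documents living together in a single per-domain database.
docs = [
    {"_id": "p1", "type": "post", "title": "Hello"},
    {"_id": "u1", "type": "user", "name": "robert"},
    {"_id": "c1", "type": "comment", "post": "p1"},
    {"_id": "misc"},  # untyped docs are skipped by the map function
]

rows = emulate_by_type(docs)
post_ids = [row["id"] for row in rows if row["key"] == "post"]
```

Querying the real view with a key of "post" would return only the post rows, so the documents can share one database and still be read back by type.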

As to Adam's question, I haven't really thought about
replication/scaling yet; I'm just a couple chapters into the book I
quoted and I'm still playing with the 0.8.0 version sitting on my
Ubuntu (why isn't there a 0.9.0 deb?).

I'm still not sure which of the two options I'll select:
myapp_com/posts, myapp_com/comments databases or just myapp_com
database with all types lying within + Views to sort them out. Either
way it should be fun :-)
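
Either option comes down to a naming convention, which can be sketched as a small helper. The function names, the choice to lowercase the domain, and the server address are assumptions of this example. One detail worth noting: a "/" inside a database name has to be sent as %2F in the request URL, otherwise it reads as a path separator.

```python
from urllib.parse import quote


def database_name(domain, group=None):
    """Map a domain (and optional document group) to a database name.

    Periods become underscores, and the name is lowercased because
    CouchDB database names may not contain uppercase letters.
    """
    name = domain.lower().replace(".", "_")
    if group:
        name = f"{name}/{group.lower()}"
    return name


def database_url(server, name):
    """Build the database URL, encoding "/" in the name as %2F."""
    return f"{server.rstrip('/')}/{quote(name, safe='')}"


# Option 1: one database per domain and type.
posts_db = database_name("MyApp.com", "Posts")
# Option 2: one database per domain, with Views to separate the types.
domain_db = database_name("MyApp.com")

# The server address is an assumption (CouchDB's default port is 5984).
posts_url = database_url("http://localhost:5984", posts_db)
```

With the second option the helper is called without a group, and the type of each document is handled inside the database instead of in its name.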



On Thu, Nov 5, 2009 at 8:12 PM, Adam Wolff <awolff@gmail.com> wrote:
> Can I ask what the advantage of this is? Is this for replication? I like
> having typed databases; it seems like that will be an easy way to solve
> scaling problems.
>
> A
>
> On Thu, Nov 5, 2009 at 3:03 AM, Jan Lehnardt <jan@apache.org> wrote:
>
>>
>> On 5 Nov 2009, at 11:47, Robert Campbell wrote:
>>
>>> Okay, I _do_ like that CouchDB lets me use "/" in a database name, so
>>> I can hopefully do "http://xxx/myapp_com/users" which feels better
>>> thanks to the / delimiter.
>>>
>>> For something like "myapp.com" domain and "users" database. I'll just
>>> have to replace all periods with underscores or something.
>>>
>>
>> I wouldn't create databases per type.
>>
>> say user a has the domains foo.com and bar.com
>>
>> for that user the databases /a/foo_com/ and /a/bar_com
>> are created. In these databases, all documents for the
>> respective domain live. If you need additional info for
>> the user that owns the domain that is not specific to
>> the domain, I'd go with putting all user-specific info
>> in each of the user's databases. This is duplication,
>> but it doesn't really hurt, except maybe for users
>> with hundreds and thousands of domains. In which
>> case he/she probably pays you enough money to
>> solve it :)
>>
>> Cheers
>> Jan
>> --
>>
>>
>>
>>
>>>
>>> On Thu, Nov 5, 2009 at 11:19 AM, Jan Lehnardt <jan@apache.org> wrote:
>>>
>>>>
>>>> On 5 Nov 2009, at 11:02, Robert Campbell wrote:
>>>>
>>>>> First, let me say that I posted this question to StackOverflow here:
>>>>> http://stackoverflow.com/questions/1674662/nested-databases-in-couchdb
>>>>>
>>>>> Here is what I mean by multiple document groups:
>>>>> Assume you are building a blog engine which services hundreds of
>>>>> domains. Within each domain, you would have similar groups of
>>>>> documents: users, posts, comments, etc. How would you structure this
>>>>> in CouchDB?
>>>>>
>>>>> One way would be to create a User database which contains users from
>>>>> all domains. The user documents would have a "domain" field which
>>>>> denotes which domain this user is valid on. Likewise, you would have a
>>>>> single Post database, with each post document having a domain field
>>>>> and so on. I don't like this solution because 1) you will have lots of
>>>>> data duplication, where every single document has to denote the domain
>>>>> it's connected to. 2) It seems like it could be a security problem.
>>>>> One vulnerability in your view functions could accidentally return one
>>>>> domain's user set to another, etc. If we change the blog engine into
>>>>> an enterprise document management/workflow engine, you could have
>>>>> serious problems exposing one document to a competitor.
>>>>>
>>>>> Another way you could do it is by bringing the domain group up into
>>>>> the database level. This means you'd have "MyApp.com-Users",
>>>>> "MyApp.com-Posts", "MyApp.com-Comments", "AnotherApp.net-Users", etc.
>>>>> This helps reduce the data duplication and maybe the security issue a
>>>>> bit, because now your documents don't need to specify a "domain" field
>>>>> everywhere. Your app would follow the simple naming convention to
>>>>> select the proper database. The disadvantage of this is that it feels
>>>>> like a hack. What I really want is a MyApp database which contains
>>>>> User, Post, and Comment sub-databases (for example) which then contain
>>>>> the documents for that top-level group (domain) and lower-level group
>>>>> (posts).
>>>>>
>>>>> How do you guys address this problem?
>>>>>
>>>>
>>>> Give each user/domain combo a separate database. Lots of databases are
>>>> no problem.
>>>>
>>>> Cheers
>>>> Jan
>>>> --
>>>>
>>>>
>>>>
>>>
>>
>
