From: Jesse Hallett
Date: Sun, 30 Aug 2009 17:50:31 -0700
Message-ID: <8a02878f0908301750g620a0cd0t7a62242a26a7926@mail.gmail.com>
References: <4A9AD394.20608@sinesignal.com> <7B8A7655-B750-4B0B-93BB-A48FE8C47A9E@bzero.se> <4A9B070C.1030207@sinesignal.com> <20090831001002.GA555@pb.local>
Subject: Re: Would there be a problem with storing documents with this structure?
To: user@couchdb.apache.org

On Sun, Aug 30, 2009 at 5:21 PM, Chris Anderson wrote:
> On Sun, Aug 30, 2009 at 5:10 PM, Tom Sante wrote:
>> On Sun, Aug 30, 19:11, Dale Ragan wrote:
>>> >
>>> >>Basically I have a document, with an id, rev, type, and Content
>>> >>keys. The Content key
>>> >>holds the serialized object that is to be stored as its value.
>>> >>Are there any pitfalls
>>> >>with this design? I have attached a sample below:
>>> >I should say I'm in no way an expert, I'm starting to wrap my head
>>> >around document modelling myself. I've been reading up on CouchDB
>>> >for a couple of days now and find it really interesting.
>>> >
>>> >Anyway, on to your document. First, why duplicate the manager id?
>>> >Isn't there a risk of them getting out of sync?
>>> There is no chance that the Ids will get out of sync. I handle
>>> generating the Ids when the object is persisted for the first time.
>>> >
>>> >I think you will run into many conflicts if subordinates are
>>> >updated independently. Each subordinate has an id, is there
>>> >another document with more information about subordinates?
>>> >In that case, why not have all information in there and connect them
>>> >with a managerId attribute instead?
>>> This is just an example object that I modeled up for the post.
>>> Subordinates in this case are updated another way. They are just
>>> referenced by the Manager object. Basically, a one-to-many
>>> relationship. If you wanted to update one, you would use a document
>>> that wrapped the Worker object. Is it better to normalize the data
>>> even in CouchDB?
>>>
>>> I am new to CouchDB also. I am trying to abstract any need for a
>>> domain model needing to know about CouchDB's terms, like Rev. I am
>>> writing an API in a statically typed language and I am experimenting
>>> with the best way to store the object that is given to my API. This
>>> design helps and is one of the few I have come up with.

Putting serialized data inside a 'Content' attribute is a good way to
go. I have seen the same pattern recommended elsewhere. It lets you
serialize arbitrary data without having collisions with metadata;
specifically the '_id', '_rev', and 'type' attributes. And map
functions can pull any indexable data out of nested attributes, so I
don't think this approach has any particular performance implications.

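To make the wrapper idea concrete, here is a minimal sketch in plain
Python. It does not talk to CouchDB; wrap, unwrap, and map_logins are
hypothetical helper names, and map_logins only imitates what a map
function would do when it reads attributes nested under 'Content'. The
inner 'type' value on the manager is made up to show that it cannot
collide with the envelope's 'type'.

```python
import json
import uuid


def wrap(obj, type_name, doc_id=None, rev=None):
    """Build a CouchDB-style document whose payload lives under 'Content'."""
    doc = {"_id": doc_id or str(uuid.uuid4()), "type": type_name, "Content": obj}
    if rev is not None:  # '_rev' is only known once the document has been saved
        doc["_rev"] = rev
    return doc


def unwrap(doc):
    """Return the domain object, free of '_id', '_rev', and 'type'."""
    return doc["Content"]


def map_logins(doc):
    """Plain-Python stand-in for a map function: yield (key, value) rows
    built from attributes nested inside 'Content'."""
    if doc.get("type") == "Manager":
        for worker in doc["Content"].get("Subordinates", []):
            yield worker["Login"], worker["Name"]


manager = {
    "Id": "000144df-6f11-49f1-a502-e0dab3592326",
    "Name": "6",
    "Login": "6-login",
    "Subordinates": [
        {"Id": "6bcdea2f-2439-4785-ab59-2ee612435705",
         "Name": "Bob", "Login": "bbob", "Hours": 40},
    ],
    # Hypothetical domain attribute: it sits inside 'Content', so it does
    # not overwrite the envelope's own 'type'.
    "type": "salaried",
}

doc = wrap(manager, "Manager", doc_id=manager["Id"])
rows = list(map_logins(doc))
print(json.dumps(doc, indent=2))
print(rows)
```

The domain object never has to carry '_id' or '_rev' itself; only the
wrapper knows about them.
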
>>> >>{
>>> >>  "_id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>  "_rev": "1-308931e16105b566e1fb48106c85116e",
>>> >>  "type": "Manager",
>>> >>  "Content": {
>>> >>      "Subordinates": [
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 40,
>>> >>              "Id": "6bcdea2f-2439-4785-ab59-2ee612435705",
>>> >>              "Name": "Bob",
>>> >>              "Login": "bbob"
>>> >>          },
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 40,
>>> >>              "Id": "b0d156c9-ea3f-4c4f-b49d-ab19bff64dd8",
>>> >>              "Name": "Alice",
>>> >>              "Login": "aalice"
>>> >>          },
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 20,
>>> >>              "Id": "12b6dbbc-44e8-43c2-8142-11fc6c1d23df",
>>> >>              "Name": "Eve",
>>> >>              "Login": "eeve"
>>> >>          }
>>> >>      ],
>>> >>      "Id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>      "Name": "6",
>>> >>      "Login": "6-login"
>>> >>  }
>>> >>}
>>> >>
>>> >>Basically the content is a Manager type object with an Id, Name,
>>> >>Login, and Subordinates.
>>> >>Subordinates are Workers with an Id, Name, Login, Hours, and an
>>> >>Address. The _id and the Id of
>>> >>the Manager object are the same. Basically the Document object
>>> >>is just a wrapper around what is
>>> >>given to be persisted.
>>> >>
>>> >>Thanks,
>>> >>
>>> >>Dale
>>
>> Like Martin said, why all this duplication?
>> Give each worker its own document and only add the ids of the
>> workers as subordinates. So you can change worker details without
>> having to change the manager document.
>
> if you put the manager_id on the worker, then you can pull out a
> manager and all its workers in a single query if you like, using just
> a map view.
>
> here's the canonical write up of the technique:
>
> http://www.cmlenz.net/archives/2007/10/couchdb-joins
>
>>
>> It might even be better to only store the manager's own info in the
>> manager doc and save any worker-manager relations in the respective
>> worker document by referencing the manager id in the worker doc + how
>> many hours he worked for that manager.
>> This makes it easier if a worker changes to work for another manager:
>> you just reference the manager id in the worker doc, still keeping the
>> history of previous managers that worker had in the past.
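
The manager_id approach Chris describes can be sketched in plain Python
as below. This only simulates a sorted view in memory and is not
CouchDB's API: build_view, query, the 'mgr-1'/'mgr-2' ids, and storing
manager_id inside 'Content' are all assumptions made for illustration.
The idea is that managers and workers emit compound keys that share the
manager id, with 0 for the manager and 1 for a worker, so that sorting
by key puts each manager directly ahead of its own workers.

```python
def map_manager_with_workers(doc):
    """Stand-in for a map function: a manager emits [id, 0], a worker
    emits [manager_id, 1], so both share the same leading key element."""
    if doc["type"] == "Manager":
        yield [doc["_id"], 0], doc["Content"]["Name"]
    elif doc["type"] == "Worker":
        yield [doc["Content"]["manager_id"], 1], doc["Content"]["Name"]


def build_view(docs, map_fn):
    """Run map_fn over every document and sort the rows by key."""
    rows = [(key, value) for doc in docs for key, value in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


def query(view, manager_id):
    """Return the manager row followed by its worker rows."""
    return [(key, value) for key, value in view if key[0] == manager_id]


# Hypothetical documents; workers are stored separately from managers.
docs = [
    {"_id": "w-bob", "type": "Worker",
     "Content": {"Name": "Bob", "manager_id": "mgr-1"}},
    {"_id": "w-eve", "type": "Worker",
     "Content": {"Name": "Eve", "manager_id": "mgr-2"}},
    {"_id": "w-alice", "type": "Worker",
     "Content": {"Name": "Alice", "manager_id": "mgr-1"}},
    {"_id": "mgr-1", "type": "Manager", "Content": {"Name": "6"}},
    {"_id": "mgr-2", "type": "Manager", "Content": {"Name": "2"}},
]

view = build_view(docs, map_manager_with_workers)
result = query(view, "mgr-1")
print(result)
```

Changing a worker's details, or moving a worker to another manager, then
touches only that worker's document. A real view would be read by key
range rather than filtered in Python as query does here.
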