On Sun, Aug 30, 2009 at 5:21 PM, Chris Anderson wrote:
> On Sun, Aug 30, 2009 at 5:10 PM, Tom Sante wrote:
>> On Sun, Aug 30, 19:11, Dale Ragan wrote:
>>> >
>>> >>Basically I have a document, with an id, rev, type, and Content
>>> >>keys.  The Content key holds the serialized object that is to be
>>> >>stored for its value.  Are there any pitfalls with this design?
>>> >>I have attached a sample below:
>>> >I should say I'm in no way an expert, I'm starting to wrap my head
>>> >around document modelling myself. I've been reading up on couchdb
>>> >a couple of days now and find it really interesting.
>>> >
>>> >Anyway, on to your document. First, why duplicate the manager id?
>>> >Isn't there a risk of them getting out of sync?
>>> There is no chance that the Ids will get out of sync. I handle
>>> generating the Ids when the object is persisted for the first time.
>>> >
>>> >I think you will run into many conflicts if subordinates are
>>> >updated independently. Each subordinate has an id, is there
>>> >another document with more information about subordinates? In that
>>> >case, why not have all information in there and connect them with
>>> >a managerId attribute instead?
>>> This is just an example object that I modeled up for the post.
>>> Subordinates in this case are updated another way.  They are just
>>> referenced by the Manager object.  Basically, a one-to-many
>>> relationship.  If you wanted to update one, you would use a document
>>> that wrapped the Worker object.  Is it better to normalize the data
>>> even in CouchDB?
>>>
>>> I am new to CouchDB also.  I am trying to abstract any need for a
>>> domain model needing to know about CouchDB's terms, like Rev.  I am
>>> writing an API in a statically typed language and I am experimenting
>>> with the best way to store the object that is given to my API.  This
>>> design helps and is one of the few I have come up with.

Putting serialized data inside a 'Content' attribute is a good way to go.
I have seen the same pattern recommended elsewhere. It lets you
serialize arbitrary data without collisions with metadata, specifically
the '_id', '_rev', and 'type' attributes. And map functions can pull any
indexable data out of nested attributes, so I don't think this approach
has any particular performance implications.

>>> >>{
>>> >>  "_id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>  "_rev": "1-308931e16105b566e1fb48106c85116e",
>>> >>  "type": "Manager",
>>> >>  "Content": {
>>> >>      "Subordinates": [
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 40,
>>> >>              "Id": "6bcdea2f-2439-4785-ab59-2ee612435705",
>>> >>              "Name": "Bob",
>>> >>              "Login": "bbob"
>>> >>          },
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 40,
>>> >>              "Id": "b0d156c9-ea3f-4c4f-b49d-ab19bff64dd8",
>>> >>              "Name": "Alice",
>>> >>              "Login": "aalice"
>>> >>          },
>>> >>          {
>>> >>              "Address": {
>>> >>                  "Street": "123 Somewhere St.",
>>> >>                  "City": "Kalamazoo",
>>> >>                  "State": "MI",
>>> >>                  "Zip": "12345"
>>> >>              },
>>> >>              "Hours": 20,
>>> >>              "Id": "12b6dbbc-44e8-43c2-8142-11fc6c1d23df",
>>> >>              "Name": "Eve",
>>> >>              "Login": "eeve"
>>> >>          }
>>> >>      ],
>>> >>      "Id": "000144df-6f11-49f1-a502-e0dab3592326",
>>> >>      "Name": "6",
>>> >>      "Login": "6-login"
>>> >>  }
>>> >>}
>>> >>
>>> >>Basically the content is a Manager type object with an Id, Name,
>>> >>Login, and Subordinates.  Subordinates are Workers with an Id,
>>> >>Name, Login, Hours, and an Address.  The _id and the Id of the
>>> >>Manager object are the same.  Basically the Document object is
>>> >>just a wrapper around what is given to be persisted.
>>> >>
>>> >>Thanks,
>>> >>
>>> >>Dale
>>
>> Like Martin said, why all this duplication?
>> Give each worker its own document and only add the ids of the
>> workers as subordinates. So you can change worker details without
>> having to change the manager document.
>
> if you put the manager_id on the worker, then you can pull out a
> manager and all its workers in a single query if you like, using just
> a map view.
>
> here's the canonical write up of the technique:
>
> http://www.cmlenz.net/archives/2007/10/couchdb-joins
>
>>
>> It might even be better to only store the manager's own info in the
>> manager doc and save any worker-manager relations in the respective
>> worker document by referencing the manager id in the worker doc + how
>> many hours he worked for that manager.
>> This makes it easier if a worker changes to work for another manager:
>> you just reference the manager id in the worker doc, still keeping the
>> history of the previous managers that worker had in the past.