On Dec 31, 2008, at 7:40 PM, Antony Blakey wrote:

>
> On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:
>
>> What trouble? I think this is *exactly* what should be done - have
>> CouchDB store documents that are :
>>
>> {
>>   metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that
>> needs to be added in the future, like other metadata like last
>> update date... },
>>   userdata : { .... the document you want to store .... }
>> }
>>
>> and then offer APIs that let you :
>>
>> a) get to this document, for libraries and clients that know they
>> are talking to Couch and want to manipulate at this level
>>
>> b) return and accept the userdocument directly, for clients that
>> just want to consume or produce JSON data, w/o caring about the
>> internal housekeeping
>
> One of the issues complicating the logic of this discussion is that
> the document id is both metadata and, conceptually, a document member.

Well, I don't understand why it has to be. Certainly it's a
convenience, and I wonder how much of current thinking has been
influenced by the fact that this is what people are used to.

I can understand why CDB needs a unique document identifier, and it
certainly would be nice to have the option of having it shoved into
the user doc on creation. But

a) I think that I should have the choice as to what that identifier
is (e.g. configure the database to inject the couch metadata _id as
"_couchID" or whatever...)

b) I should have the choice to not have it injected at all

So why do I think this is a problem? The 10gen appserver auto-injects
an id field into the JSON documents that are stored in our database,
Mongo. Can you guess what the key is?

Yep - "_id"

So how can I roundtrip a doc from 10gen through couch and back? I
can't.

I've made the same argument at 10gen - that I should be able to set
the identifier (and that it shouldn't be in the doc in the first
place). Then, I'd just have a doc with

{
  _couchID : ....
  _mongoID : ....
  ... data...
}

(if I chose to shove the ID into the doc)

> That's why, although the purest model is to have the userdata as a
> member within a Couch document as you suggest, this doesn't look
> that appealing:
>
> {
>   metadata: {
>     id: ...
>     rev: ...
>     ...
>   }
>   data: {
>     ... the user's document ...
>   }
> }

I can see how this isn't appealing from the perspective of current
APIs, but a rethinking of this issue (_id and _rev) also warrants a
rethinking of the APIs to deal with this. E.g. an API that lets me get

a) the whole doc above
b) metadata only
c) userdata only

>
>
> Furthermore, from a scalability perspective, always having the
> metadata when you have the document, isn't a problem - the metadata
> is constrained.

And from what I understand, it already exists in that manner, right?
I mean, for efficiency, I'd guess that the _id, _rev and in the
future, other metadata (like insert date, last modification date...)
would be kept outside of the doc, so that they can be read and updated
w/o having to serialize/deserialize the whole user document.

> The reverse situation of always having the data when you have the
> metadata, is not constrained because the data is arbitrarily large.
> IMO this means that a solution such as this:
>
> {
>   id: ...
>   rev: ...
>   ...
>   data: {
>     ... the user's document ...
>   }
> }
>
> isn't such a good idea compared to this:
>
> {
>   _metadata: {
>     id: ...
>     rev: ...
>   }
>   ... the user's document ...
> }

That only solves the problem in that there's only one reserved magical
key (_metadata), but I don't think that really changes anything. You
still need to make sure any document you want to store in couch
doesn't have a top-level _metadata element.

And while I don't know how couch works internally, we *are* really
only talking about how the data is returned on an API call via the
REST API or what I assume is an internal API for the M/R View stuff.
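
To make the three views (whole doc, metadata only, userdata only)
concrete, here is a minimal sketch in Python. It is an illustration
only: `wrap`, `view`, and the mode names "all", "metaonly" and
"useronly" are hypothetical, and nothing here is CouchDB's actual API
or storage layout. It just assumes the envelope shape discussed in
this thread, with the user's document kept untouched under `userdata`.

```python
import copy


def wrap(user_doc, doc_id, rev):
    """Put the user's document, unmodified, inside a metadata envelope.

    Hypothetical layout: the store's own identifier and revision live
    only under "metadata", never inside the user's document.
    """
    return {
        "metadata": {"_id": doc_id, "_rev": rev},
        "userdata": copy.deepcopy(user_doc),
    }


def view(envelope, mode="all"):
    """Return the whole envelope, only its metadata, or only the user doc."""
    if mode == "all":
        return copy.deepcopy(envelope)
    if mode == "metaonly":
        return copy.deepcopy(envelope["metadata"])
    if mode == "useronly":
        return copy.deepcopy(envelope["userdata"])
    raise ValueError("mode must be 'all', 'metaonly' or 'useronly'")


# A document that already carries its own "_id", as one that passed
# through another system might.
foreign_doc = {"_id": "mongo-123", "name": "example"}
stored = wrap(foreign_doc, doc_id="couch-abc", rev="1")

# The user's document comes back exactly as it went in.
assert view(stored, "useronly") == foreign_doc
# The store's identifier is still available, separately.
assert view(stored, "metaonly")["_id"] == "couch-abc"
```

Because the store's identifier never enters the user's document, a
document with its own "_id" key round-trips unchanged, and a client
that only wants the housekeeping data never has to touch the payload.
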
If you had an API that let you choose all, metaonly or useronly, you
would not be burdened with stuff you didn't want or need.

>
> Unfortunately the reserved token makes the structure non-reflexive
> without transformation, and although that's not currently an issue,
> I can imagine it complicating certain use-cases. It makes the system
> more complicated to reason about.
>
> I'm struggling to objectively evaluate this model and your reflexive
> model - given Damien's attitude to this issue, my motivation to do
> so is somewhat depressed :/

If you could point me to an explanation of why changing this is bad,
I'd love to catch up on the discussion. I assume it's a technical
reason?

geir

>
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Did you hear about the Buddhist who refused Novocain during a root
> canal?
> His goal: transcend dental medication.