couchdb-user mailing list archives

From "Geir Magnusson Jr." <>
Subject Re: Changing rev to _rev in view results (Was: Re: newbie question #1)
Date Thu, 01 Jan 2009 21:45:40 GMT

On Dec 31, 2008, at 7:40 PM, Antony Blakey wrote:

> On 31/12/2008, at 11:29 PM, Geir Magnusson Jr. wrote:
>> What trouble?  I think this is *exactly* what should be done - have  
>> CouchDB store documents that are :
>> {
>>   metadata : { _rev : X, _id : Y, _woogie: Z, .... anything that  
>> needs to be added in the future, like other metadata like last  
>> update date... },
>>   userdata : {  .... the document you want to store .... }
>> }
>> and then offer APIs that let you :
>> a) get to this document, for libraries and clients that know they  
>> are talking to Couch and want to manipulate at this level
>> b) return and accept the userdocument directly, for clients that  
>> just want to consume or produce  JSON data, w/o caring about the  
>> internal housekeeping
> One of the issues complicating the logic of this discussion is that  
> the document id is both metadata and, conceptually, a document member.

Well, I don't understand why it has to be. Certainly it's a  
convenience, and I wonder how much of current thinking has been  
influenced by the fact that this is what people are used to.

I can understand why CDB needs a unique document identifier, and it  
certainly would be nice to have the option of having it shoved into  
the user doc on creation.  But

a) I think that I should have the choice as to what that identifier  
is  (e.g.  Configure the database to inject the couch metadata _id as  
"_couchID" or whatever...)

b) I should have the choice to not have it injected at all

So why do I think this is a problem?  The 10gen appserver auto-injects  
an id field into the JSON documents that are stored in our database,  
Mongo.  Can you guess what the key is?  Yep - "_id"

So how can I roundtrip a doc from 10gen through couch and back?  I  

I've made the same argument at 10gen - that I should be able to set  
the identifier (and that it shouldn't be in the doc in the first place).

Then, I'd just have a doc with

    _couchID : ....
    _mongoID : ....
     ... data...

(if I chose to shove the ID into the doc)
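
To make (a) and (b) concrete, here is a minimal sketch in Python. The function name inject_id and the id_key parameter are hypothetical, made up for illustration; neither Couch nor Mongo offers this today.

```python
import copy


def inject_id(userdata, doc_id, id_key=None):
    """Return a copy of userdata, optionally carrying doc_id under id_key.

    Hypothetical helper: id_key=None means "don't inject at all" (option b),
    any other value is the caller-chosen member name (option a).
    """
    doc = copy.deepcopy(userdata)
    if id_key is None:
        return doc
    if id_key in doc:
        # Refuse to silently overwrite a member the user already owns.
        raise ValueError(f"user document already has a {id_key!r} member")
    doc[id_key] = doc_id
    return doc


doc = {"title": "hello"}
with_couch = inject_id(doc, "abc123", id_key="_couchID")
with_both = inject_id(with_couch, "xyz789", id_key="_mongoID")
```

With distinct key names the two identifiers sit side by side and neither store clobbers the other's.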

> That's why, although the purest model is to have the userdata as a  
> member within a Couch document as you suggest, this doesn't look  
> that appealing:
> {
>  metadata: {
>    id: ...
>    rev: ...
>    ...
>  }
>  data: {
>    ... the user's document ...
>  }
> }

I can see how this isn't appealing from the perspective of the current  
APIs, but a rethinking of this issue (_id and _rev) also warrants a  
rethinking of the APIs to deal with this.

E.g. an API that lets me get a) the whole doc above  b) metadata only  
c) userdata only
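
A rough sketch of such an accessor, in Python. The name view_document and the part values are assumptions for illustration, and the envelope layout is the metadata/userdata one proposed earlier in this thread, not anything Couch currently exposes.

```python
def view_document(stored, part="all"):
    """Return the whole envelope, only its metadata, or only the user document.

    'stored' is assumed to look like {"metadata": {...}, "userdata": {...}}.
    """
    if part == "all":
        return stored
    if part in ("metadata", "userdata"):
        return stored[part]
    raise ValueError(f"unknown part: {part!r}")


stored = {
    "metadata": {"_id": "abc123", "_rev": "1"},
    "userdata": {"title": "hello"},
}
meta_only = view_document(stored, "metadata")
user_only = view_document(stored, "userdata")
```

A client that only produces or consumes JSON would ask for the userdata part and never see the housekeeping members.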

> Furthermore, from a scalability perspective, always having the  
> metadata when you have the document, isn't a problem - the metadata  
> is constrained.

And from what I understand, it already exists in that manner, right?   
I mean, for efficiency, I'd guess that the _id, _rev and in the  
future, other metadata (like insert date, last modification date...)  
would be kept outside of the doc, so that they can be read and updated  
w/o having to serialize/deserialize the whole user document.

> The reverse situation of always having the data when you have the  
> metadata, is not constrained because the data is arbitrarily large.  
> IMO this means that a solution such as this:
> {
>  id: ...
>  rev: ...
>  ...
>  data: {
>    ... the user's document ...
>  }
> }
> isn't such a good idea compared to this:
> {
>  _metadata: {
>    id: ...
>    rev: ...
>  }
>  ... the user's document ...
> }

That only solves the problem in that there's only one reserved magical  
key (_metadata), but I don't think that really changes anything.  You  
still need to make sure any document you want to store in couch  
doesn't have a top-level _metadata element.
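
To illustrate the collision, a small Python sketch of the single-reserved-key layout. The helper name wrap_for_storage is hypothetical, and this is not how Couch is implemented.

```python
RESERVED_KEY = "_metadata"  # the one magical key in the layout quoted above


def wrap_for_storage(userdata, doc_id, rev):
    """Merge metadata into the user document under the single reserved key."""
    if RESERVED_KEY in userdata:
        # The user document already uses the reserved name, so it cannot be
        # stored unchanged: this is the check every client would have to make.
        raise ValueError(f"top-level {RESERVED_KEY!r} is reserved")
    return {RESERVED_KEY: {"id": doc_id, "rev": rev}, **userdata}


ok = wrap_for_storage({"title": "hello"}, "abc123", "1")

try:
    wrap_for_storage({"_metadata": "mine"}, "abc123", "1")
    collided = False
except ValueError:
    collided = True
```

One reserved key instead of several shrinks the check, but the check is still there.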

And while I don't know how couch works internally, we *are* really  
only talking about how the data is returned on an API call via the  
REST API or what I assume is an internal API for the M/R View stuff.

If you had an API that let you choose all, metaonly or useronly, you  
wouldn't be burdened with stuff you didn't want or need.

> Unfortunately the reserved token makes the structure non-reflexive  
> without transformation, and although that's not currently an issue,  
> I can imagine it complicating certain use-cases. It makes the system  
> more complicated to reason about.
> I'm struggling to objectively evaluate this model and your reflexive  
> model - given Damien's attitude to this issue, my motivation to do  
> so is somewhat depressed :/

If you could point me to an explanation of why changing this is bad,  
I'd love to catch up on the discussion.  I assume it's a technical  


> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
> Did you hear about the Buddhist who refused Novocain during a root  
> canal?
> His goal: transcend dental medication.
