couchdb-dev mailing list archives

From Volker Mische <volker.mis...@gmail.com>
Subject Re: [PROPOSAL] new underscore namespacing
Date Wed, 18 Dec 2013 16:39:24 GMT
On 12/03/2013 07:12 PM, Benoit Chesneau wrote:
> On Tue, Dec 3, 2013 at 3:01 PM, Benjamin Young <byoung@bigbluehat.com> wrote:
> 
>> Hi all,
>>
>> Recently the "doc._*" reservation has been causing me trouble when pulling
>> in "arbitrary" JSON from various sources that also use the underscore
>> prefixed names for things (HAL [1], vnd.error [2], other APIs). I've also
>> hit the wall several times when trying to import filesystem contents
>> (Sphinx, ghpages, and the like) that use _* prefixing for their "special
>> folders."
>>
>> As such, I'd like to propose the following:
>> 1. Begin storing new reserved terms in doc._.* (rather than doc._*).
>>  - this gives developers one object to look into for the meta-data about a
>> doc
>>  - you can see the scope creep of our current doc._* best in the
>> replicator status messages.
>>     - doc._replication_* would become doc._.replication.*
>> 2. Move "magic" API endpoints under a "/_/" term as well (for the sake of
>> attachments).
>>  - _design/doc would stay the same
>>  - but the child endpoints would live under "_design/doc/_/*"
>>     - _design/doc/_/view/by_date
>>     - _design/doc/_/list/by_date/ul
>>     - _design/doc/_/rewrite
>>
>> I realize these are extreme API shifts, and would need to wait for CouchDB
>> 2.0.
>>
>> The first steps this direction would be to put new reserved word keys into
>> a "doc._.*" namespace going forward. Closer to the "cut over" for 2.0
>> duplicates of the existing keys (doc._id, doc._rev, especially) could also
>> live at their new underscore prefixed names (doc._.id, doc._.rev) which
>> would give devs a chance to migrate code and content.
>>
>> Doing this would:
>> 1. Give us "limitless" space to add content.
>> 2. Encourage a namespacing pattern for things like doc._.replication.* or
>> other logically grouped content.
>> 3. Free up CouchDB to accept a far broader range of content and remove the
>> "hey, you can't put that there! I was here first!" errors. :)
>>
>> Thanks for considering this,
>> Benjamin
>>
>> [1] http://stateless.co/hal_specification.html
>> [2] https://github.com/blongden/vnd.error
>>
> 
> I don't see why couchdb should adapt itself to newer things that didn't
> take care of an older API when doing their stuff but that's probably
> another concern ;)
> 
> I would find a "/_/" in the URL rather ugly and not needed in that case.
> Same for having a _ in a doc; it also doesn't make much sense. Why do you
> want to change the HTTP API at that level?
> 
> Another way to do it, and probably a more RESTish one, would be moving all
> CouchDB resources into their own namespace, say `couchdb/`, so anything
> under the couchdb resource will be related to CouchDB.
> 
> Next is the "_" prefix in the doc. It's actually reserved because
> some day we will add other metadata, which is fine. But it raises
> the issue you have.
> 
> If I summarise the discussion here and previous discussions, there are
> different schools of thought:
> 
> - remove the metadata from the doc and put them in headers or aside. I
> quite like the first solution, though it may be a problem behind some
> proxies, or with the header length (especially for json values). Also
> headers are supposed to be in latin1 in a lot of clients...
> - put the metadata in their own namespace which is what you propose.
> 
> I dislike the last solution, mostly because it would force clients to
> wait for this namespace to read the metadata while parsing the JSON (which
> could be while streaming it). Instead I would prefer to keep the metadata at
> the first level and do the reverse: put the data in its own namespace, say
> `_data`. This allows any client to ignore this layer if needed while
> parsing the JSON and get the metadata directly (without parsing the data).
> The metadata should be a first-class citizen imo. Optionally we could add
> some new parameters to the doc API allowing someone to fetch only the
> metadata, etc. CouchDB could also parse the incoming doc, stop parsing the
> JSON when it sees this property, and store it directly. It also follows
> the logic of attachments somehow. Another thing that could be done at the
> API level is having something like `/db/docid/_data`, which would allow you
> to retrieve only the data instead of using a show function.
> 
> What do you think?
> 
> - benoit

Hi all,

I talked with Benoit about this at CouchHack. I think his
proposal makes a lot of sense. Let's take the separation of meta and the
document body (as I proposed) together with what Benoit said.

When the actual data is stored in a top-level property called "_data", you
can easily extract the meta information without parsing the body at
all. You just need to scan all the top-level properties (which you need
to do anyway, as JSON object members have no defined order).
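
To make the layout concrete, here is a minimal Python sketch of the split,
assuming the proposed "_data" property. The function name and the sample
document are made up for illustration. Note that the standard json module
still parses the whole document; really skipping the body would need a
streaming parser, so this only shows the shape of the result.

```python
import json

DATA_KEY = "_data"  # property name from the proposal, not an existing CouchDB field


def split_document(raw):
    """Return (meta, body) for a document that uses the proposed layout."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("document must be a JSON object")
    # Member order is not guaranteed, so every top-level key is inspected.
    meta = {key: value for key, value in doc.items() if key != DATA_KEY}
    body = doc.get(DATA_KEY)
    return meta, body


# "_data" sits between two meta fields on purpose: position must not matter.
raw = '{"_id": "post-1", "_data": {"_links": {"self": "/posts/1"}}, "_rev": "1-abc"}'
meta, body = split_document(raw)
```

The underscore-prefixed "_links" key inside the body is left alone, which
is the kind of content Benjamin's original mail was about.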

Having this could be a great first step towards making meta and document
body separation easier to implement.

As a next step, you could then provide, for example, an API where you send
just the document body, with the meta as headers.
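
A rough Python sketch of what such a request could look like follows. It
only builds a description of the request and sends nothing. Sending the
revision as If-Match is, to my knowledge, already accepted by CouchDB on a
document PUT; the "X-Couch-Meta-" header prefix and the function name are
invented here and are not part of any CouchDB API.

```python
import json
from urllib.parse import quote


def build_put_request(db, meta, body):
    """Describe a PUT that carries only the body, with meta outside of it."""
    meta = dict(meta)  # copy so the caller's dict is not modified
    doc_id = meta.pop("_id")
    headers = {"Content-Type": "application/json"}
    rev = meta.pop("_rev", None)
    if rev is not None:
        headers["If-Match"] = rev
    for key, value in meta.items():
        # Hypothetical header naming. json.dumps escapes non-ASCII by
        # default, which sidesteps the latin1 header concern.
        headers["X-Couch-Meta-" + key.lstrip("_")] = json.dumps(value)
    return {
        "method": "PUT",
        "path": "/{}/{}".format(quote(db, safe=""), quote(doc_id, safe="")),
        "headers": headers,
        "body": json.dumps(body),
    }


request = build_put_request(
    "blog",
    {"_id": "post-1", "_rev": "1-abc"},
    {"_links": {"self": "/posts/1"}},
)
```

The header length and proxy issues Benoit mentioned still apply to any
larger meta values sent this way.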

Cheers,
  Volker

