couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: [RESULT]: Accept newline patch into CouchDB for 0.9 (Was: Re: VOTE: accept newline patch into CouchDB for 0.9)
Date Tue, 24 Feb 2009 19:13:17 GMT
On Tue, Feb 24, 2009 at 1:13 PM, Damien Katz <damien@apache.org> wrote:
> I'll once again state my objection to the newlines, which is actually kind
> of weak.
>
> If we compute the revids deterministically (hash the canonical doc
> contents), then when we return the document back to the client, we can send
> as an integrity hash the same revid, because it is already pre-computed and
> stored, etc. What it could save us is the CPU cycles of computing the hash.
> I think we also get some nice free caching benefits too, but I'm not sure.
> But if we do, it might even save us the disk reads to get the doc to compute
> the hash. The problem is that any standardized canonical representation is
> unlikely to include a newline at the end.
>
> Now I'm not even sure this scheme is workable either way, or only workable
> in very special instances which are too rare to be worth it. But if the
> scheme works, then it can simplify the code and make things more efficient,
> which are 2 very good things. However these benefits may never come, and
> we'll not have the newlines anyway. That would suck.
>
> But the problem is that if we just add the newlines and later remove them,
> production apps and scripts that rely on them will break, making the change
> very painful. Or impossible.
>
> So, those are the issues as I see them.
>
> Now the more I think about it, the more I think that unless we move all
> couchdb metadata to the http header, my ideas won't work. Moving everything
> meta to the header is a big change that has some supporters, but someone
> would need to do the work before it could even be considered.
>
> -Damien
>
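
As a rough illustration of the revid idea quoted above, here is a minimal
Python sketch. The canonical form (sorted keys, no extra whitespace), the
choice of SHA-256, and the helper names are all assumptions made for the
example; none of them come from CouchDB itself. It shows why a trailing
newline matters: the stored hash only doubles as an integrity hash if the
bytes on the wire are exactly the canonical bytes.

```python
import hashlib
import json


def canonical_json(doc):
    # Assumed canonical form: sorted keys, compact separators, UTF-8 bytes.
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def deterministic_revid(doc):
    # Hash only the document contents; the existing _rev is left out so the
    # result does not depend on the previous revision id.
    body = {k: v for k, v in doc.items() if k != "_rev"}
    return hashlib.sha256(canonical_json(body)).hexdigest()


doc = {"_id": "example", "title": "hello"}
revid = deterministic_revid(doc)

# What a client would verify: hash the received bytes, compare to the revid.
wire_plain = canonical_json(doc)
wire_newline = wire_plain + b"\n"

plain_matches = hashlib.sha256(wire_plain).hexdigest() == revid
newline_matches = hashlib.sha256(wire_newline).hexdigest() == revid
```

Under these assumptions `plain_matches` is true and `newline_matches` is
false, so a client would have to strip the newline before verifying.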

I'm a fan of the no-metadata-in-documents concept, but there are some
issues, both philosophical and practical. Philosophically speaking, as
pointed out in the HTTP headers thread, we may be abusing headers when
we consider some of the more CouchDB-specific concepts; I doubt that
there's an existing header for everything we'd need.

Secondly, _attachments and _revs_info are unbounded. I know there are
limits on the number of headers in a request, and I can only assume that
some clients have similar limits for responses.

The only idea I've had that would satisfy most of the interesting bits
I've come up against is to have two response versions. The first is the
raw document body as we have now (minus metadata, obviously), with the
very basic _id and _rev in the headers (I'm assuming there are
appropriate headers for these). The second is a multipart MIME message
with parts corresponding to the doc body, the longer metadata like
_revs_info, and then one part per attachment. Including the different
parts could be optional. So far that's still missing some things, like
listing attachment info without getting the entire body.
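
The two response versions described above could be sketched roughly as
follows in Python, using only the standard library's email package. The
header names X-Couch-Id and X-Couch-Rev, the function names, and the part
labels are hypothetical and chosen only for illustration; this is not an
existing CouchDB API.

```python
import json
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def strip_metadata(doc):
    # Drop every underscore-prefixed (metadata) field from the body.
    return {k: v for k, v in doc.items() if not k.startswith("_")}


def plain_response(doc):
    # Version 1: raw body, with _id and _rev moved into (hypothetical) headers.
    headers = {
        "Content-Type": "application/json",
        "X-Couch-Id": doc["_id"],    # hypothetical header name
        "X-Couch-Rev": doc["_rev"],  # hypothetical header name
    }
    return headers, json.dumps(strip_metadata(doc)).encode("utf-8")


def json_part(obj, label):
    part = MIMEApplication(json.dumps(obj).encode("utf-8"), _subtype="json")
    part.add_header("Content-Disposition", "inline", filename=label)
    return part


def multipart_response(doc, revs_info=None, attachments=None):
    # Version 2: one part for the body, optional parts for the rest.
    msg = MIMEMultipart("mixed")
    msg.attach(json_part(strip_metadata(doc), "body"))
    if revs_info is not None:
        msg.attach(json_part(revs_info, "_revs_info"))
    for name, (content_type, data) in (attachments or {}).items():
        maintype, subtype = content_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)
        part.add_header("Content-Disposition", "attachment", filename=name)
        msg.attach(part)
    return msg


doc = {"_id": "example", "_rev": "1-abc", "title": "hello"}
headers, body = plain_response(doc)
msg = multipart_response(
    doc,
    revs_info=[{"rev": "1-abc", "status": "available"}],
    attachments={"note.txt": ("text/plain", b"hi")},
)
```

Calling msg.as_string() would give the serialized multipart message. A
client that only wants the document can use plain_response and never has
to parse MIME at all.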

The real kicker is how we support clients lacking HTTP-fu. For
instance, a quick Google search [1] suggests that XHR probably isn't
capable of dealing with multipart messages. There's an obvious middle
ground, though: allow different versions to be returned via URL
parameters, and then maybe provide the "all content as multipart MIME"
version as an option.

Anyway, that's about as far as I've thought through the different issues.

HTH,
Paul Davis

[1] http://groups.google.com/group/mozilla.dev.tech.xml/browse_thread/thread/e1599de6fc31f2e8

> On Feb 24, 2009, at 12:30 PM, Chris Anderson wrote:
>
>> I go to sleep for 8 hours, and this is the thanks I get! ;)
>>
>> But on a more serious note, I think we should pull a hedge fund move,
>> (or maybe quantum entanglement?) and add to the newline patch, some
>> lines that would change the color of the CouchDB logo from red to
>> blue.
>>
>> OK actually - I have a new opinion about the newlines stuff. Since I
>> really don't care all that much, and I don't see a canonical JSON
>> format happening anytime soon, I'm fine with returning newlines at the
>> end of our responses.
>>
>> Some implementation notes:
>>
>> I haven't looked at the patch lately, but I know that there are lots
>> of little places in the code that it will have to touch.
>>
>> Also, and we haven't discussed this nearly as much as we probably
>> should, the implementation of JSONP would be quite similar. To
>> implement JSONP we'll need to do something like this:
>>
>> USER_SPECIFIED_CALLBACK_NAME + "(" + CouchDB's JSON response + ");"
>>
>> So it's like the newline-at-the-end patch, but also at the beginning...
>>
>> That is all.
>>
>> Chris
>>
>> --
>> Chris Anderson
>> http://jchris.mfdz.com
>
>
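
For reference, the JSONP wrapping described in the quoted message can be
sketched in a few lines of Python. The function name wrap_jsonp and the
callback-name check are assumptions added for the example (the quoted
message only gives the concatenation); the check is there because echoing
an arbitrary callback string back to a browser would be unsafe.

```python
import json
import re

# Assumed rule: the callback must be a dotted JavaScript identifier.
_CALLBACK_RE = re.compile(
    r"[A-Za-z_$][A-Za-z0-9_$]*(?:\.[A-Za-z_$][A-Za-z0-9_$]*)*"
)


def wrap_jsonp(callback, payload):
    # Reject anything that is not a plain identifier path.
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name: %r" % callback)
    # USER_SPECIFIED_CALLBACK_NAME + "(" + JSON response + ");"
    return callback + "(" + json.dumps(payload) + ");"


wrapped = wrap_jsonp("handleDoc", {"ok": True})
```

Here wrapped is the string handleDoc({"ok": true}); and a trailing newline
could be appended after the closing ");" in the same place.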
