incubator-couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Universal Binary JSON in CouchDB
Date Tue, 04 Oct 2011 23:42:10 GMT
On Tue, Oct 4, 2011 at 3:18 PM, Robert Newson <rnewson@apache.org> wrote:
> -1
>

Such a Debbie Downer.

> Supporting multiple formats on disk would be a very difficult code
> change that would complicate every part of the system, I don't think
> it's worth it.
>

It's not necessarily multiple formats, just one that we might be able
to serve (almost) directly to clients. Obviously this is hard and has
quite a few caveats if we decide to change away from Erlang's external
term format. But as it is, ubjson is basically the same thing as
Erlang's external term format just not Erlang specific.

If there's a possibility of it making a difference I see no reason not
to investigate it. But I maintain that such a change would be quite large
and impact a large portion of the code base. So if there is a change
to be proposed someone will have to champion it, write it, test it,
and then convince everyone else that it's worth it.

> If we were to contemplate just multiple http payload formats, I would
> rather support one with broader acceptance (and with the caveat that
> it would have to have some compelling reason beyond being just another
> format). I'm aware of Tim's work on messagepack but I believe it's run
> aground for the technical reasons I alluded to above.
>

Not sure what point you were alluding to. MessagePack is nice but
lacks some features that CouchDB's behavior would require.
Only because Tim suggested MessagePack did I know to suggest things
like a noop type and unbounded container lengths.
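
As a rough illustration of what a noop type and unbounded containers
buy a server that streams results of unknown length, here is a minimal
Python sketch of a decoder for a tagged binary format. The one-byte
markers and the function names are hypothetical, made up for this
example; they are not taken from the ubjson or MessagePack specs.

```python
import io
import struct

# Hypothetical one-byte markers chosen for this sketch; they are NOT
# taken from the ubjson or MessagePack specifications.
NOOP = b"N"        # filler byte that carries no value
INT32 = b"I"       # followed by a 4-byte big-endian signed integer
STRING = b"S"      # followed by a 4-byte big-endian length, then UTF-8 bytes
ARRAY_OPEN = b"["  # array whose length is not known up front
ARRAY_END = b"]"   # closes the nearest open array

_END = object()    # internal sentinel returned when ARRAY_END is read


def _read_exact(stream, n):
    """Read exactly n bytes or fail loudly on a truncated stream."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended in the middle of a value")
    return data


def _decode(stream):
    while True:
        marker = _read_exact(stream, 1)
        if marker == NOOP:
            continue  # skip filler and look at the next marker
        if marker == INT32:
            return struct.unpack(">i", _read_exact(stream, 4))[0]
        if marker == STRING:
            (length,) = struct.unpack(">I", _read_exact(stream, 4))
            return _read_exact(stream, length).decode("utf-8")
        if marker == ARRAY_OPEN:
            items = []
            while True:
                item = _decode(stream)
                if item is _END:
                    return items
                items.append(item)
        if marker == ARRAY_END:
            return _END
        raise ValueError("unknown marker: %r" % marker)


def decode(data):
    """Decode a single value from a bytes object."""
    value = _decode(io.BytesIO(data))
    if value is _END:
        raise ValueError("array end marker outside of an array")
    return value
```

With these made-up markers, the writer can open an array, emit items
as they become available, pad with noop bytes whenever it likes, and
close the array without ever having known the item count in advance.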

> Bottom line: I'd focus on optimizing the JSON encode/decode layer
> first before considering anything as dramatic as this. Paul Davis
> wrote a very fast JSON encoder/decoder called 'jiffy'. I would like to
> hear more about that.
>

I have. I think there is a very subtle bug, because I saw a single
segfault once, so I haven't pushed too hard on getting it into trunk
before other people test it.

I think this goes back to Tim's talk though and my initial reaction to
MessagePack. I'm sure that it's probably faster and is definitely
smaller than the corresponding JSON. And I can probably show that by
writing hand optimized encoder/decoder pairs for both. The issue is
that we can't support an encoder for every client language. So if
there's a reasonable spec that makes it easier for Ada or BrainFuck to
parse more efficiently and doesn't upset the internals too greatly,
then I see no reason not to investigate.

> B.
>
> On 4 October 2011 21:08, Benoit Chesneau <bchesneau@gmail.com> wrote:
>> On Tue, Oct 4, 2011 at 9:33 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>>> For a first step I'd prefer to see a patch that makes the HTTP
>>> responses choose a content type based on accept headers. Once we see
>>> what that looks like and how/if it changes performance then *maybe* we
>>> can start talking about on disk formats. Changing how we store things
>>> on disk is a fairly high impact change that we'll need to consider
>>> carefully.
>>
>> +1
>>>
>>> That said, the ubjson spec is starting to look reasonable and capable
>>> to be an alternative content-type produced by CouchDB. If someone were
>>> to write a patch I'd review it quite enthusiastically.
>>>
>>>
>>
>> I think I would prefer to use protobuffs format though. Anyway if we
>> change the api to handle all types that would be pluggable without
>> problem.
>>
>> - benoît
>>
>
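
The first step proposed in the quoted message, choosing the response
content type from the request's Accept header, could look roughly like
the Python sketch below. It is only an illustration and not CouchDB
code: the function names are hypothetical, and "application/ubjson" is
an assumed media type name used for the example.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media range without a q parameter counts as q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_content_type(accept_header, supported, default="application/json"):
    """Pick the best supported media type, falling back to the default."""
    if not accept_header:
        return default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # the client explicitly refused this type
        if media_type == "*/*":
            return default
        if media_type in supported:
            return media_type
    # Assumption for this sketch: fall back to JSON rather than
    # answering 406 Not Acceptable when nothing matches.
    return default


# Hypothetical list of formats the server can produce.
SUPPORTED = ("application/json", "application/ubjson")
```

A client that sends "application/ubjson, application/json;q=0.5" would
get the binary format, while a client that sends no Accept header, or
only types the server does not know, keeps getting JSON.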
