From: Paul Davis <paul.joseph.davis@gmail.com>
Date: Tue, 4 Oct 2011 14:33:00 -0500
Subject: Re: Universal Binary JSON in CouchDB
To: dev@couchdb.apache.org

For a first step I'd prefer to see a patch that makes the HTTP responses
choose a content type based on Accept headers. Once we see what that looks
like and how/if it changes performance, then *maybe* we can start talking
about on-disk formats. Changing how we store things on disk is a fairly
high-impact change that we'll need to consider carefully.

That said, the UBJSON spec is starting to look reasonable and capable
enough to be an alternative content type produced by CouchDB. If someone
were to write a patch I'd review it quite enthusiastically.

On Tue, Oct 4, 2011 at 2:23 PM, Riyad Kalla wrote:
> Hey Randall,
>
> This is something that Paul and I discussed on IRC. The way UBJ is written
> out looks something like this ([] blocks are just for readability):
>
> [o][2]
>   [s][4][name][s][3][bob]
>   [s][3][age][i][31]
>
> Couch can easily prepend or append its own dynamic content in a reply. If
> it wants to prepend some information after the object header, the header
> would need to be stored and manipulated by Couch separately.
>
> For example, if I upload the doc above, Couch would want to take that root
> object header of:
>
> [o][2]
>
> and change it to:
>
> [o][4]
>
> before storing it, because of the additions of _id and _rev. Actually this
> could be as simple as storing a "rootObjectCount" and having Couch
> dynamically generate the root every time.
>
> 'o' represents object containers with <= 254 elements (1 byte for length)
> and 'O' represents object containers with up to 2.1 billion elements
> (4-byte int).
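The byte layout Riyad describes above, including the root-count rewrite for
_id/_rev, can be sketched in a few lines of Python. This follows the layout
exactly as written in the message (one marker byte, a 1-byte length or count,
then the payload) and uses a 1-byte 'i' integer payload to match the [i][31]
example; the actual draft spec defines several fixed-width numeric types, and
the _id/_rev values are made up, so treat this as an illustration rather than
a reference encoder.

```python
def ubj_str(s: str) -> bytes:
    # 's' marker, 1-byte length, then the UTF-8 bytes (small-string form)
    data = s.encode("utf-8")
    assert len(data) <= 254, "longer strings would need the 4-byte 'S' form"
    return b"s" + bytes([len(data)]) + data

def ubj_int(n: int) -> bytes:
    # 'i' marker with a 1-byte payload, matching the [i][31] example above;
    # the spec itself has wider integer types for larger values
    return b"i" + bytes([n])

def ubj_obj(d: dict) -> bytes:
    # 'o' marker, 1-byte element count, then alternating key/value encodings
    assert len(d) <= 254, "larger objects would need the 4-byte 'O' form"
    out = b"o" + bytes([len(d)])
    for key, val in d.items():
        out += ubj_str(key)
        out += ubj_int(val) if isinstance(val, int) else ubj_str(val)
    return out

doc = ubj_obj({"name": "bob", "age": 31})
# [o][2] [s][4]name [s][3]bob [s][3]age [i][31]

# The header rewrite described above: bump the root count from 2 to 4 and
# splice in _id/_rev pairs (values here are hypothetical)
patched = (b"o" + bytes([4])
           + ubj_str("_id") + ubj_str("doc1")
           + ubj_str("_rev") + ubj_str("1-abc")
           + doc[2:])  # everything after the original [o][count] header
```

Note that the stored pairs after the two-byte root header pass through the
rewrite untouched, which is what makes the "store a rootObjectCount and
regenerate the root" idea cheap.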
> If Couch did that, any request coming into the server might look like this:
>
> <- client request
> -- (server loads root object count)
> -> server writes back object header: [o][4]
> -- (server calculates dynamic data)
> -> server writes back dynamic content
> -> server streams raw record data straight off disk to client (no
>    deserialization)
> -- (OPT: server calculates dynamic data)
> -> OPT: server streams dynamic data appended
>
> Thoughts?
>
> Best,
> Riyad
>
> P.S. There is support in the spec for unbounded container types for when
> Couch doesn't know how much it is streaming back, but that isn't necessary
> for retrieving stored docs (though it could be handy when responding to
> view queries and other requests whose length is not known in advance).
>
> On Tue, Oct 4, 2011 at 12:02 PM, Randall Leeds wrote:
>
>> Hey,
>>
>> Thanks for this thread.
>>
>> I've been interested in ways to reduce the work from disk to client as
>> well. Unfortunately, the metadata inside the document objects is variable
>> based on query parameters (_attachments, _revisions, _revs_info...), so
>> the server needs to decode the disk binary anyway.
>>
>> I would say this is something we should carefully consider for a 2.0 API.
>> I know that, for simplicity, many people really like having the
>> underscore-prefixed attributes mixed in right alongside the document
>> data, but a future API that separated these could really make things fly.
>>
>> -Randall
>>
>> On Wed, Sep 28, 2011 at 22:25, Benoit Chesneau wrote:
>>
>> > On Thursday, September 29, 2011, Riyad Kalla wrote:
>> > > DISCLAIMER: This looks long, but reads quickly (I hope). If you are
>> > > in a rush, just check the last 2 sections and see if it sounds
>> > > interesting.
>> > >
>> > > Hi everybody. I am new to the list, but a big fan of Couch, and I
>> > > have been working on something I wanted to share with the group.
>> > > My apologies if this isn't the right venue or list etiquette... I
>> > > wasn't really sure where to start with this conversation.
>> > >
>> > > Background
>> > > =====================
>> > > With the help of the JSON spec community I've been finalizing a
>> > > universal, binary JSON format specification that offers 1:1
>> > > compatibility with JSON.
>> > >
>> > > The full spec is here (http://ubjson.org/) and the quick list of
>> > > types is here (http://ubjson.org/type-reference/). Differences with
>> > > existing specs and "why" are all addressed on the site in the first
>> > > few sections.
>> > >
>> > > The goals of the specification were, first, to maintain 1:1
>> > > compatibility with JSON (no custom data structures - like what
>> > > caused BSON to be rejected in Issue #702); second, to be as simple
>> > > to work with as regular JSON (no complex data structures or
>> > > encoding/decoding algorithms to implement); and last, to be smaller
>> > > than compacted JSON and faster to generate and parse.
>> > >
>> > > Using a test doc that I see Filipe reference in a few of his issues
>> > > (http://friendpaste.com/qdfyId8w1C5vkxROc5Thf) I get the following
>> > > compression:
>> > >
>> > > * Compacted JSON: 3,861 bytes
>> > > * Univ. Binary JSON: 3,056 bytes (20% smaller)
>> > >
>> > > In some other sample data (e.g. the jvm-serializers sample data) I
>> > > see a 27% compression, with a typical compression range of 20-30%.
>> > >
>> > > While these compression levels are average, the data is written out
>> > > in an unmolested format that is optimized for read speed (no
>> > > scanning for null terminators) and criminally simple to work with.
(win-win)
>> > >
>> > > I added more clarifying information about compression
>> > > characteristics in the "Size Requirements" section of the spec for
>> > > anyone interested.
>> > >
>> > > Motivation
>> > > ======================
>> > > I've been following the discussions surrounding a native binary JSON
>> > > format for the core CouchDB file (Issue #1092), which transformed
>> > > into keeping the format and utilizing Google's Snappy (Issue #1120)
>> > > to provide what looks to be roughly a 40-50% reduction in file size
>> > > at the cost of running the compression/decompression on every
>> > > read/write.
>> > >
>> > > I realize that, in light of the HTTP transport and the JSON
>> > > encoding/decoding cycle in CouchDB, the Snappy compression cycles
>> > > are a very small part of the total time the server spends working.
>> > >
>> > > I found this all interesting, but like I said, I realized up to this
>> > > point that Snappy wasn't any form of bottleneck and the big
>> > > compression wins server side were great, so I had nothing to
>> > > contribute to the conversation.
>> > >
>> > > Catalyst
>> > > ======================
>> > > This past week I watched Tim Anglade's presentation
>> > > (http://goo.gl/LLucD) and started to foam at the mouth when I saw
>> > > his slides where he skipped the JSON encode/decode cycle server-side
>> > > and just generated straight from binary on disk into MessagePack and
>> > > got some phenomenal speedups from the server:
>> > > http://i.imgscalr.com/XKqXiLusT.png
>> > >
>> > > I pinged Tim to see what the chances of adding Univ Binary JSON
>> > > support were, and he seemed amenable to the idea as long as I could
>> > > hand him an Erlang or Ruby impl (unfortunately, I am not familiar
>> > > with either).
>> > > ah-HA! moment
>> > > ======================
>> > > Today it occurred to me that if CouchDB were able to use the
>> > > Universal Binary JSON format as its native storage format (at the
>> > > cost of 20% more disk space than it is using with Snappy enabled,
>> > > but still 20% *less* than before Snappy was integrated) AND support
>> > > for serving replies using the same format were added (a la Tim's
>> > > work), this would allow CouchDB to (theoretically) reply to queries
>> > > by pulling bytes off disk (or memory) and immediately streaming them
>> > > back to the caller with no intermediary step at all (no Snappy
>> > > decompress, no Erlang decode, no JSON encode).
>> > >
>> > > Given that the Univ Binary JSON spec is standard, easy to parse and
>> > > simple to convert back to JSON, adding support for it seemed more
>> > > consistent with Couch's motto of ease and simplicity than, say,
>> > > MessagePack or Protobuf, which provide better compression but at the
>> > > cost of more complex formats and data types that have no analog in
>> > > JSON.
>> > >
>> > > I don't know the intricacies of Couch's internals; if that is wrong
>> > > and some Erlang manipulation of the data would still be required, I
>> > > believe it would still be faster to pull the data off disk in the
>> > > Univ Binary JSON format, decode to Erlang native types and then
>> > > reply, while skipping the Snappy decompression step.
>> > >
>> > > If it *would* be possible to stream it back untouched directly from
>> > > disk, that seems like an enhancement that could potentially speed up
>> > > CouchDB by at least an order of magnitude.
>> > > Conclusion
>> > > =======================
>> > > I would appreciate any feedback on this idea from you guys with a
>> > > lot more knowledge of the internals.
>> > >
>> > > I have no problem if this is a horrible idea and never going to
>> > > happen; I just wanted to try and contribute something back.
>> > >
>> > > Thank you all for reading.
>> > >
>> > > Best wishes,
>> > > Riyad
>> > >
>> >
>> > What is universal in something new?
>> >
>> > - benoit
>> >
>>
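The request flow Riyad sketches in the thread (load the stored root count,
emit a rewritten root header, write the dynamic pairs, then stream the stored
bytes untouched) could be mocked up as below. Everything here is hypothetical:
the `respond` function, the stored-record layout, and the dynamic fields are
stand-ins for illustration, not CouchDB internals or APIs.

```python
def respond(stored: bytes, stored_count: int, dynamic: dict) -> bytes:
    # 1. Rewrite the root header: stored element count plus the pairs
    #    we are about to prepend (the "rootObjectCount" idea from the thread)
    chunks = [b"o" + bytes([stored_count + len(dynamic)])]
    # 2. Write the dynamic content (e.g. _id/_rev), encoded on the fly
    #    using the small-string form: 's', 1-byte length, UTF-8 bytes
    for key, val in dynamic.items():
        for s in (key, val):
            data = s.encode("utf-8")
            chunks.append(b"s" + bytes([len(data)]) + data)
    # 3. Stream the stored pairs straight through with no deserialization,
    #    skipping only the stored [o][count] header
    chunks.append(stored[2:])
    return b"".join(chunks)
```

With a two-field record like the name/age doc from the thread, calling
`respond(record, 2, {"_id": "doc1", "_rev": "1-abc"})` yields a four-element
object whose tail is byte-identical to what was on disk, which is the whole
point of the proposal: only the header and the dynamic prefix are computed
per request.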