couchdb-dev mailing list archives

From Filipe David Manana <fdman...@apache.org>
Subject Re: [jira] Created: (COUCHDB-1092) Storing documents bodies as raw JSON binaries instead of serialized JSON terms
Date Wed, 16 Mar 2011 12:07:07 GMT
Thanks Dave

On Wed, Mar 16, 2011 at 10:46 AM, Dave Cottlehuber <dave@muse.net.nz> wrote:
> Filipe
>
> this looks awesome - sure lots of comments will come on this one :-)
>
> A+
> Dave
>
> On 16 March 2011 09:54, Filipe Manana (JIRA) <jira@apache.org> wrote:
>> Storing documents bodies as raw JSON binaries instead of serialized JSON terms
>> ------------------------------------------------------------------------------
>>
>>                 Key: COUCHDB-1092
>>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1092
>>             Project: CouchDB
>>          Issue Type: Improvement
>>          Components: Database Core
>>            Reporter: Filipe Manana
>>            Assignee: Filipe Manana
>>
>>
>> Currently we store documents as EJSON serialized by Erlang (via the
>> term_to_binary/1 BIF). The proposed patch changes the database file format so
>> that instead of storing serialized EJSON document bodies, it stores raw JSON binaries.
>>
>> The github branch is at:  https://github.com/fdmanana/couchdb/tree/raw_json_docs
>>
>> Advantages:
>>
>> * what we write to disk is much smaller - a raw JSON binary can easily get up to
>>  50% smaller (at least according to the tests I did)
>>
>> * when serving documents to a client we no longer need to JSON encode the document
>>  body read from the disk - this applies to individual document requests, view queries
>>  with ?include_docs=true, pull and push replications, and possibly other use cases.
>>  We just grab its body and prepend the _id, _rev and all the necessary metadata
>>  fields (this is done via simple Erlang binary operations)
>>
>> * we avoid the EJSON term copying between request handlers and the db updater processes,
>>  between the work queues and the view updater process, between replicator processes, etc
>>
>> * before sending a document to the JavaScript view server, we no longer need to convert
>>  it from EJSON to JSON
>>
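The read path in the second advantage above (grab the stored body and prepend _id, _rev) can be sketched outside Erlang as plain byte splicing. This is only an illustration in Python, not the code from the branch: the function name `prepend_metadata` and the assumption that the stored body is a complete JSON object with its metadata fields already stripped are mine.

```python
import json


def prepend_metadata(body: bytes, doc_id: str, rev: str) -> bytes:
    """Splice _id and _rev in front of a stored JSON object body.

    Assumption: `body` is a serialized JSON object without metadata
    fields, e.g. b'{"value":1}'. The body is never decoded or re-encoded;
    only the small metadata prefix is JSON encoded.
    """
    if not (body.startswith(b"{") and body.endswith(b"}")):
        raise ValueError("stored body must be a JSON object")
    meta = json.dumps(
        {"_id": doc_id, "_rev": rev}, separators=(",", ":")
    ).encode("utf-8")
    prefix = meta[:-1]  # drop the closing brace of the metadata object
    rest = body[1:]     # everything after the body's opening brace
    if rest.strip() == b"}":
        # Empty body: no comma is needed before the closing brace.
        return prefix + b"}"
    return prefix + b"," + rest


if __name__ == "__main__":
    print(prepend_metadata(b'{"value":1}', "doc1", "1-abc"))
```

The cost of serving a document then depends on the size of the metadata, not on how deeply nested the body is.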
>> The changes to the document write workflow are minimal - after decoding the
>> document's JSON into EJSON and removing the top-level metadata fields (_id, _rev,
>> etc), it JSON encodes the resulting EJSON body into a binary - this consumes CPU
>> of course, but it brings 2 advantages:
>>
>> 1) we avoid the EJSON copy between the request process and the database updater
>>   process - for any realistic document size (4kb or more) this can be very
>>   expensive, especially when there are many nested structures (lists inside
>>   objects inside lists, etc)
>>
>> 2) before writing anything to the file, we do a term_to_binary([Len, Md5, TheThingToWrite])
>>   and then write the result to the file. A term_to_binary call with a binary as
>>   the input is very fast compared to a term_to_binary call with EJSON as input
>>   (or some other nested structure)
>>
>> I think both compensate for the cost of JSON encoding the body after separating the
>> metadata fields from the non-metadata fields.
>>
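The write workflow described above (decode, strip the top-level metadata fields, re-encode the remaining body into one flat binary) might look roughly like this in Python. It is a sketch under assumptions, not the patch itself: `split_for_storage` is a hypothetical name, and treating every top-level key that starts with an underscore as metadata is my reading of "_id, _rev, etc".

```python
import json


def split_for_storage(raw: bytes):
    """Return (metadata dict, body bytes) for an incoming JSON document.

    Assumption: every top-level key starting with "_" is metadata.
    The body is re-encoded once into a flat binary, which is what would
    be handed to the updater and written to disk.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("document must be a JSON object")
    meta = {k: v for k, v in doc.items() if k.startswith("_")}
    body = {k: v for k, v in doc.items() if not k.startswith("_")}
    # Compact separators keep the stored binary as small as possible.
    body_bin = json.dumps(body, separators=(",", ":")).encode("utf-8")
    return meta, body_bin


if __name__ == "__main__":
    incoming = b'{"_id":"doc1","_rev":"1-abc","float1":0.5,"strings":["a","b"]}'
    meta, body_bin = split_for_storage(incoming)
    print(meta)
    print(body_bin)
```

After this step only a single byte string crosses process boundaries, which is the point of advantages 1) and 2) above.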
>> The following relaximation graph, for documents with sizes of 4Kb, shows a significant
>> performance increase both for writes and reads - especially reads.
>>
>> http://graphs.mikeal.couchone.com/#/graph/698bf36b6c64dbd19aa2bef63400b94f
>>
>>
>> I've also run a few tests to see how much a view query improves when it is run for
>> the first time, without ?stale=ok. The size difference of the databases (after
>> compaction) is also very significant - this change can reduce the size by up to
>> about 50% in common cases.
>>
>> The test databases were created in an instance built from that experimental branch.
>> Then they were replicated into a CouchDB instance built from the current trunk.
>> At the end both databases were compacted (to fairly compare their final sizes).
>>
>> The databases contain the following view:
>> {
>>    "_id": "_design/test",
>>    "language": "javascript",
>>    "views": {
>>        "simple": {
>>            "map": "function(doc) { emit(doc.float1, doc.strings[1]); }"
>>        }
>>    }
>> }
>>
>>
>> ## Database with 500 000 docs of 2.5Kb each
>>
>> Document template is at:  https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_2_5k.json
>>
>> Sizes (branch vs trunk):
>>
>> $ du -m couchdb/tmp/lib/disk_json_test.couch
>> 1996    couchdb/tmp/lib/disk_json_test.couch
>>
>> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test.couch
>> 2693    couchdb-trunk/tmp/lib/disk_ejson_test.couch
>>
>>
>> Time, from a user's perspective, to build the view index from scratch:
>>
>> $ time curl http://localhost:5984/disk_json_test/_design/test/_view/simple?limit=1
>> {"total_rows":500000,"offset":0,"rows":[
>> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
>> ]}
>>
>> real    6m6.740s
>> user    0m0.016s
>> sys     0m0.008s
>>
>> $ time curl http://localhost:5985/disk_ejson_test/_design/test/_view/simple?limit=1
>> {"total_rows":500000,"offset":0,"rows":[
>> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
>> ]}
>>
>> real    15m41.439s
>> user    0m0.012s
>> sys     0m0.012s
>>
>>
>>
>> ## Database with 100 000 docs of 11Kb each
>>
>> Document template is at:  https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_11k.json
>>
>> Sizes (branch vs trunk):
>>
>> $ du -m couchdb/tmp/lib/disk_json_test_11kb.couch
>> 1185    couchdb/tmp/lib/disk_json_test_11kb.couch
>>
>> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>> 2202    couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>>
>>
>> Time, from a user's perspective, to build the view index from scratch:
>>
>> $ time curl http://localhost:5984/disk_json_test_11kb/_design/test/_view/simple?limit=1
>> {"total_rows":100000,"offset":0,"rows":[
>> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
>> ]}
>>
>> real    4m19.306s
>> user    0m0.008s
>> sys     0m0.004s
>>
>> $ time curl http://localhost:5985/disk_ejson_test_11kb/_design/test/_view/simple?limit=1
>> {"total_rows":100000,"offset":0,"rows":[
>> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
>> ]}
>>
>> real    18m46.051s
>> user    0m0.008s
>> sys     0m0.016s
>>
>>
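To put the two test runs above side by side, the size reduction and the index build speedup can be computed directly from the `du -m` and `time` output quoted above. The helper names below are mine; the input numbers are copied from the message.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert the 'XmY.ZZZs' wall-clock format printed by `time`."""
    return minutes * 60 + seconds


def size_reduction_pct(branch_mb: float, trunk_mb: float) -> float:
    """Percentage by which the branch file is smaller than the trunk file."""
    return (1 - branch_mb / trunk_mb) * 100


def speedup(branch_s: float, trunk_s: float) -> float:
    """How many times faster the branch built the view index."""
    return trunk_s / branch_s


# (label, branch MB, trunk MB, branch real time, trunk real time)
runs = [
    ("500 000 docs of 2.5Kb", 1996, 2693, to_seconds(6, 6.740), to_seconds(15, 41.439)),
    ("100 000 docs of 11Kb", 1185, 2202, to_seconds(4, 19.306), to_seconds(18, 46.051)),
]

if __name__ == "__main__":
    for label, b_mb, t_mb, b_s, t_s in runs:
        print(f"{label}: {size_reduction_pct(b_mb, t_mb):.1f}% smaller, "
              f"{speedup(b_s, t_s):.2f}x faster index build")
```

This works out to roughly 26% smaller and 2.6x faster for the 2.5Kb documents, and roughly 46% smaller and 4.3x faster for the 11Kb documents.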
>>
>> All in all, I haven't yet seen any disadvantage to this approach. Also, the code
>> changes don't bring additional complexity. I'd say the performance and disk space
>> gains it gives are very positive.
>>
>> This branch still needs to be polished in a few places, but I think it isn't far
>> from being mature.
>>
>> Other experiments that can be done are storing view values as raw JSON binaries
>> as well (instead of EJSON) and optionally compressing the stored JSON binaries
>> (since it's pure text, the compression ratio is very high).
>> However, I would prefer to do these other 2 suggestions in separate branches/patches
>> - I haven't actually tested any of them yet, so maybe they won't bring significant gains.
>>
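The optional compression idea above is untested in the branch, so the following is only a rough way to try it on a sample body. It uses Python's zlib purely for illustration; the choice of zlib, the helper names and the made-up sample document are all assumptions, and the repeated strings in the sample compress far better than real documents would.

```python
import json
import zlib


def compress_body(body: bytes, level: int = 6) -> bytes:
    """Compress a raw JSON body before it would be written to disk."""
    return zlib.compress(body, level)


def decompress_body(blob: bytes) -> bytes:
    """Recover the exact raw JSON body that was compressed."""
    return zlib.decompress(blob)


# Hypothetical sample body; real documents will be less repetitive.
sample = json.dumps(
    {"float1": 0.5, "strings": ["wfxuF3N8XEK6"] * 100},
    separators=(",", ":"),
).encode("utf-8")

if __name__ == "__main__":
    packed = compress_body(sample)
    print(len(sample), "bytes raw,", len(packed), "bytes compressed")
```

Whether the extra CPU is worth it would have to be measured against real document sets, the same way the sizes and timings above were.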
>> Thoughts? :)
>>
>>
>> --
>> This message is automatically generated by JIRA.
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>



-- 
Filipe David Manana,
fdmanana@gmail.com, fdmanana@apache.org

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
