Date: Tue, 15 Mar 2011 22:50:30 +0000 (UTC)
From: "Filipe Manana (JIRA)"
To: dev@couchdb.apache.org
Reply-To: dev@couchdb.apache.org
Message-ID: <1400036114.5344.1300229430757.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1530361489.4886.1300222469693.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (COUCHDB-1092) Storing documents bodies as raw JSON binaries instead of serialized JSON terms

[
https://issues.apache.org/jira/browse/COUCHDB-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007252#comment-13007252 ]

Filipe Manana commented on COUCHDB-1092:
----------------------------------------

Paul,

> 3. That relaximation graph does not look very impressive, but I've been saying for a long time that those are bordering on pointless if the difference isn't an order of magnitude.

Are you saying that performance improvements are only worthwhile if relaximation (or whatever tool) shows an order of magnitude gain?
I seriously doubt that, even with a major rewrite of CouchDB, one can get an order of magnitude improvement for document writes (for reads, maybe, but only with some super efficient caching).

As for 4), while I mostly agree, I haven't yet thought of a simpler way of doing it without radical changes like passing the metadata in headers or whatever. The JSON manipulation is fairly simple: take out the opening { of the body and prepend the metadata.
Generating the metadata can be done in a more elegant way than I did: simply JSON encode an EJSON object, convert it to a binary, remove the trailing } and prepend the result to the body.

I'm naturally open to other suggestions. Again, keep in mind it's a work in progress :)

> Storing documents bodies as raw JSON binaries instead of serialized JSON terms
> ------------------------------------------------------------------------------
>
> Key: COUCHDB-1092
> URL: https://issues.apache.org/jira/browse/COUCHDB-1092
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core
> Reporter: Filipe Manana
> Assignee: Filipe Manana
>
> Currently we store documents as Erlang serialized (via the term_to_binary/1 BIF) EJSON.
> The proposed patch changes the database file format so that instead of storing serialized
> EJSON document bodies, it stores raw JSON binaries.
> The github branch is at: https://github.com/fdmanana/couchdb/tree/raw_json_docs
>
> Advantages:
> * what we write to disk is much smaller - a raw JSON binary can easily get up to 50% smaller
>   (at least according to the tests I did)
> * when serving documents to a client we no longer need to JSON encode the document body
>   read from the disk - this applies to individual document requests, view queries with
>   ?include_docs=true, pull and push replications, and possibly other use cases.
>   We just grab its body and prepend the _id, _rev and all the necessary metadata fields
>   (this is done via simple Erlang binary operations)
> * we avoid the EJSON term copying between request handlers and the db updater processes,
>   between the work queues and the view updater process, between replicator processes, etc.
> * before sending a document to the JavaScript view server, we no longer need to convert it
>   from EJSON to JSON
>
> The changes made to the document write workflow are minimal - after JSON decoding the
> document's JSON into EJSON and removing the top level metadata fields (_id, _rev, etc.), it
> JSON encodes the resulting EJSON body into a binary - this consumes CPU of course, but it
> brings 2 advantages:
> 1) we avoid the EJSON copy between the request process and the database updater process -
>    for any realistic document size (4kb or more) this can be very expensive, especially
>    when there are many nested structures (lists inside objects inside lists, etc.)
> 2) before writing anything to the file, we do a term_to_binary([Len, Md5, TheThingToWrite])
>    and then write the result to the file. A term_to_binary call with a binary as the input
>    is very fast compared to a term_to_binary call with EJSON as input (or some other nested
>    structure)
>
> I think both compensate for the cost of JSON encoding the body after separating the metadata fields from the non-metadata fields.
>
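The metadata splice described above (drop the opening { of the stored body, JSON encode the metadata, drop its trailing } and prepend it) can be sketched outside Erlang. The following is a Python illustration under stated assumptions, not code from the branch: the function name prepend_metadata and the sample field values are hypothetical, and the stored body is assumed to be a compact JSON object with no leading whitespace.

```python
import json


def prepend_metadata(raw_body: bytes, meta: dict) -> bytes:
    """Splice metadata fields in front of a stored raw JSON object body.

    Hypothetical helper for illustration only; the branch does this with
    Erlang binary operations.
    """
    if not raw_body.startswith(b"{"):
        raise ValueError("stored body must be a JSON object")
    if not meta:
        return raw_body
    # JSON encode the metadata object and remove its trailing '}'.
    head = json.dumps(meta, separators=(",", ":")).encode("utf-8")[:-1]
    # Take out the opening '{' of the stored body.
    rest = raw_body[1:]
    # An empty body ("{}") has no fields, so no separating comma is needed.
    sep = b"" if rest.lstrip().startswith(b"}") else b","
    return head + sep + rest


if __name__ == "__main__":
    body = b'{"value":1}'
    print(prepend_metadata(body, {"_id": "doc1", "_rev": "1-abc"}).decode())
```

The stored body is never parsed on the read path in this sketch; only the small metadata object is encoded, which is the point of the approach described in the issue.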
> The following relaximation graph, for documents with sizes of 4Kb, shows a significant
> performance increase both for writes and reads - especially reads.
>
> http://graphs.mikeal.couchone.com/#/graph/698bf36b6c64dbd19aa2bef63400b94f
>
> I've also run a few tests to see how large the improvement is when querying a view for the
> first time, without ?stale=ok. The size difference of the databases (after compaction) is
> also very significant - this change can reduce the size by at least 50% in common cases.
>
> The test databases were created in an instance built from that experimental branch.
> Then they were replicated into a CouchDB instance built from the current trunk.
> At the end both databases were compacted (to fairly compare their final sizes).
>
> The databases contain the following view:
> {
>     "_id": "_design/test",
>     "language": "javascript",
>     "views": {
>         "simple": {
>             "map": "function(doc) { emit(doc.float1, doc.strings[1]); }"
>         }
>     }
> }
>
> ## Database with 500 000 docs of 2.5Kb each
>
> Document template is at: https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_2_5k.json
>
> Sizes (branch vs trunk):
>
> $ du -m couchdb/tmp/lib/disk_json_test.couch
> 1996 couchdb/tmp/lib/disk_json_test.couch
>
> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test.couch
> 2693 couchdb-trunk/tmp/lib/disk_ejson_test.couch
>
> Time, from a user's perspective, to build the view index from scratch:
>
> $ time curl http://localhost:5984/disk_json_test/_design/test/_view/simple?limit=1
> {"total_rows":500000,"offset":0,"rows":[
> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
> ]}
>
> real 6m6.740s
> user 0m0.016s
> sys 0m0.008s
>
> $ time curl http://localhost:5985/disk_ejson_test/_design/test/_view/simple?limit=1
> {"total_rows":500000,"offset":0,"rows":[
> {"id":"0000076a-c1ae-4999-b508-c03f4d0620c5","key":null,"value":"wfxuF3N8XEK6"}
> ]}
>
> real 15m41.439s
> user 0m0.012s
> sys 0m0.012s
>
> ## Database with 100 000 docs of 11Kb each
>
> Document template is at:
> https://github.com/fdmanana/couchdb/blob/raw_json_docs/doc_11k.json
>
> Sizes (branch vs trunk):
>
> $ du -m couchdb/tmp/lib/disk_json_test_11kb.couch
> 1185 couchdb/tmp/lib/disk_json_test_11kb.couch
>
> $ du -m couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
> 2202 couchdb-trunk/tmp/lib/disk_ejson_test_11kb.couch
>
> Time, from a user's perspective, to build the view index from scratch:
>
> $ time curl http://localhost:5984/disk_json_test_11kb/_design/test/_view/simple?limit=1
> {"total_rows":100000,"offset":0,"rows":[
> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
> ]}
>
> real 4m19.306s
> user 0m0.008s
> sys 0m0.004s
>
> $ time curl http://localhost:5985/disk_ejson_test_11kb/_design/test/_view/simple?limit=1
> {"total_rows":100000,"offset":0,"rows":[
> {"id":"00001511-831c-41ff-9753-02861bff73b3","key":null,"value":"2fQUbzRUax4A"}
> ]}
>
> real 18m46.051s
> user 0m0.008s
> sys 0m0.016s
>
> All in all, I haven't yet seen any disadvantage with this approach. Also, the code changes
> don't bring additional complexity. I'd say the performance and disk space gains it gives are
> very positive.
>
> This branch still needs to be polished in a few places, but I think it isn't far from being mature.
>
> Other experiments that can be done are storing view values as raw JSON binaries as well (instead of EJSON)
> and optional compression of the stored JSON binaries (since it's pure text, the compression ratio is very high).
> However, I would prefer to do these other 2 suggestions in separate branches/patches - I haven't actually tested
> any of them yet, so maybe they won't bring significant gains.
>
> Thoughts? :)
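To put the du and time figures quoted in the issue description side by side, the relative size reduction and view build speedup can be computed directly. This is a small Python sketch for the arithmetic only; the helper names to_seconds and summarize and the RUNS labels are made up for illustration, while the numbers are copied from the output quoted in the issue.

```python
import re


def to_seconds(real: str) -> float:
    """Convert a `time` value such as '6m6.740s' into seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if m is None:
        raise ValueError(f"unexpected time format: {real!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# Figures copied from the du -m and time curl output quoted in the issue.
RUNS = {
    "500 000 docs of 2.5Kb": {
        "branch_mb": 1996, "trunk_mb": 2693,
        "branch_real": "6m6.740s", "trunk_real": "15m41.439s",
    },
    "100 000 docs of 11Kb": {
        "branch_mb": 1185, "trunk_mb": 2202,
        "branch_real": "4m19.306s", "trunk_real": "18m46.051s",
    },
}


def summarize(run: dict) -> tuple:
    """Return (fractional size reduction, view build speedup) for one run."""
    shrink = 1 - run["branch_mb"] / run["trunk_mb"]
    speedup = to_seconds(run["trunk_real"]) / to_seconds(run["branch_real"])
    return shrink, speedup


if __name__ == "__main__":
    for name, run in RUNS.items():
        shrink, speedup = summarize(run)
        print(f"{name}: {shrink:.1%} smaller, {speedup:.2f}x faster view build")
```

For these two test databases this works out to files roughly 26% and 46% smaller, and initial view builds about 2.6 and 4.3 times faster, on the branch compared to trunk.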