couchdb-dev mailing list archives

From "Paul Joseph Davis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-1023) Batching writes of BTree nodes (when possible) and in the DB updater
Date Wed, 12 Jan 2011 01:40:46 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980495#action_12980495 ]

Paul Joseph Davis commented on COUCHDB-1023:
--------------------------------------------

In theory the btree update is fine. I'm not entirely familiar with that part of the db updater
code, so I can't comment with any authority on that section. I trust that it's not any more
crazy than just changing enough code to enable multiple writes and whatnot.

One comment I do have is that I would prefer the couch_file API to be more straightforward.
For instance, the btree code has to do its own term_to_binary call, when you could just create
a couch_file:append_terms/2 function that would do that, which would make client code a bit
cleaner.
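
To make the suggestion concrete, here is a minimal sketch of what such a helper could look like. It is an assumption-laden illustration, not couch_file's actual code: the module name, the 32-bit length prefix, and the read_term/2 helper are all hypothetical, and it uses the plain `file` module rather than couch_file's real on-disk block format.

```erlang
-module(append_terms_sketch).
-export([append_terms/2, demo/0]).

%% Hypothetical helper: serializes every term itself (so callers such as
%% the btree code do not need their own term_to_binary call) and appends
%% all of them with a single file:write/2 call.
%% Returns {ok, Positions}, the file offset at which each term starts.
append_terms(Fd, Terms) ->
    {ok, Eof} = file:position(Fd, eof),
    Bins = [term_to_binary(T) || T <- Terms],
    %% Assumed framing: a 32-bit length prefix in front of each binary.
    Chunks = [[<<(byte_size(B)):32>>, B] || B <- Bins],
    {Positions, _End} = lists:mapfoldl(
        fun(B, Pos) -> {Pos, Pos + 4 + byte_size(B)} end, Eof, Bins),
    %% One write for the whole iolist instead of one write per term.
    ok = file:write(Fd, Chunks),
    {ok, Positions}.

%% Hypothetical counterpart used only to check the round trip.
read_term(Fd, Pos) ->
    {ok, <<Len:32>>} = file:pread(Fd, Pos, 4),
    {ok, Bin} = file:pread(Fd, Pos + 4, Len),
    binary_to_term(Bin).

%% Writes three terms to a scratch file and reads them back.
demo() ->
    Path = "append_terms_sketch.tmp",
    _ = file:delete(Path),
    {ok, Fd} = file:open(Path, [read, write, raw, binary]),
    Terms = [{kp_node, []}, {kv_node, [{a, 1}]}, <<"doc">>],
    {ok, Positions} = append_terms(Fd, Terms),
    Terms = [read_term(Fd, P) || P <- Positions],
    ok = file:close(Fd),
    ok = file:delete(Path),
    ok.
```

The point of the sketch is only the shape of the API: the caller hands over terms and gets positions back, and the serialization plus the single batched write stay inside the file module.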

As a one-off comment, I'm still contemplating extending the fd NIF to not break the scheduler,
which may make some of these sorts of "optimizations" moot. Depending on the severity of
the snowacalypse tomorrow I may have the day off, and this sounds like something I might work
on.

> Batching writes of BTree nodes (when possible) and in the DB updater
> --------------------------------------------------------------------
>
>                 Key: COUCHDB-1023
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1023
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Filipe Manana
>
> Recently I started experimenting with batching writes in the DB updater.
> For example, in a test with 100 writers of 1 KB documents, the updater most often
> collects between 20 and 30 documents to write.
> Currently it does a file:write operation for each one. Not only is this slower, but it
> implies more context switches and stresses the OS/filesystem by allocating a few blocks
> very often (since we use a pure file append write mode). The same can be done in the
> BTree node writes.
> The following branch/patch is an experiment in batching writes:
> https://github.com/fdmanana/couchdb/compare/batch_writes
> In couch_file there's a quick test method that compares the time taken to write X blocks
> of size Y versus writing a single block of size X * Y.
> Example:
> Eshell V5.8.2  (abort with ^G)
> 1> Apache CouchDB 1.2.0aa777195-git (LogLevel=info) is starting.
> Apache CouchDB has started. Time to relax.
> [info] [<0.37.0>] Apache CouchDB has started on http://127.0.0.1:5984/
> 1> couch_file:test(1000, 30).
> multi writes of 30 binaries, each of size 1000 bytes, took 1920us
> batch write of 30 binaries, each of size 1000 bytes,  took 344us
> ok
> 2> 
> 2> couch_file:test(4000, 30).
> multi writes of 30 binaries, each of size 4000 bytes, took 2002us
> batch write of 30 binaries, each of size 4000 bytes,  took 700us
> ok
> 3> 
> Roughly 3 to 6 times faster in these runs, which is quite significant I would say.
> Lower response times are mostly noticeable when delayed_commits is set to true.
> Running a write-only test with this branch gave me:
> http://graphs.mikeal.couchone.com/#/graph/8bf31813eef7c0b7e37d1ea25902e544
> While with trunk I got:
> http://graphs.mikeal.couchone.com/#/graph/8bf31813eef7c0b7e37d1ea25902eb50
> These tests were done on Linux with ext4 (and OTP R14B01).
> However, I'm still not 100% sure if this is worth applying to trunk.
> Any thoughts?
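
For readers without the branch checked out, the comparison described in the issue can be approximated with a short standalone sketch. This is not the couch_file:test/2 function from the patch; the module name, file name, and return value are assumptions, and it relies only on the standard `file` and `timer` modules. Timings will vary by OS, filesystem, and OTP release, so no expected numbers are given.

```erlang
-module(batch_write_sketch).
-export([test/2]).

%% Compares Count separate file:write/2 calls of Size-byte binaries
%% against one file:write/2 call carrying all of them as an iolist.
%% Returns {MultiUs, BatchUs}, both in microseconds.
test(Size, Count) ->
    Path = "batch_write_sketch.tmp",
    _ = file:delete(Path),
    %% Append-only raw file, similar in spirit to the write mode
    %% described in the issue.
    {ok, Fd} = file:open(Path, [write, raw, binary, append]),
    Bin = binary:copy(<<0>>, Size),
    Bins = lists:duplicate(Count, Bin),
    %% One write call per binary.
    {MultiUs, ok} = timer:tc(fun() ->
        lists:foreach(fun(B) -> ok = file:write(Fd, B) end, Bins)
    end),
    %% A single write call for the whole list.
    {BatchUs, ok} = timer:tc(fun() -> file:write(Fd, Bins) end),
    ok = file:close(Fd),
    %% Sanity check: both passes wrote the same number of bytes.
    Expected = 2 * Size * Count,
    Expected = filelib:file_size(Path),
    ok = file:delete(Path),
    io:format("multi writes of ~p binaries, each of size ~p bytes, took ~pus~n",
              [Count, Size, MultiUs]),
    io:format("batch write of ~p binaries, each of size ~p bytes, took ~pus~n",
              [Count, Size, BatchUs]),
    {MultiUs, BatchUs}.
```

Calling `batch_write_sketch:test(1000, 30).` from an Erlang shell mirrors the first run in the transcript above. A single run is noisy, so repeating the call a few times gives a fairer picture.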

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

