couchdb-dev mailing list archives

From "Paul Joseph Davis (Commented) (JIRA)" <>
Subject [jira] [Commented] (COUCHDB-1342) Asynchronous file writes
Date Fri, 18 Nov 2011 00:56:52 GMT


Paul Joseph Davis commented on COUCHDB-1342:


> Robert, the inflight batching of writes is limited to 1 meg per database.

No, it's up to 1 meg per file that's being written to. It's also important to note that the
buffering isn't a passive thing the way it usually is. The "buffer" is actually
just the mailbox for the writer_loop process. The queued_bytes_len or whatever is just counting
how much data has been submitted to the process that hasn't been acked yet, to prevent blowing
the top off that mailbox (which is quite reasonable).
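
To make the accounting concrete, here is a rough Python sketch of the idea, not the
patch's Erlang code. The class and method names are made up for illustration, and what
the caller does once the limit is reached (wait for acks before submitting more) is my
assumption, not something stated above.

```python
class InFlightBytes:
    """Counts bytes handed to a writer that the writer has not acked yet.

    Hypothetical stand-in for the queued-bytes counter described above;
    names and the 1 MiB default are illustrative only.
    """

    def __init__(self, limit=1024 * 1024):
        self.limit = limit      # cap per file being written to
        self.queued = 0         # submitted but not yet acked

    def submit(self, nbytes):
        # Called when a chunk is sent to the writer's mailbox.
        self.queued += nbytes

    def ack(self, nbytes):
        # Called when the writer reports the chunk as written.
        self.queued -= nbytes

    def over_limit(self):
        # Assumption: the caller stops submitting (waits for acks) when this is True.
        return self.queued >= self.limit
```

The counter never holds data itself; it only bounds how much can pile up in the mailbox.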

The writer isn't really buffering anything itself, it's just leaning on Erlang's message passing
internals to be that buffer (which is quite reasonable). Then all the writer_loop does is
accept messages and respond to the parent couch_file gen_server. If it happens to find multiple
consecutive write messages waiting in the mailbox, it'll write those in a single
call to file:write/2.
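
A minimal sketch of that batching step, in Python rather than Erlang, with a deque
standing in for the process mailbox. The message shapes ("write", data) and the function
names are hypothetical; the only point is that consecutive writes at the head of the
mailbox get joined into one write call, and anything else stops the batch.

```python
from collections import deque


def take_write_batch(mailbox):
    """Pop the run of consecutive ("write", data) messages at the mailbox head."""
    chunks = []
    while mailbox and mailbox[0][0] == "write":
        _, data = mailbox.popleft()
        chunks.append(data)
    return chunks


def writer_step(mailbox, fileobj):
    """Write one batch with a single write call; return the byte count to ack."""
    chunks = take_write_batch(mailbox)
    if chunks:
        # One call for the whole batch, analogous to one file:write/2.
        fileobj.write(b"".join(chunks))
    return sum(len(c) for c in chunks)
```

Messages after the first non-write message stay in the mailbox for the next step, so
ordering is preserved.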

I would not be at all surprised if it were shown that the bulk of the improvement from this
patch is due to this specific part of the patch. For the curious, the zip_server test at [1]
tests something quite similar to this setup.

> Asynchronous file writes
> ------------------------
>                 Key: COUCHDB-1342
>                 URL:
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>         Attachments: COUCHDB-1342.patch
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to the process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and writes them to the file (batching them
> when possible). After issuing a series of write requests
> to the file module, the caller can call its 'flush'
> function, which will block the caller until all the
> chunks it requested to write are actually written
> to the file.
> This maximizes use of the IO subsystem: for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at
> Github Commit:
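
The append/flush contract in the description can be sketched roughly as follows. This
is a Python analogy with a thread and a queue, not the couch_file code: the AsyncFile
name is made up, it assumes a single calling thread (in CouchDB the gen_server
serializes callers), and its flush waits for every queued chunk rather than only the
caller's own.

```python
import io
import queue
import threading


class AsyncFile:
    """Hypothetical append-only file wrapper with asynchronous writes."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._eof = 0                      # position the next chunk will land at
        self._q = queue.Queue()            # plays the role of the writer's mailbox
        worker = threading.Thread(target=self._writer_loop, daemon=True)
        worker.start()

    def append(self, data):
        # Reply immediately with the position; the write happens later.
        pos = self._eof
        self._eof += len(data)
        self._q.put(data)
        return pos

    def flush(self):
        # Block until every queued chunk has been written.
        self._q.join()

    def _writer_loop(self):
        while True:
            data = self._q.get()
            self._file.write(data)
            self._q.task_done()


# Usage: positions come back before the data is on "disk".
buf = io.BytesIO()
f = AsyncFile(buf)
first = f.append(b"hello")
second = f.append(b"world!")
f.flush()
```

Between append and flush the caller is free to do CPU-bound work while the writer
thread drains the queue.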
