couchdb-dev mailing list archives

From Matt Goodall <>
Subject Re: batch=ok for bulk_docs and single doc implementation concerns
Date Thu, 15 Apr 2010 09:52:01 GMT

Thanks for the time you spent explaining things. I should have traced
into the code a little further I guess ;-). I've just created a ticket
for the removal of the two batch_save config options.

- Matt

On 14 April 2010 15:02, Adam Kocoloski <> wrote:
> On Apr 14, 2010, at 9:38 AM, Matt Goodall wrote:
>> On 14 April 2010 13:23, Adam Kocoloski <> wrote:
>>> On Apr 14, 2010, at 7:59 AM, Matt Goodall wrote:
>>>> Hi,
>>>> Over in couchdb-python land someone wanted to use batch=ok when
>>>> creating and updating documents, so we added support.
>>>> I was semi-surprised to notice that _bulk_docs does not support
>>>> batch=ok. I realise _bulk_docs essentially is a batch update but a
>>>> _bulk_docs batch=ok would presumably allow CouchDB to buffer more in
>>>> memory before writing to disk. What are your thoughts?
>>> It's probably of limited utility.  If you're already batching on the
>>> client side, you can achieve the same effect by sending in a larger
>>> batch.  I'm not opposed to it per se, I just don't think it will help
>>> with throughput all that much.
>> :nod: given the new behaviour I'm inclined to agree.
>>>> Now, this buffering is where the "implementation concerns" come in.
>>>> According to the wiki:
>>>> "There is a query option batch=ok which can be used to achieve higher
>>>> throughput at the cost of lower guarantees. When a PUT (or a document
>>>> POST as described below) is sent using this option, it is not
>>>> immediately written to disk. Instead it is stored in memory on a
>>>> per-user basis for a second or so (or the number of docs in memory
>>>> reaches a certain point). After the threshold has passed, the docs are
>>>> committed to disk."
>>>> However, unless I'm missing something (quite likely ;-)), there is no
>>>> "stored in memory on a per-user basis" or any check for when "the
>>>> number of docs in memory reaches a certain point". All it seems to do
>>>> is spawn a new process so the update happens when the Erlang scheduler
>>>> gets around to it. In fact, I don't see any reference to the
>>>> batch_save_interval and batch_save_size configuration options in the
>>>> code.
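
[Editor's sketch] For reference, a batch=ok write as described in the wiki quote is an ordinary document PUT with one extra query parameter. The Python below is a minimal illustration using only the standard library. The host, database name, and document id are placeholders, and build_batch_put is a hypothetical helper, not part of couchdb-python. The request is only constructed here, not sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_batch_put(base_url, db, doc_id, doc):
    """Build (but do not send) a PUT request for a document with batch=ok.

    base_url, db and doc_id are caller-supplied placeholders; this helper
    is hypothetical and only shows where the query option goes.
    """
    query = urlencode({"batch": "ok"})
    url = "{}/{}/{}?{}".format(
        base_url.rstrip("/"),
        quote(db, safe=""),
        quote(doc_id, safe=""),
        query,
    )
    body = json.dumps(doc).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_batch_put("http://localhost:5984/", "mydb", "doc1", {"n": 1})
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it to whatever server is listening at the placeholder address.
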
>>> The wiki describes the 0.10 implementation of batch=ok.  In 0.11 batch
>>> mode takes advantage of the fact that couch_db_updater always merges
>>> all waiting updates to a DB into a single write, and so doesn't bother
>>> with the separate set of supervised processes accumulating documents.
>>> In effect the 0.11 batch=ok is "I'm not going to wait around, but save
>>> this as soon as you get a chance".
>> Ah, I didn't dig far enough into the code to see that happening.
>> So, purely for my understanding, it's now simplified to a delayed
>> commit that happens at most 1000ms after normal changes are received.
>> Anything that causes the commit to happen earlier cancels the pending
>> commit.
>> Does that mean that batch="ok" with delayed_commits=false is meaningless?
> So, we should distinguish between writes and fsyncs.  CouchDB 0.11 never
> waits to write; if there is an update_docs message in couch_db_updater's
> mailbox it acts on that "immediately" (that is, as soon as it finishes
> whatever else it's doing at the moment).  Moreover, it batches together
> all the update_docs messages in its mailbox and does one write operation.
> At the end of this write operation the modified pages may not yet be
> flushed to disk, in fact they almost certainly are not.  The kernel is
> caching them for a period of time.  That's where fsync comes in.
> The delayed_commits setting controls the frequency with which CouchDB
> writes the DB header and calls fsync.  If it is set to false, CouchDB
> syncs the file as soon as it completes a write operation.  A write
> operation can be a single document update, or it can update multiple
> documents in the case of concurrent writer threads, batch=ok, or
> _bulk_docs requests.  If delayed_commits is set to true, CouchDB syncs
> the file at 1 second intervals (if an update to the file has occurred in
> that interval, of course).
> batch=ok with delayed_commits=false is not quite meaningless, but you're
> right, you probably won't sneak too many updates into a single commit
> unless fsync is really slow.  One example is OS X, where Erlang's
> file:sync calls a different fcntl which actually forces the hard disk to
> flush the data to spinning platters.  It's super-slow but more reliable
> than regular-old fsync, which just gets the data from the kernel to the
> hard disk's cache.  If you have a non-volatile disk cache on your Linux
> server that's cool, but a regular old consumer hard drive in your MacBook
> does not have that luxury.
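
[Editor's sketch] The write/fsync distinction Adam draws above can be illustrated in a few lines of Python. This is not CouchDB code: write_batches is a hypothetical function, and the 1 second commit timer is simplified to a single fsync at the end of the run. It only shows that each batch costs one write either way, while the number of fsync calls depends on the delayed_commits-style flag.

```python
import os
import tempfile


def write_batches(path, batches, delayed_commits):
    """Append each batch with one write; return how many fsyncs were issued.

    delayed_commits=False: fsync after every write operation.
    delayed_commits=True: a single fsync at the end, standing in for the
    periodic commit (an assumption made to keep the sketch short).
    """
    syncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for batch in batches:
            # One write operation per batch, however many docs it holds.
            os.write(fd, b"".join(batch))
            if not delayed_commits:
                os.fsync(fd)
                syncs += 1
        if delayed_commits and batches:
            os.fsync(fd)
            syncs += 1
    finally:
        os.close(fd)
    return syncs


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    batches = [[b"doc1\n", b"doc2\n"], [b"doc3\n"], [b"doc4\n"]]
    print(write_batches(os.path.join(tmp, "eager.db"), batches, False))
    print(write_batches(os.path.join(tmp, "delayed.db"), batches, True))
```

Both files end up with identical contents; only the number of points at which the data was forced out of the kernel's cache differs.
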
>> Anyway, it sounds like the two batch_save config options should be
>> removed from etc/couchdb/
> Yes.
>>> This does change the performance characteristics quite a bit; in
>>> particular, when the underlying disk is fast the new batch=ok behavior
>>> will result in significantly larger uncompacted databases.
>> Agh, this suggests I didn't understand the updater's behaviour. Large
>> uncompacted database normally means lots of small additions to the
>> database file. How does fast disk speed affect that?
> All I meant there was that if the disk is slow, you can dump a bunch of
> messages into couch_db_updater's mailbox while it's talking to the disk.
> When it finishes what it's doing and looks in the mailbox, it'll batch
> everything in the mailbox together for the next write op.  This results
> in a somewhat smaller DB file.  If the disk is fast couch_db_updater's
> mailbox will be mostly empty, and it'll be doing a larger number of
> smaller operations.
> Best,
> Adam
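
[Editor's sketch] The mailbox effect described above can be shown with a toy simulation in Python. It is not the couch_db_updater logic itself: simulate_updater, the arrival times, and the fixed write duration are all assumptions made for illustration. The updater takes everything that has arrived so far as one batch, spends write_duration on the write, then looks at the mailbox again.

```python
from collections import deque


def simulate_updater(arrival_times, write_duration):
    """Return the size of each write batch for a single simulated updater.

    arrival_times: times at which update messages reach the mailbox.
    write_duration: how long one write operation takes (same time unit).
    """
    pending = deque(sorted(arrival_times))
    clock = 0.0
    batches = []
    while pending:
        # If the mailbox is empty, idle until the next message arrives.
        if pending[0] > clock:
            clock = pending[0]
        # Drain everything that has already arrived into one batch.
        size = 0
        while pending and pending[0] <= clock:
            pending.popleft()
            size += 1
        batches.append(size)
        # The updater is busy writing; new messages queue up meanwhile.
        clock += write_duration
    return batches


if __name__ == "__main__":
    arrivals = list(range(10))  # one update per time unit
    print(simulate_updater(arrivals, 0.5))  # fast disk
    print(simulate_updater(arrivals, 4))    # slow disk
```

With these inputs the fast disk yields ten single-document writes, while the slow disk yields four writes of sizes 1, 4, 4 and 1, which matches the "larger number of smaller operations" point for fast disks.
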
>>>> Shouldn't batch=ok send the doc off to some background process that
>>>> accumulates docs until either the batch interval or size threshold has
>>>> been reached? This would also ensure that batch=ok updates are handled
>>>> in the order they arrive, although I'm not sure if that matters given
>>>> that the user has basically said they don't care if it succeeds or not
>>>> by using batch=ok.
>>> I think the documents updates are still handled in the order in which they were
>>>> - Matt
>>> Best, Adam
