couchdb-dev mailing list archives

From "Robert Newson (JIRA)" <>
Subject [jira] Commented: (COUCHDB-271) preventing compaction from ruining the OS block cache
Date Sun, 16 Aug 2009 21:02:14 GMT


Robert Newson commented on COUCHDB-271:

Another way to approach this is to compact without rewriting the entire database.

Specifically, instead of a single file, a couchdb database would be an ordered sequence
of files. It's still append-only, so earlier files will contain data that's been superseded
by updates, etc., just as they do today. Each file is eligible to be compacted separately by
reading all the extant records from it and writing them to the end of the current file; the
old file is then deleted. With this approach (cf. Berkeley DB JE), compaction could be an ongoing
background task, would not require as much free disk space as the database itself occupies, and the
current inability to swap to the .compact file in the presence of constant writes would also
be addressed.
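
A rough sketch of the per-file compaction idea described above, as a toy in-memory model
in Python. This is not CouchDB code: the SegmentedLog class, its method names, and the
max_records limit are all hypothetical, and real segments would be files on disk rather
than lists.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentedLog:
    """Toy model of a database stored as an ordered sequence of append-only segments."""

    max_records: int = 4                           # hypothetical segment size limit
    segments: dict = field(default_factory=dict)   # segment id -> list of (key, value)
    index: dict = field(default_factory=dict)      # key -> (segment id, offset) of live record
    current: int = 0                               # id of the segment being appended to

    def __post_init__(self):
        self.segments[self.current] = []

    def put(self, key, value):
        seg = self.segments[self.current]
        if len(seg) >= self.max_records:
            # Current segment is full: start a new one. Ids are never reused.
            self.current += 1
            seg = self.segments[self.current] = []
        seg.append((key, value))
        self.index[key] = (self.current, len(seg) - 1)

    def get(self, key):
        seg_id, offset = self.index[key]
        return self.segments[seg_id][offset][1]

    def compact_segment(self, seg_id):
        """Copy the still-live records of one old segment to the tail, then drop it."""
        if seg_id == self.current:
            raise ValueError("cannot compact the segment being written")
        for offset, (key, value) in enumerate(self.segments[seg_id]):
            # A record is live only if the index still points at this exact copy.
            if self.index.get(key) == (seg_id, offset):
                self.put(key, value)
        del self.segments[seg_id]
```

Only the live records of one segment are copied at a time, so the extra space needed is
bounded by a single segment rather than by the whole database, and writers keep appending
to the current segment while older ones are compacted.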

> preventing compaction from ruining the OS block cache
> -----------------------------------------------------
>                 Key: COUCHDB-271
>                 URL:
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.8.1, 0.9
>            Reporter: Jan Lehnardt
>             Fix For: 0.10
> Adam Kocoloski:
> Hi, I've noticed that compacting large DBs pretty much kills any filesystem caching benefits
> for CouchDB.  I believe the problem is that the OS (Linux 2.6.21 kernel in my case) is caching
> blocks from the .compact file, even though those blocks won't be read again until compaction
> has finished.  In the meantime, the portion of the cache dedicated to the old DB file shrinks
> and performance really suffers.
> I think a better mode of operation would be to advise/instruct the OS not to cache any
> portion of the .compact file until we're ready to replace the main DB.  On Linux, specifying
> the POSIX_FADV_DONTNEED option to posix_fadvise() seems like the way to go:
> This link has a little more detail and a usage example:
> Of course, POSIX_FADV_DONTNEED isn't really available from inside the Erlang VM.  Perhaps
> the simplest approach would be to have a helper process that we can spawn which calls that
> function (or its equivalent on a non-Linux OS) periodically during compaction?  I'm not really
> sure, but I wanted to get this out on the list for discussion.  Best,
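
As an illustration of the helper-process idea in the quoted description, here is a minimal
sketch in Python of what such a helper might run periodically against the .compact file.
The function name drop_cached_pages is hypothetical, and nothing here is part of CouchDB.
It assumes a platform where Python exposes os.posix_fadvise (Linux and some other Unix
systems) and returns False elsewhere.

```python
import os


def drop_cached_pages(path):
    """Advise the kernel that cached pages of `path` are not needed.

    Returns True if the advice was issued, False if posix_fadvise is
    unavailable on this platform.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be dropped, so flush them to disk first.
        os.fsync(fd)
        # offset=0, len=0 applies the advice to the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True
```

The call is only advice: the kernel is free to ignore it, so any benefit would need to be
measured on the system in question.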

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
