couchdb-dev mailing list archives

From "Jan Lehnardt (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-271) preventing compaction from ruining the OS block cache
Date Sat, 28 Feb 2009 09:04:12 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677666#action_12677666
] 

Jan Lehnardt commented on COUCHDB-271:
--------------------------------------

Damien Katz:

The problem is that we don't get access to the low-level APIs or flags passed to the OS unless
Erlang chooses to expose them. We have similar problems with compaction on Windows because we
need special flags to give us Unix file semantics.

To fix this, we'll either need the Erlang VM changed or need to use our own Erlang file driver interface.

--

Oh yeah, one more option that is kind of crazy is to spawn a small external child process
for file I/O. It would be a very small, simple process that opens a file and responds to read/write
commands from the Erlang server. Then we can implement exactly the low-level APIs and caching
behavior desired. The cost is extra IPC, but that should be small compared to the cost of
a blown file cache.
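
A minimal sketch of what such a helper could look like, written in Python purely for illustration (the thread does not propose a language). It assumes the server talks to the child over stdin/stdout using frames with a 4-byte big-endian length prefix, which is the framing an Erlang port opened with `{packet, 4}` uses. The command letters (`R`, `W`), the reply bytes (`K`, `E`), and all function names are hypothetical.

```python
import os
import struct
import sys


def read_frame(stream):
    """Read one frame (4-byte big-endian length, then payload); None on EOF."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (size,) = struct.unpack(">I", header)
    return stream.read(size)


def write_frame(stream, payload):
    """Write one length-prefixed frame and flush it to the peer."""
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()


def handle(fd, request):
    """Hypothetical commands:
    b"R" + offset (8 bytes) + length (4 bytes) -> b"K" + data
    b"W" + offset (8 bytes) + data             -> b"K" + bytes written (4 bytes)
    """
    op = request[:1]
    if op == b"R":
        offset, length = struct.unpack(">QI", request[1:13])
        return b"K" + os.pread(fd, length, offset)
    if op == b"W":
        (offset,) = struct.unpack(">Q", request[1:9])
        written = os.pwrite(fd, request[9:], offset)
        return b"K" + struct.pack(">I", written)
    return b"E" + b"unknown command"


def serve(fd, inp, out):
    """Answer requests until the server closes its end of the pipe."""
    while True:
        request = read_frame(inp)
        if request is None:
            break
        write_frame(out, handle(fd, request))


def main(path):
    """Entry point for the child process: open the file, then serve stdin/stdout."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        serve(fd, sys.stdin.buffer, sys.stdout.buffer)
    finally:
        os.close(fd)
```

Run as a child process, the script would call `main(sys.argv[1])`. Because the helper owns the file descriptor, any open flags or cache hints the platform offers could be applied inside `main` without changes to the Erlang VM.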


> preventing compaction from ruining the OS block cache
> -----------------------------------------------------
>
>                 Key: COUCHDB-271
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-271
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.8.1, 0.9
>            Reporter: Jan Lehnardt
>
> Adam Kocoloski:
> Hi, I've noticed that compacting large DBs pretty much kills any filesystem caching benefits
> for CouchDB.  I believe the problem is that the OS (Linux 2.6.21 kernel in my case) is caching
> blocks from the .compact file, even though those blocks won't be read again until compaction
> has finished.  In the meantime, the portion of the cache dedicated to the old DB file shrinks
> and performance really suffers.
> I think a better mode of operation would be to advise/instruct the OS not to cache any
> portion of the .compact file until we're ready to replace the main DB.  On Linux, specifying
> the POSIX_FADV_DONTNEED option to posix_fadvise() seems like the way to go:
> http://linux.die.net/man/2/posix_fadvise
> This link has a little more detail and a usage example:
> http://insights.oetiker.ch/linux/fadvise.html
> Of course, POSIX_FADV_DONTNEED isn't really available from inside the Erlang VM.  Perhaps
> the simplest approach would be to have a helper process that we can spawn which calls that
> function (or its equivalent on a non-Linux OS) periodically during compaction?  I'm not really
> sure, but I wanted to get this out on the list for discussion.  Best,
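
The helper-process idea in the quoted description could be sketched roughly as follows, again in Python for illustration only. `os.posix_fadvise` and `os.POSIX_FADV_DONTNEED` are real Python bindings on platforms that provide the call; the function names, the polling interval, and the use of a `threading.Event` as the stop signal are assumptions. The sketch flushes the file first, since the kernel generally cannot drop pages that are still dirty.

```python
import os
import threading


def drop_from_page_cache(path):
    """Advise the kernel to drop cached pages of `path`.

    Returns False when posix_fadvise is unavailable on this platform.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # Write dirty pages back first so they become eligible for eviction.
        os.fsync(fd)
        # offset=0, len=0 applies the advice to the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True


def advise_until_stopped(path, stop, interval=5.0):
    """Re-issue the advice every `interval` seconds until `stop` is set.

    Returns how many times the advice was issued.
    """
    issued = 0
    while not stop.is_set():
        if os.path.exists(path) and drop_from_page_cache(path):
            issued += 1
        stop.wait(interval)
    return issued
```

A compaction supervisor would start `advise_until_stopped` on the .compact file in a helper (here a thread or separate process) and set the event once the compacted file is ready to replace the main DB. The advice is only a hint, so how much cache it actually frees would need to be measured.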

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

