hbase-issues mailing list archives

From "Cosmin Lehene (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-6351) Stop compactions from polluting OS FS cache
Date Wed, 07 Jan 2015 22:36:35 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cosmin Lehene updated HBASE-6351:
    Component/s: Performance

> Stop compactions from polluting OS FS cache 
> --------------------------------------------
>                 Key: HBASE-6351
>                 URL: https://issues.apache.org/jira/browse/HBASE-6351
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Ted Yu
> The following came from Otis via http://search-hadoop.com/m/MGVqgZJ4Mj2 :
> Lucene 4.0.0-Alpha was recently released.  Mike McCandless, one of the Lucene developers,
wrote a really nice post about new things in this version of Lucene.  The part that I think
is interesting for HBase, and that HBase devs may want to look at (and borrow to use with
compactions) is this:
> Reducing merge IO impact 
> Merging (consolidating many small segments into a single big one) is a very IO and CPU
intensive operation which can easily interfere with ongoing searches. In 4.0.0 we now have
two ways to reduce this impact:
>         * Rate-limit the IO caused by ongoing merging, by calling FSDirectory.setMaxMergeWriteMBPerSec.

>         * Use the new NativeUnixDirectory which bypasses the OS's IO cache for all merge
IO, by using direct IO. This ensures that a merge won't evict hot pages used by searches.
(Note that there is also a native WindowsDirectory, but it does not yet use direct IO during
merging... patches welcome!). 
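
The first option above, rate-limiting merge writes, can be sketched roughly as follows. This is only an illustration of the pacing idea applied to something like a compaction writer; `SimpleWriteRateLimiter` and its methods are hypothetical names, not part of Lucene or HBase, and the real `FSDirectory.setMaxMergeWriteMBPerSec` may work differently internally.

```java
/**
 * Hypothetical helper (not Lucene or HBase API): paces writes so that the
 * long-run throughput stays at or below a target number of MB per second.
 */
final class SimpleWriteRateLimiter {
    private final double nanosPerByte;
    // Earliest time the next write may start. Long.MIN_VALUE means "no writes yet";
    // this matters because System.nanoTime() may legally return negative values.
    private long nextAllowedNanos = Long.MIN_VALUE;

    SimpleWriteRateLimiter(double mbPerSec) {
        if (mbPerSec <= 0) {
            throw new IllegalArgumentException("mbPerSec must be positive");
        }
        this.nanosPerByte = 1_000_000_000.0 / (mbPerSec * 1024 * 1024);
    }

    /**
     * Records that `bytes` are about to be written at time `nowNanos` and
     * returns how many nanoseconds the caller should pause first (0 if none).
     * Kept separate from sleeping so the pacing logic is easy to test.
     */
    synchronized long pauseNanosFor(long bytes, long nowNanos) {
        long start = Math.max(nowNanos, nextAllowedNanos);
        long pause = start - nowNanos;
        // The time budget consumed by this write pushes back the next start.
        nextAllowedNanos = start + Math.round(bytes * nanosPerByte);
        return pause;
    }

    /** Convenience wrapper that sleeps for the computed pause. */
    void acquire(long bytes) throws InterruptedException {
        long pause = pauseNanosFor(bytes, System.nanoTime());
        if (pause > 0) {
            java.util.concurrent.TimeUnit.NANOSECONDS.sleep(pause);
        }
    }
}
```

A writer would call `acquire(buffer.length)` before each chunk it writes. Only the merge or compaction path would go through the limiter, so flushes and reads are not slowed down.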
> Remember to also set swappiness to 0 on Linux if you want to maximize search responsiveness.

> More generally, the APIs that open an input or output file (Directory.openInput and Directory.createOutput)
now take an IOContext describing what's being done (e.g., flush vs merge), so you can create
a custom Directory that changes its behavior depending on the context. 
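
As a rough illustration of context-dependent behavior, the sketch below picks IO settings based on why a file is being opened. All names here (`IoPurpose`, `IoPolicy`, `ContextAwarePolicy`) are hypothetical stand-ins invented for this example; they are not Lucene's `IOContext` or any HBase API, and the policy choices are assumptions.

```java
/** Hypothetical stand-in for the "what is being done" part of an IO context. */
enum IoPurpose { READ, FLUSH, MERGE }

/** Settings a storage layer might apply to one opened file. */
final class IoPolicy {
    final boolean bypassOsCache;   // e.g. use direct IO where the platform supports it
    final double maxWriteMbPerSec; // 0 means unlimited

    IoPolicy(boolean bypassOsCache, double maxWriteMbPerSec) {
        this.bypassOsCache = bypassOsCache;
        this.maxWriteMbPerSec = maxWriteMbPerSec;
    }
}

/** Chooses a policy from the purpose, so only merges pay the throttling cost. */
final class ContextAwarePolicy {
    private final double mergeMbPerSec;

    ContextAwarePolicy(double mergeMbPerSec) {
        this.mergeMbPerSec = mergeMbPerSec;
    }

    IoPolicy policyFor(IoPurpose purpose) {
        switch (purpose) {
            case MERGE:
                // Bulk rewrite of cold data: keep it out of the page cache and throttle it.
                return new IoPolicy(true, mergeMbPerSec);
            default:
                // Reads and flushes touch hot data: use the normal cached, unthrottled path.
                return new IoPolicy(false, 0.0);
        }
    }
}
```

In HBase terms, a compaction would presumably map to `MERGE` and a memstore flush to `FLUSH`, though how such a context would be threaded through the HBase file APIs is left open here.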

This message was sent by Atlassian JIRA
