hbase-issues mailing list archives

From "Ashu Pachauri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17057) Minor compactions should also drop page cache behind reads
Date Wed, 30 Nov 2016 23:52:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710244#comment-15710244 ]

Ashu Pachauri commented on HBASE-17057:

I talked to [~eclark] offline. It turns out that throttling compactions has nothing to do with
dropping page cache; the throttle threshold was only used as a hint for the total size of the
files involved in a compaction request. In the old world, compactions piggybacked on the store
file scanners that were already open, so we considered it more efficient not to drop pages during
compactions that were small enough, rather than potentially dropping pages for storefiles that
were probably already being read. However, since we now use private readers for compactions by
default, we should drop pages for minor compactions by default.

I'll add a patch that introduces a config to drop page cache for minor and major compactions.
This config will be set to true by default, but someone who is not using private readers can
choose to turn it off (though I doubt turning it off will have any positive impact, especially
in large clusters).
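
To make the change concrete, here is a minimal sketch of the two decisions described above: the old one, which derived the drop-behind hint from the compaction's total size relative to the throttle threshold, and the proposed one, which reads a flag that defaults to true. The class name, method names, and configuration keys are hypothetical and are not taken from the patch or from HBase; using one key per compaction type is also an assumption.

```java
import java.util.Map;

/** Illustrative only: all names and keys here are hypothetical, not HBase's API. */
final class CompactionDropCacheSketch {
    // Hypothetical configuration keys, one per compaction type (an assumption).
    static final String MINOR_DROP_KEY = "sketch.compaction.minor.dropcache";
    static final String MAJOR_DROP_KEY = "sketch.compaction.major.dropcache";

    private CompactionDropCacheSketch() {}

    /**
     * Old behavior as described in the comment: the throttle threshold doubles
     * as a size hint, so only compactions larger than it drop pages.
     */
    static boolean oldDropBehind(long totalFileBytes, long throttleThresholdBytes) {
        return totalFileBytes > throttleThresholdBytes;
    }

    /**
     * Proposed behavior: the hint comes from configuration and defaults to true
     * for both minor and major compactions unless explicitly turned off.
     */
    static boolean newDropBehind(Map<String, String> conf, boolean isMajor) {
        String key = isMajor ? MAJOR_DROP_KEY : MINOR_DROP_KEY;
        return Boolean.parseBoolean(conf.getOrDefault(key, "true"));
    }
}
```

With an empty configuration, `newDropBehind` returns true for both compaction types, whereas `oldDropBehind` returns false for any compaction at or below the threshold; that gap is what this issue is about.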

For the master branch, this jira will address passing the drop cache hint correctly; I'll open
a separate issue (or find one if it already exists) to make sure we honor the hint in the
compaction path.

> Minor compactions should also drop page cache behind reads
> ----------------------------------------------------------
>                 Key: HBASE-17057
>                 URL: https://issues.apache.org/jira/browse/HBASE-17057
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Ashu Pachauri
>            Assignee: Ashu Pachauri
> Long compactions currently drop cache behind reads/writes so that they don't pollute
> the page cache, but short compactions don't do that. The bulk of the data is actually read
> during minor compactions instead of major compactions, and thrashes the page cache since
> it's mostly not needed.
> We should drop page cache behind minor compactions too.
