hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17215) Separate small/large file delete threads in HFileCleaner to accelerate hfile cleanup speed
Date Wed, 29 Mar 2017 08:36:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946733#comment-15946733 ]

Anoop Sam John commented on HBASE-17215:

This is related to the issue here and is worth considering. HFileCleaner deletes the old
archived HFiles, which mostly result from compaction.
A CompactedHFilesDischarger chore service now runs in every RS and checks whether a
compacted-away file can be moved into the archive. Files are not moved to the archive
immediately after a compaction. Please see CompactedHFilesDischarger.
 And the configs

> Separate small/large file delete threads in HFileCleaner to accelerate hfile cleanup
> ------------------------------------------------------------------------------------------
>                 Key: HBASE-17215
>                 URL: https://issues.apache.org/jira/browse/HBASE-17215
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Yu Li
>            Assignee: Yu Li
> When using PCIe-SSD the flush speed is really quick, and although we have per-CF
> flush, we still have the {{hbase.regionserver.optionalcacheflushinterval}} setting and some
> other mechanisms that keep data from staying in memory too long, and these flush small hfiles.
> In our online environment we found that the single-threaded cleaner kept cleaning
> earlier-flushed small files while large files got no chance, which filled the disks and then
> caused many other problems.
> Deleting hfiles in parallel with too many threads would also increase the workload on the
> namenode, so here we propose separate large/small hfile cleaner threads, just as we do
> for compaction. This turned out to work well in our cluster.
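
The split proposed in the description can be pictured with a small sketch. This is not the
HBASE-17215 patch or the real HFileCleaner code: the class name, the PendingFile type, the
threshold parameter and the in-memory "deleted" lists are all hypothetical and only illustrate
routing files by size to two dedicated threads, so that a backlog of small files cannot
starve the large ones.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative sketch only; every name here is hypothetical, not HBase API. */
class SizeSplitCleanerSketch {

    /** Stand-in for an archived hfile waiting to be deleted. */
    public static final class PendingFile {
        public final String path;
        public final long sizeBytes;

        public PendingFile(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    // Assumed cut-off between "small" and "large"; the right value is workload specific.
    private final long thresholdBytes;

    private final Queue<PendingFile> largeQueue = new ConcurrentLinkedQueue<>();
    private final Queue<PendingFile> smallQueue = new ConcurrentLinkedQueue<>();

    // Records which thread handled which file; a real cleaner would delete from the filesystem.
    public final List<String> deletedByLargeThread =
            Collections.synchronizedList(new ArrayList<>());
    public final List<String> deletedBySmallThread =
            Collections.synchronizedList(new ArrayList<>());

    public SizeSplitCleanerSketch(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Route a file to the large or small queue according to its size. */
    public void submit(PendingFile file) {
        if (file.sizeBytes >= thresholdBytes) {
            largeQueue.add(file);
        } else {
            smallQueue.add(file);
        }
    }

    /** Start one thread per queue, let each drain its own queue, then wait for both. */
    public void runOnce() throws InterruptedException {
        Thread large = new Thread(
                () -> drain(largeQueue, deletedByLargeThread), "large-file-cleaner");
        Thread small = new Thread(
                () -> drain(smallQueue, deletedBySmallThread), "small-file-cleaner");
        large.start();
        small.start();
        large.join();
        small.join();
    }

    private static void drain(Queue<PendingFile> queue, List<String> deleted) {
        PendingFile file;
        while ((file = queue.poll()) != null) {
            deleted.add(file.path); // placeholder for the actual delete call
        }
    }
}
```

With only two threads the number of concurrent deletes stays fixed, which matches the concern
in the description about adding load on the namenode.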

This message was sent by Atlassian JIRA
