hbase-issues mailing list archives

From "huaxiang sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17215) Separate small/large file delete threads in HFileCleaner to accelerate archived hfile cleanup speed
Date Fri, 31 Mar 2017 15:56:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951178#comment-15951178 ]

huaxiang sun commented on HBASE-17215:
--------------------------------------

{quote}
Please carefully check and confirm whether this is caused by pressure on Namenode, and if
so the change here might worsen it (more requests in parallel, although not that much). And
good luck(smile).
{quote}

Thanks [~carp84] for the great advice. We checked the NameNode, and its workload is not heavy.
We are still investigating why it takes 120 ms to delete one file, since the trace log suggests
it should be much faster.

Thanks for the patch! We will apply this patch and HBASE-17854 to see how much is improved.

> Separate small/large file delete threads in HFileCleaner to accelerate archived hfile cleanup speed
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-17215
>                 URL: https://issues.apache.org/jira/browse/HBASE-17215
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17215.patch, HBASE-17215.v2.patch, HBASE-17215.v3.patch
>
>
> When using PCIe-SSD, flushes are very fast, and although we have per-CF flush, the
> {{hbase.regionserver.optionalcacheflushinterval}} setting and some other mechanisms that
> keep data from staying in memory too long still cause small hfiles to be flushed. In our
> online environment we found that the single-threaded cleaner kept cleaning earlier-flushed
> small files while large files got no chance, which filled the disks and then caused many
> other problems.
> Deleting hfiles in parallel with too many threads would also increase the workload of the
> NameNode, so here we propose to separate large/small hfile cleaner threads, just as we do
> for compaction. This turned out to work well in our cluster.
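
The idea in the description above can be sketched roughly as follows. This is only an illustration of routing deletes by file size to two dedicated workers, not the code in the attached patches: the class name `SizeSplitCleaner`, the threshold parameter, and the `Consumer<String>` delete callback are all hypothetical, and a real cleaner would call the file system's delete instead.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: one worker thread for large files, one for small files,
// so a backlog of small files cannot starve large-file deletion.
final class SizeSplitCleaner {
    private final long thresholdBytes;        // assumed cut-off between "small" and "large"
    private final Consumer<String> deleter;   // stand-in for the real delete call
    private final ExecutorService largeWorker = Executors.newSingleThreadExecutor();
    private final ExecutorService smallWorker = Executors.newSingleThreadExecutor();

    SizeSplitCleaner(long thresholdBytes, Consumer<String> deleter) {
        this.thresholdBytes = thresholdBytes;
        this.deleter = deleter;
    }

    /** Queues a delete on the matching worker; returns true if the file was treated as large. */
    boolean submit(String path, long sizeBytes) {
        boolean large = sizeBytes >= thresholdBytes;
        ExecutorService target = large ? largeWorker : smallWorker;
        target.execute(() -> deleter.accept(path));
        return large;
    }

    /** Stops accepting work and waits for queued deletes; returns false on timeout or interrupt. */
    boolean shutdownAndWait(long timeoutMs) {
        largeWorker.shutdown();
        smallWorker.shutdown();
        try {
            return largeWorker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)
                && smallWorker.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Using only two threads keeps the number of parallel delete requests small, which matches the concern about NameNode load raised in the description.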



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
