hbase-issues mailing list archives

From "zhoushuaifeng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3969) Outdated data can not be cleaned in time
Date Fri, 17 Jun 2011 02:06:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050857#comment-13050857 ]

zhoushuaifeng commented on HBASE-3969:

I agree that changes to the add() method are required: flushing a file triggers a compaction
request, and that request should renew the region's priority if its file count has increased.
And this:
  public int getCompactPriority() {
    int count = Integer.MAX_VALUE;
    for (Store store : stores.values()) {
      count = Math.min(count, store.getCompactPriority());
    }
    return count;
  }
I think Math.max might be better, in case some store has too many files and is blocked.
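
To make the min/max question concrete, here is a minimal sketch. It assumes (this is not stated in the comment) that a store's priority is its remaining headroom before blocking, i.e. a blocking file count minus the current file count, so that a smaller value means more urgent. The class name, helper names, and the numbers are hypothetical. Under that assumption Math.min follows the store closest to blocking and Math.max follows the store with the most headroom; if larger values meant more urgent, the conclusion would flip.

```java
import java.util.Arrays;
import java.util.List;

final class CompactPrioritySketch {
    // Hypothetical per-store priority: headroom left before the store blocks.
    // Assumption: a smaller (or negative) value means the store is more urgent.
    static int storePriority(int blockingFileCount, int fileCount) {
        return blockingFileCount - fileCount;
    }

    // Region priority as the smallest store priority.
    static int regionPriorityMin(List<Integer> storePriorities) {
        int result = Integer.MAX_VALUE;
        for (int p : storePriorities) {
            result = Math.min(result, p);
        }
        return result;
    }

    // Region priority as the largest store priority.
    static int regionPriorityMax(List<Integer> storePriorities) {
        int result = Integer.MIN_VALUE;
        for (int p : storePriorities) {
            result = Math.max(result, p);
        }
        return result;
    }

    public static void main(String[] args) {
        int blocking = 7; // illustrative blocking threshold
        List<Integer> priorities = Arrays.asList(
            storePriority(blocking, 2),  // small store
            storePriority(blocking, 9),  // store already past the threshold
            storePriority(blocking, 4));
        System.out.println("min: " + regionPriorityMin(priorities));
        System.out.println("max: " + regionPriorityMax(priorities));
    }
}
```

Which aggregation is right therefore depends on the direction of the priority scale used by the compaction queue.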
Boosting the priority of a request after a specified amount of time has elapsed is a good
solution, but I'm afraid it will hurt the regions that need compaction more, as described in
my comment of 16/Jun/11 07:14.
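
For illustration only, the time-based boost could be sketched as a queue whose requests age continuously. This is not the attached patch: the class, field names, and the boost interval are hypothetical, and it assumes a lower number means higher priority. Each request's effective priority drops by one for every boost interval it has waited. Because "now" is the same for every queued request, ordering by basePriority * interval + enqueue time gives the same order and never needs re-sorting.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

final class AgingCompactionQueue {
    static final class Request {
        final String region;
        final int basePriority;       // assumption: lower value = more urgent
        final long enqueuedAtMillis;

        Request(String region, int basePriority, long enqueuedAtMillis) {
            this.region = region;
            this.basePriority = basePriority;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final long boostIntervalMillis;
    private final PriorityQueue<Request> queue;

    AgingCompactionQueue(long boostIntervalMillis) {
        this.boostIntervalMillis = boostIntervalMillis;
        Comparator<Request> byAgedKey = new Comparator<Request>() {
            @Override
            public int compare(Request a, Request b) {
                return Long.compare(key(a), key(b));
            }
        };
        this.queue = new PriorityQueue<Request>(11, byAgedKey);
    }

    // effective(now) = base - (now - enqueuedAt) / interval.
    // Multiplying by interval and dropping the shared "now" term leaves a
    // static sort key: base * interval + enqueuedAt.
    private long key(Request r) {
        return (long) r.basePriority * boostIntervalMillis + r.enqueuedAtMillis;
    }

    void add(String region, int basePriority, long nowMillis) {
        queue.add(new Request(region, basePriority, nowMillis));
    }

    // Returns the region of the most urgent request, or null if the queue is empty.
    String poll() {
        Request r = queue.poll();
        return r == null ? null : r.region;
    }
}
```

With a one-second interval, a priority-10 request queued at t=0 is served before a priority-1 request queued at t=20s, but still after one queued at t=5s. The concern raised above still applies: every slot given to an aged request is a slot taken from a region with many store files.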

> Outdated data can not be cleaned in time
> ----------------------------------------
>                 Key: HBASE-3969
>                 URL: https://issues.apache.org/jira/browse/HBASE-3969
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.90.1, 0.90.2, 0.90.3
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.4
>         Attachments: HBASE-3969-solution1-for-branch.patch, HBASE-3969-solution1.patch
> The compaction checker sends regions to the compaction queue to be compacted, but the
> priority of these regions is too low if they have only a few store files. Under high
> throughput, the compaction queue will always contain some regions with higher priority. This
> may cause a major compaction to be delayed for a long time (even a few days), and the cleaning
> of outdated data will be delayed as well.
> In our test case, we found some regions sent to the queue by the major compaction checker
> hanging in the queue for more than 2 days! Some scanners on these regions could not get
> available data for a long time, and their leases expired.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

