hbase-issues mailing list archives

From "zhoushuaifeng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3969) Outdated data can not be cleaned in time
Date Thu, 16 Jun 2011 07:14:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050250#comment-13050250 ]

zhoushuaifeng commented on HBASE-3969:
--------------------------------------

Hi St, the 3rd solution may not be so good. Among the regions in the queue, those whose file
count is close to or above blockingStoreFiles should have the highest priority, because
otherwise flushes will be blocked and puts will be affected. Second in importance are regions
that need a major compaction to clean outdated data, for the reason mentioned in this issue.
Regions with only a few files (for example, just reaching compactionThreshold) that need only
a minor compaction should have the lowest priority; it does not matter how long these regions
wait in the queue.

So I think setting the major compaction priority to a proper value (between 1 and
blockingStoreFiles - compactionThreshold) may be a better choice. What do you think?
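
To illustrate the ordering I have in mind, here is a minimal standalone sketch, not the real
regionserver code. It assumes that a smaller priority value is served first and that a minor
compaction request gets priority blockingStoreFiles minus its store file count. The class and
method names, the values 7 and 3, and the midpoint used for the major compaction priority are
all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class CompactionPrioritySketch {
    // Hypothetical values for illustration; not read from any real configuration.
    static final int BLOCKING_STORE_FILES = 7;
    static final int COMPACTION_THRESHOLD = 3;

    /** Assumed rule: the closer a region is to blocking, the smaller (more urgent) the value. */
    static int minorPriority(int storeFileCount) {
        return BLOCKING_STORE_FILES - storeFileCount;
    }

    /** Proposed: a fixed value between 1 and (blockingStoreFiles - compactionThreshold). */
    static int majorPriority() {
        int lowestMinor = BLOCKING_STORE_FILES - COMPACTION_THRESHOLD;
        // The midpoint is an arbitrary choice inside the suggested range.
        return Math.max(1, lowestMinor / 2);
    }

    /** A queued compaction request; smaller priority values are polled first. */
    static final class Request implements Comparable<Request> {
        final String region;
        final int priority;

        Request(String region, int priority) {
            this.region = region;
            this.priority = priority;
        }

        @Override
        public int compareTo(Request other) {
            return Integer.compare(priority, other.priority);
        }
    }

    /** Returns region names in the order the queue would serve them. */
    static List<String> drainOrder(List<Request> requests) {
        PriorityQueue<Request> queue = new PriorityQueue<>(requests);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().region);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Request> requests = new ArrayList<>();
        requests.add(new Request("few-files", minorPriority(3)));
        requests.add(new Request("major-due", majorPriority()));
        requests.add(new Request("near-blocking", minorPriority(7)));
        System.out.println(drainOrder(requests));
    }
}
```

With these numbers the region at the blocking limit is served first, the region waiting for a
major compaction second, and the region that only just reached compactionThreshold last.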

> Outdated data can not be cleaned in time
> ----------------------------------------
>
>                 Key: HBASE-3969
>                 URL: https://issues.apache.org/jira/browse/HBASE-3969
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.90.1, 0.90.2, 0.90.3
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.4
>
>         Attachments: HBASE-3969-solution1-for-branch.patch, HBASE-3969-solution1.patch
>
>
> The compaction checker sends regions to the compaction queue to be compacted. But the
> priority of these regions is too low if they have only a few storefiles. When throughput
> is high, the compaction queue will always hold some regions with higher priority. This
> may delay major compaction for a long time (even a few days), and the cleaning of outdated
> data will be delayed as well.
> In our test case, we found some regions sent to the queue by the major compaction checker
> hanging in the queue for more than 2 days! Some scanners on these regions could not get
> available data for a long time, and their leases expired.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
