hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3969) Outdated data can not be cleaned in time
Date Thu, 16 Jun 2011 17:49:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050594#comment-13050594 ]

stack commented on HBASE-3969:

@zhoushuaifeng I take it you are doing lots of deletes, or aging out lots of old versions?
Do you think ycsb represents what your actual load will be like?

You have identified an issue with our scheme, whereby we assign a priority on initial queuing;
while the priority may have been correct at the time, circumstances change over time.
It's as though the priority should change as the situation evolves?  If something has
been queued a long time, its priority should go up?  Perhaps go up only if it is a major compaction?
This would require us to add something that peeks at the queues periodically.
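
To make the aging idea above concrete, here is a minimal Java sketch. It is not HBase code and not what the attached patch does: the class, field, and parameter names (AgingCompactionQueue, agingIntervalMs, ageMajorOnly) are all hypothetical, and it assumes that a lower number means a more urgent request. Instead of a periodic pass over the queue, it recomputes an "effective" priority when a request is polled, which is one possible alternative to peeking on a period.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of priority aging for a compaction queue.
 * Assumption: lower number = more urgent. None of these names exist in HBase.
 */
final class AgingCompactionQueue {

    /** One queued request; initialPriority is fixed when the request is enqueued. */
    static final class Request {
        final String region;
        final int initialPriority;
        final long enqueuedAtMs;
        final boolean major;

        Request(String region, int initialPriority, long enqueuedAtMs, boolean major) {
            this.region = region;
            this.initialPriority = initialPriority;
            this.enqueuedAtMs = enqueuedAtMs;
            this.major = major;
        }
    }

    private final List<Request> pending = new ArrayList<>();
    private final long agingIntervalMs;
    private final boolean ageMajorOnly;

    AgingCompactionQueue(long agingIntervalMs, boolean ageMajorOnly) {
        if (agingIntervalMs <= 0) {
            throw new IllegalArgumentException("agingIntervalMs must be positive");
        }
        this.agingIntervalMs = agingIntervalMs;
        this.ageMajorOnly = ageMajorOnly;
    }

    void add(Request r) {
        pending.add(r);
    }

    /**
     * Initial priority improved by one step per full aging interval waited.
     * Aging never pushes a request past priority 1 and never makes it less urgent.
     */
    int effectivePriority(Request r, long nowMs) {
        if (ageMajorOnly && !r.major) {
            return r.initialPriority;
        }
        long waitedMs = Math.max(0L, nowMs - r.enqueuedAtMs);
        long steps = waitedMs / agingIntervalMs;
        long aged = Math.max(1L, r.initialPriority - steps);
        return (int) Math.min((long) r.initialPriority, aged);
    }

    /** Removes and returns the most urgent request as of nowMs, or null if empty. */
    Request poll(long nowMs) {
        Request best = null;
        for (Request r : pending) {
            if (best == null || compare(r, best, nowMs) < 0) {
                best = r;
            }
        }
        if (best != null) {
            pending.remove(best);
        }
        return best;
    }

    /** Orders by effective priority, then by time spent waiting (oldest first). */
    private int compare(Request a, Request b, long nowMs) {
        int c = Integer.compare(effectivePriority(a, nowMs), effectivePriority(b, nowMs));
        return c != 0 ? c : Long.compare(a.enqueuedAtMs, b.enqueuedAtMs);
    }
}
```

The linear scan in poll keeps the sketch short; a real queue would need a cheaper way to re-rank entries, which is where the periodic pass mentioned above might come back in.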

The solution of (between 1 and blockingStoreFiles - compactionThreshold) seems unsatisfactory,
don't you agree?  It's hard to tell how it will play out over time on a cluster.
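
For reference, the range under discussion can be written down as a small Java helper. This is only an illustration of the arithmetic, not the patch: the class and method names are hypothetical, and the thread does not say how an empty range should be handled, so the sketch simply rejects it.

```java
/**
 * Hypothetical helper showing the range 1 .. (blockingStoreFiles - compactionThreshold).
 * Assumption: lower number = more urgent. Not taken from the HBASE-3969 patch.
 */
final class MajorCompactionPriorityRange {

    /** Upper end of the range: blockingStoreFiles - compactionThreshold. */
    static int upperBound(int blockingStoreFiles, int compactionThreshold) {
        return blockingStoreFiles - compactionThreshold;
    }

    /** Clamps a candidate priority into [1, upperBound]; rejects an empty range. */
    static int clamp(int candidate, int blockingStoreFiles, int compactionThreshold) {
        int upper = upperBound(blockingStoreFiles, compactionThreshold);
        if (upper < 1) {
            // Assumption: how to treat this case is not specified in the thread.
            throw new IllegalArgumentException("empty priority range: 1.." + upper);
        }
        return Math.max(1, Math.min(candidate, upper));
    }
}
```

With example inputs of 7 and 3 (chosen only for illustration), the range is just 1..4, so many queued regions would end up sharing the same few priority values.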

Looking at your patch, you might want to do a check for hbase.hstore.blockingStoreFiles >

> Outdated data can not be cleaned in time
> ----------------------------------------
>                 Key: HBASE-3969
>                 URL: https://issues.apache.org/jira/browse/HBASE-3969
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.90.1, 0.90.2, 0.90.3
>            Reporter: zhoushuaifeng
>             Fix For: 0.90.4
>         Attachments: HBASE-3969-solution1-for-branch.patch, HBASE-3969-solution1.patch
> The compaction checker sends regions to the compaction queue to be compacted. But the priority
> of these regions is too low if they have only a few storefiles. When throughput is high,
> the compaction queue will always have some regions with higher priority. This
> may cause major compaction to be delayed for a long time (even a few days), and cleaning of
> outdated data will be delayed as well.
> In our test case, we found some regions sent to the queue by the major compaction checker hanging
> in the queue for more than 2 days! Some scanners on these regions could not get available data
> for a long time and their leases expired.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira