hbase-issues mailing list archives

From "jeongmin kim (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13474) stripe compaction selection is not working well when includeL0==true
Date Thu, 16 Apr 2015 06:35:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497623#comment-14497623 ]

jeongmin kim commented on HBASE-13474:

No, I have not seen it.
I was just reading the code to solve another problem, and it caught my eye.

I have not demonstrated it, but I think it will happen as the region grows larger.
(Of course, I may be wrong :)  )

includeL0 stays true until a compaction actually runs.
And the compaction keeps being cancelled until the HFile sizes satisfy the ratio.
But if a region is big, and so one of its HFiles is big (1 GB or more),
includeL0 makes the selection try to compact all files at once, and the selection is cancelled.
Flushes create tiny files that cannot be compacted until their combined size is large enough.

    FileSize(i) <= ( Sum(0,N,FileSize(_)) - FileSize(i) ) * ratio
    If the file size is 1 GB and the ratio is 1.2 (the default),
    the other files must add up to about 850 MB (1 GB / 1.2).
    Until then, there is no compaction on this region.
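
To make the arithmetic above concrete, here is a minimal sketch of the ratio inequality. It is not HBase code: the class and method names (RatioCheckSketch, filesInRatio, minOtherBytes) are made up for illustration, and it only assumes that the check behaves exactly as the inequality written above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the inequality above; names are illustrative, not HBase APIs.
class RatioCheckSketch {

    /** True if every file satisfies size(i) <= (sum of all sizes - size(i)) * ratio. */
    static boolean filesInRatio(List<Long> sizes, double ratio) {
        long total = 0;
        for (long s : sizes) {
            total += s;
        }
        for (long s : sizes) {
            if (s > (total - s) * ratio) {
                return false;
            }
        }
        return true;
    }

    /** Smallest combined size of the other files that lets a file of size `largest` pass. */
    static long minOtherBytes(long largest, double ratio) {
        return (long) Math.ceil(largest / ratio);
    }

    public static void main(String[] args) {
        long MB = 1L << 20;
        long GB = 1L << 30;

        // Combined size (in MB) the other files need before a 1 GB file passes at ratio 1.2.
        System.out.println(minOtherBytes(GB, 1.2) / MB);

        // One 1 GB file plus three 10 MB flush files: the 1 GB file fails the check.
        System.out.println(filesInRatio(Arrays.asList(GB, 10 * MB, 10 * MB, 10 * MB), 1.2));

        // Once the other files total 900 MB, the whole set passes.
        System.out.println(filesInRatio(Arrays.asList(GB, 500 * MB, 400 * MB), 1.2));
    }
}
```

With binary units, 1 GB / 1.2 works out to roughly 853 MB, which matches the "about 850 MB" estimate.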

> stripe compaction selection is not working well when includeL0==true
> --------------------------------------------------------------------
>                 Key: HBASE-13474
>                 URL: https://issues.apache.org/jira/browse/HBASE-13474
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 1.0.1, 0.98.12
>            Reporter: jeongmin kim
>            Assignee: jeongmin kim
>         Attachments: HBASE-13474.patch
> During HFile selection for stripe compaction,
> if includeL0==true, int minFiles is set to the number of all files in the stripe.
> This results in either a compaction of all the files in the stripe
> or no compaction at all (which is the problem).
> Stripe compaction uses exploring compaction internally.
> If some of the HFiles in the stripe are too big, the full set of files will never pass the ratio check,
> and the compaction will be cancelled.
> The next time a compaction is attempted, includeL0 will be true again,
> so compactions (even minor compactions) will almost never happen.
> Flushing keeps creating small HFiles while no compaction happens,
> so numerous tiny HFiles are going to pile up in the stripe, and it’s going to be a problem.
> There is no such thing as a major compaction in stripe compaction,
> but we need to compact every file of the stripe to drop deletes at some point.
> IMHO there is not one stripe in the region, there are many.
> When includeL0==true, compacting all HFiles in the stripe without selection is reasonable.
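
The all-or-nothing behaviour described in the issue can be sketched with a heavily simplified stand-in for an exploring-style selection. This is not the HBase implementation: the class StripeSelectionSketch, its methods, and the "pick the largest passing window" rule are assumptions made for illustration only. The point it shows is that when minFiles equals the number of files in the stripe, the only candidate is the whole stripe, so one oversized file blocks every selection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified selection; names and tie-breaking are illustrative, not HBase code.
class StripeSelectionSketch {

    /** True if every file satisfies size(i) <= (sum of all sizes - size(i)) * ratio. */
    static boolean filesInRatio(List<Long> sizes, double ratio) {
        long total = 0;
        for (long s : sizes) {
            total += s;
        }
        for (long s : sizes) {
            if (s > (total - s) * ratio) {
                return false;
            }
        }
        return true;
    }

    /**
     * Tries every contiguous window of at least minFiles files and returns the
     * one with the most files that passes the ratio check, or an empty list if
     * no window passes (the compaction is "cancelled").
     */
    static List<Long> select(List<Long> files, int minFiles, double ratio) {
        List<Long> best = new ArrayList<>();
        for (int start = 0; start < files.size(); start++) {
            for (int end = start + minFiles; end <= files.size(); end++) {
                List<Long> window = files.subList(start, end);
                if (filesInRatio(window, ratio) && window.size() > best.size()) {
                    best = new ArrayList<>(window);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Sizes in MB: one large HFile followed by three small flush files.
        List<Long> stripe = Arrays.asList(1024L, 10L, 10L, 10L);

        // With a small minFiles, the three small files can be compacted together.
        System.out.println(select(stripe, 3, 1.2));

        // With minFiles == number of files (the includeL0==true case), nothing is selected.
        System.out.println(select(stripe, stripe.size(), 1.2));
    }
}
```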

This message was sent by Atlassian JIRA