hbase-issues mailing list archives

From "Shuaifeng Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14735) Region may grow too big and can not be split
Date Fri, 27 Nov 2015 01:46:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029342#comment-15029342 ]

Shuaifeng Zhou commented on HBASE-14735:

Thanks a lot for the explanation, [~stack].
We hit this problem. The huge region cannot be compacted down to a few files because of the
high input load, and if it cannot be split, the input load always stays on that region, so
the situation gets worse and worse.
If the region is split in two, the input load is split and balanced across the two children.
Your worry about the patch is reasonable; we also ran into the reference file problem. After
we applied the patch on our cluster, the huge region still could not be split, because there
was a reference file that, for some reason, was never selected for compaction, and we had to
send a major compaction request to solve the problem. The patch may not fix an existing huge
region, but it can prevent one.
In the patch, we respect the rule that compaction comes first, but give split a chance if
the region is too big.
If a region splits before it grows too big, compaction on the children should be easier, and
can clean up the references in time, before the children grow too big.
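
To make the intended ordering concrete, here is a minimal sketch of the decision described above: compaction still comes first, a split gets a chance when no compaction could be queued and the region is over its size limit, and a remaining reference file still blocks the split. This is not the code in the attached patches; the class, method, and parameter names (`PostCompactionPolicy`, `decide`, `compactionQueued`, `hasReferenceFiles`) are hypothetical and only illustrate the idea.

```java
/** Hypothetical illustration of the post-compaction decision; not HBase code. */
final class PostCompactionPolicy {

    enum Action { RECURSIVE_COMPACTION, SPLIT, NONE }

    /**
     * @param compactPriority    store priority; a value <= 0 means the store is blocked
     * @param compactionQueued   true if the recursive enqueue actually selected files
     * @param regionSizeBytes    current size of the region
     * @param maxRegionSizeBytes size above which the region should be split
     * @param hasReferenceFiles  true if the region still holds reference files
     */
    static Action decide(int compactPriority, boolean compactionQueued,
                         long regionSizeBytes, long maxRegionSizeBytes,
                         boolean hasReferenceFiles) {
        // Compaction comes first, but only if a compaction was really queued.
        if (compactPriority <= 0 && compactionQueued) {
            return Action.RECURSIVE_COMPACTION;
        }
        // A region that still has reference files cannot be split yet.
        if (hasReferenceFiles) {
            return Action.NONE;
        }
        // Otherwise give the split a chance when the region is too big.
        if (regionSizeBytes > maxRegionSizeBytes) {
            return Action.SPLIT;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        long max = 10L * 1024 * 1024 * 1024;   // assumed 10 GB limit, for illustration only
        long huge = 400L * 1024 * 1024 * 1024; // a 400 GB region, as in the report
        System.out.println(decide(0, true, huge, max, false));
        System.out.println(decide(0, false, huge, max, false));
        System.out.println(decide(0, false, huge, max, true));
    }
}
```

The third call in `main` corresponds to the case we saw after applying the patch: the region is far too big, but the leftover reference file keeps it from splitting until a major compaction removes it.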

> Region may grow too big and can not be split
> --------------------------------------------
>                 Key: HBASE-14735
>                 URL: https://issues.apache.org/jira/browse/HBASE-14735
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction, regionserver
>    Affects Versions: 1.1.2, 0.98.15
>            Reporter: Shuaifeng Zhou
>            Assignee: Shuaifeng Zhou
>         Attachments: 14735-0.98.patch, 14735-branch-1.1.patch, 14735-branch-1.2.patch,
14735-branch-1.2.patch, 14735-master (2).patch, 14735-master.patch, 14735-master.patch
> When a compaction completes, there may still be many storefiles in the store and
> CompactPriority <= 0. In that case compactSplitThread issues a "Recursive enqueue"
> compaction request instead of requesting a split:
> {code:title=CompactSplitThread.java|borderStyle=solid}
>         if (completed) {
>           // degenerate case: blocked regions require recursive enqueues
>           if (store.getCompactPriority() <= 0) {
>             requestSystemCompaction(region, store, "Recursive enqueue");
>           } else {
>             // see if the compaction has caused us to exceed max region size
>             requestSplit(region);
>           }
>         }
> {code}
> But in some situations the "recursive enqueue" request may return null and not build
> a new compaction runner. For example, if another compaction of the same region is running,
> compaction selection excludes all files older than the newest file currently being compacted,
> so there may not be enough files left for the "recursive enqueue" request to select. When
> this happens, the split is not triggered. If the input load is high enough, compactions are
> always running on the region, and the split is never triggered.
> This happened in our cluster, and a huge region of more than 400GB with 100+ storefiles
> appeared. The version is 0.98.10, and trunk also has the problem.
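
The selection behaviour described above can be sketched with a small, self-contained model. It is not the HBase selection code: `SelectionSketch`, `candidates`, `wouldEnqueue`, and the `minFiles` threshold are hypothetical names, and files are represented by plain strings ordered from oldest to newest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of compaction file selection; not HBase code. */
final class SelectionSketch {

    /**
     * Returns the files that remain selectable: everything newer than the
     * newest file that is currently being compacted.
     *
     * @param filesOldestFirst all storefiles of the store, oldest first
     * @param compacting       files already taken by a running compaction
     */
    static List<String> candidates(List<String> filesOldestFirst, List<String> compacting) {
        int newestCompacting = -1;
        for (int i = 0; i < filesOldestFirst.size(); i++) {
            if (compacting.contains(filesOldestFirst.get(i))) {
                newestCompacting = i;
            }
        }
        // Skip the compacting files and everything older than the newest of them.
        return new ArrayList<>(
            filesOldestFirst.subList(newestCompacting + 1, filesOldestFirst.size()));
    }

    /** True if enough files remain for a new compaction request to be built. */
    static boolean wouldEnqueue(List<String> filesOldestFirst, List<String> compacting,
                                int minFiles) {
        return candidates(filesOldestFirst, compacting).size() >= minFiles;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("f1", "f2", "f3", "f4", "f5", "f6");
        List<String> running = Arrays.asList("f2", "f5");
        System.out.println(candidates(files, running));
        System.out.println(wouldEnqueue(files, running, 3));
    }
}
```

With `f2` and `f5` being compacted, only `f6` remains selectable, so a request that needs at least three files produces nothing. In the code path quoted above, that is the point where neither a new compaction nor a split gets scheduled.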

This message was sent by Atlassian JIRA
