hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7403) Online Merge
Date Sun, 10 Mar 2013 05:47:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598168#comment-13598168 ]

Ted Yu commented on HBASE-7403:
-------------------------------

{code}
+    if (!onSameRS) {
+      // Move region_b to region a's location
+      RegionPlan regionPlan = new RegionPlan(region_b, region_b_location,
+          region_a_location);
{code}
Can we consider load metrics here, so that the region with less load is moved onto the region
server where the region with more load resides?
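
One way to read this suggestion, as a minimal sketch only: compare a load metric for the two regions and move the lighter one. Nothing below comes from the patch. `MergeMovePlanner`, `RegionInfoLite`, `MovePlan`, and the `load` field are hypothetical stand-ins, and which metric to use (request count, store file size, or something else) is left open.

```java
import java.util.Objects;

/** Hypothetical sketch; not part of the HBASE-7403 patch and not an HBase API. */
final class MergeMovePlanner {

  /** Minimal stand-in for a region, the server hosting it, and one load metric. */
  static final class RegionInfoLite {
    final String encodedName;
    final String server;
    final long load; // assumed metric, e.g. request count or store file size

    RegionInfoLite(String encodedName, String server, long load) {
      this.encodedName = Objects.requireNonNull(encodedName);
      this.server = Objects.requireNonNull(server);
      this.load = load;
    }
  }

  /** Describes which region moves, and from which server to which server. */
  static final class MovePlan {
    final String regionToMove;
    final String source;
    final String destination;

    MovePlan(String regionToMove, String source, String destination) {
      this.regionToMove = regionToMove;
      this.source = source;
      this.destination = destination;
    }
  }

  /**
   * Returns null when both regions already share a server. Otherwise the
   * region with less load is moved to the server of the region with more load.
   * On a tie, region b moves, which matches what the patch snippet does today.
   */
  static MovePlan planMove(RegionInfoLite a, RegionInfoLite b) {
    if (a.server.equals(b.server)) {
      return null;
    }
    RegionInfoLite lighter = (a.load < b.load) ? a : b;
    RegionInfoLite heavier = (lighter == a) ? b : a;
    return new MovePlan(lighter.encodedName, lighter.server, heavier.server);
  }
}
```

In the patch itself, the result would presumably feed the existing `RegionPlan` constructor in place of the fixed region_b arguments.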
                
> Online Merge
> ------------
>
>                 Key: HBASE-7403
>                 URL: https://issues.apache.org/jira/browse/HBASE-7403
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.95.0, 0.94.6
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.95.0
>
>         Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 7403-v5.txt,
>                      7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch,
>                      hbase-7403-trunkv11.patch, hbase-7403-trunkv12.patch,
>                      hbase-7403-trunkv13.patch, hbase-7403-trunkv14.patch,
>                      hbase-7403-trunkv15.patch, hbase-7403-trunkv16.patch,
>                      hbase-7403-trunkv19.patch, hbase-7403-trunkv1.patch,
>                      hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch,
>                      hbase-7403-trunkv7.patch, hbase-7403-trunkv8.patch,
>                      hbase-7403-trunkv9.patch, merge region.pdf
>
>
> Features of this online merge:
> 1. Online: there is no need to disable the table
> 2. Few changes to the current code; it could be applied to trunk, 0.94, 0.92, or 0.90
> 3. Easy to issue a merge request: no need to enter a long region name, the encoded name is enough
> 4. No restrictions during operation: you don't need to take care of events such as server death, balancing, splits, or disabling/enabling a table, and you don't need to worry about sending a wrong merge request; this is already handled for you
> 5. Only a short offline time for the two merging regions
> Usage:
> 1.Tool:  
> bin/hbase org.apache.hadoop.hbase.util.OnlineMerge [-force] [-async] [-show] <table-name> <region-encodedname-1> <region-encodedname-2>
> 2.API: static void MergeManager#createMergeRequest
> We need merge in the following cases:
> 1. A region hole or region overlap that can’t be fixed by hbck
> 2. Regions become empty because of TTL and an unreasonable rowkey design
> 3. Regions are always empty or very small because of presplitting when the table was created
> 4. Too many empty or small regions reduce system performance (e.g. mslab)
> The current merge tools only work offline and cannot redo the merge if an exception is thrown during the process, which leaves dirty data.
> For an online system, we need an online merge.
> The implementation logic of this patch for online merge is as follows.
> For example, to merge regionA and regionB into regionC:
> 1. Offline the two regions A and B
> 2. Merge the two regions in HDFS (create regionC’s directory, move regionA’s and regionB’s files to regionC’s directory, delete regionA’s and regionB’s directories)
> 3. Add the merged regionC to .META.
> 4. Assign the merged regionC
> By design, once we do the merge work in HDFS, we can redo it until it succeeds if it throws an exception, aborts, or the server restarts, but it cannot be rolled back.

> It depends on the following:
> Use ZooKeeper to record the transaction journal state, which makes redo easier
> Use ZooKeeper to send/receive merge requests
> The merge transaction is executed on the master
> Support for issuing merge requests through the API or the shell tool
> For details of the merge process, please see the attachment and the patch
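
The four merge steps and the redo-until-successful behaviour described in the quoted issue can be illustrated with a small journal sketch. This is an assumption-laden simplification and not the patch's code: `MergeJournalSketch`, `Step`, `Journal`, and `StepAction` are hypothetical names, the journal is held in memory where the patch records state in ZooKeeper, and each step is assumed to be safe to repeat.

```java
/** Hypothetical sketch of a resumable merge journal; not taken from the patch. */
final class MergeJournalSketch {

  /** The steps, in the order the issue description lists them. */
  enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED }

  /** In-memory stand-in for the ZooKeeper-backed journal state. */
  static final class Journal {
    private int completed = 0; // number of steps already finished

    int completedSteps() {
      return completed;
    }

    void markCompleted(Step step) {
      completed = step.ordinal() + 1;
    }
  }

  /** The work for one step; assumed to be idempotent so it can be redone. */
  interface StepAction {
    void run(Step step) throws Exception;
  }

  /**
   * Runs every step that the journal has not yet recorded as completed.
   * Returns true when all steps have finished. On a failure it returns false
   * and leaves the journal at the last completed step, so a later call
   * redoes the failed step instead of starting over.
   */
  static boolean runOrResume(Journal journal, StepAction action) {
    Step[] steps = Step.values();
    for (int i = journal.completedSteps(); i < steps.length; i++) {
      try {
        action.run(steps[i]);
      } catch (Exception e) {
        return false; // caller retries later; finished steps are not repeated
      }
      journal.markCompleted(steps[i]);
    }
    return true;
  }
}
```

Calling `runOrResume` again after a failure, or after a restart with the journal reloaded, picks up at the step that failed. There is deliberately no rollback path, in line with the description's statement that the merge cannot be rolled back.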

