hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4729) Race between online altering and splitting kills the master
Date Tue, 15 Nov 2011 19:55:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150722#comment-13150722
] 

Hadoop QA commented on HBASE-4729:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12503776/4729.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -163 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 51 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit
warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.master.TestDistributedLogSplitting
                  org.apache.hadoop.hbase.replication.TestReplication
                  org.apache.hadoop.hbase.io.hfile.TestHFileBlock

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/256//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/256//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/256//console

This message is automatically generated.
                
> Race between online altering and splitting kills the master
> -----------------------------------------------------------
>
>                 Key: HBASE-4729
>                 URL: https://issues.apache.org/jira/browse/HBASE-4729
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: 4729.txt
>
>
> I was running an online alter while regions were splitting, and suddenly the master died
and left my table half-altered (haven't restarted the master yet).
> What killed the master:
> {quote}
> 2011-11-02 17:06:44,428 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected ZK exception
creating node CLOSING
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
for /hbase/unassigned/f7e1783e65ea8d621a4bc96ad310f101
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:459)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:441)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:769)
>         at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:568)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1722)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1661)
>         at org.apache.hadoop.hbase.master.BulkReOpen$1.run(BulkReOpen.java:69)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {quote}
> A znode was created because the region server was splitting the region 4 seconds before:
> {quote}
> 2011-11-02 17:06:40,704 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting
split of region TestTable,0012469153,1320253135043.f7e1783e65ea8d621a4bc96ad310f101.
> 2011-11-02 17:06:40,704 DEBUG org.apache.hadoop.hbase.regionserver.SplitTransaction:
regionserver:62023-0x132f043bbde0710 Creating ephemeral node for f7e1783e65ea8d621a4bc96ad310f101
in SPLITTING state
> 2011-11-02 17:06:40,751 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710
Attempting to transition node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING
to RS_ZK_REGION_SPLITTING
> ...
> 2011-11-02 17:06:44,061 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:62023-0x132f043bbde0710
Successfully transitioned node f7e1783e65ea8d621a4bc96ad310f101 from RS_ZK_REGION_SPLITTING
to RS_ZK_REGION_SPLIT
> 2011-11-02 17:06:44,061 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Still
waiting on the master to process the split for f7e1783e65ea8d621a4bc96ad310f101
> {quote}
> Now that the master is dead the region server is spewing those last two lines like mad.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message