Return-Path: X-Original-To: apmail-hbase-issues-archive@www.apache.org Delivered-To: apmail-hbase-issues-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 7B8AF9678 for ; Sat, 30 Jun 2012 02:51:45 +0000 (UTC) Received: (qmail 20725 invoked by uid 500); 30 Jun 2012 02:51:45 -0000 Delivered-To: apmail-hbase-issues-archive@hbase.apache.org Received: (qmail 20602 invoked by uid 500); 30 Jun 2012 02:51:45 -0000 Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@hbase.apache.org Received: (qmail 20575 invoked by uid 99); 30 Jun 2012 02:51:44 -0000 Received: from issues-vm.apache.org (HELO issues-vm) (140.211.11.160) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 30 Jun 2012 02:51:44 +0000 Received: from isssues-vm.apache.org (localhost [127.0.0.1]) by issues-vm (Postfix) with ESMTP id EE8651418F1 for ; Sat, 30 Jun 2012 02:51:43 +0000 (UTC) Date: Sat, 30 Jun 2012 02:51:43 +0000 (UTC) From: "Hadoop QA (JIRA)" To: issues@hbase.apache.org Message-ID: <1711779902.74510.1341024703979.JavaMail.jiratomcat@issues-vm> In-Reply-To: <1791500473.65769.1340870024027.JavaMail.jiratomcat@issues-vm> Subject: [jira] [Commented] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404375#comment-13404375 ] Hadoop QA commented on HBASE-6289: ---------------------------------- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12534084/HBASE-6289-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2298//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2298//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2298//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2298//console This message is automatically generated. > ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. > -------------------------------------------------------------------------------------------------------------------------- > > Key: HBASE-6289 > URL: https://issues.apache.org/jira/browse/HBASE-6289 > Project: HBase > Issue Type: Bug > Components: master > Affects Versions: 0.90.6, 0.94.0 > Reporter: Maryann Xue > Assignee: Maryann Xue > Priority: Critical > Attachments: HBASE-6289-v2.patch, HBASE-6289.patch > > > The ROOT RS has some network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. it calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. > {code} > private void verifyAndAssignRoot() > throws InterruptedException, IOException, KeeperException { > long timeout = this.server.getConfiguration(). > getLong("hbase.catalog.verification.timeout", 1000); > if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { > this.services.getAssignmentManager().assignRoot(); > } > } > {code} > After a few moments, this RS encounters DFS write problem and decides to abort. The RS then soon gets restarted from commandline, and constantly report: > {code} > 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 > 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 > 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 > 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 > 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira