Date: Wed, 16 May 2012 08:35:02 +0000 (UTC)
From: "chunhui shen (JIRA)"
To: issues@hbase.apache.org
Message-ID: <749522799.3357.1337157302880.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <68467124.22364.1336060735146.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5927) SSH and DisableTableHandler happening together does not clear the znode of the region and RIT map.
[ https://issues.apache.org/jira/browse/HBASE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276592#comment-13276592 ]

chunhui shen commented on HBASE-5927:
-------------------------------------

bq. But why should an assign happen and then go with reassign when we already know it is a disabling table

Yes, we don't need to reassign. But I think the problem lies in ServerShutdownHandler#processDeadRegion: why doesn't it return false for a region of a disabling table?
So I think we could modify ServerShutdownHandler#processDeadRegion slightly:
{code}
  public static boolean processDeadRegion(HRegionInfo hri, Result result,
      AssignmentManager assignmentManager, CatalogTracker catalogTracker)
      throws IOException {
    ...
    if (hri.isOffline() && hri.isSplit()) {
      LOG.debug("Offlined and split region " + hri.getRegionNameAsString() +
        "; checking daughter presence");
      if (MetaReader.getRegion(catalogTracker, hri.getRegionName()) == null) {
        return false;
      }
      fixupDaughters(result, assignmentManager, catalogTracker);
      return false;
    }
    // If the table is being disabled, do not reassign the region.
    boolean disabling = assignmentManager.getZKTable().isDisablingTable(
        hri.getTableNameAsString());
    if (disabling) {
      LOG.info("The table " + hri.getTableNameAsString()
          + " is disabling. Hence not assign it.");
      return false;
    }
    return true;
  }
{code}

> SSH and DisableTableHandler happening together does not clear the znode of the region and RIT map.
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5927
>                 URL: https://issues.apache.org/jira/browse/HBASE-5927
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.1, 0.96.0, 0.94.1
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5927_94.patch, HBASE-5927_94_v2.patch, HBASE-5927_trunk.patch, HBASE-5927_trunk_2.patch, TestCaseForReProduce.txt
>
>
> A possible exception: if the related regionserver was just killed (but HMaster has not yet detected that), we will get a local exception "Connection reset by peer". If this region belongs to a disabling table, what will happen?
> ServerShutdownHandler will remove this region from AM#regions, but the region still remains in RIT. TimeoutMonitor will take care of it once it times out and will then invoke unassign again. Since this region has already been removed from AM#regions, unassign returns immediately because of the code below:
> {code}
>     synchronized (this.regions) {
>       // Check if this region is currently assigned
>       if (!regions.containsKey(region)) {
>         LOG.debug("Attempted to unassign region " +
>           region.getRegionNameAsString() + " but it is not " +
>           "currently assigned anywhere");
>         return;
>       }
>     }
> {code}
> This leads to an endless loop: the region never leaves RIT, so TimeoutMonitor keeps retrying the unassign.
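
For readers who want to try the proposed rule in isolation, the following is a minimal, self-contained Java sketch of the decision discussed in this comment. It deliberately does not use the HBase API: `RegionState`, `shouldReassign`, and the `disablingTables` set are hypothetical stand-ins (assumptions for illustration only) for HRegionInfo, ServerShutdownHandler#processDeadRegion, and the ZKTable disabling check, and the daughter fix-up step is omitted.

```java
import java.util.Collections;
import java.util.Set;

// Illustrative model only; these types are NOT part of HBase.
class DeadRegionDecision {

    /** Hypothetical stand-in for the HRegionInfo fields consulted by the check. */
    static final class RegionState {
        final String tableName;
        final boolean offline;
        final boolean split;

        RegionState(String tableName, boolean offline, boolean split) {
            this.tableName = tableName;
            this.offline = offline;
            this.split = split;
        }
    }

    /**
     * Returns true if a region of a dead server should be reassigned.
     * Mirrors the order of checks in the proposal: an offlined split
     * parent is never reassigned, and neither is a region whose table
     * is currently being disabled.
     */
    static boolean shouldReassign(RegionState region, Set<String> disablingTables) {
        if (region.offline && region.split) {
            return false; // split parent: only its daughters matter
        }
        if (disablingTables.contains(region.tableName)) {
            return false; // table is disabling: do not assign
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed state: table "t1" is in the middle of being disabled.
        Set<String> disabling = Collections.singleton("t1");

        System.out.println(shouldReassign(new RegionState("t1", false, false), disabling)); // false
        System.out.println(shouldReassign(new RegionState("t2", false, false), disabling)); // true
        System.out.println(shouldReassign(new RegionState("t2", true, true), disabling));   // false
    }
}
```

In this model, skipping the reassign for a disabling table is a single early return, which is the essence of the change suggested above. Whether the real handler must also clear the region's znode and RIT entry at that point is a separate question that the sketch does not address.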