Date: Fri, 9 Nov 2012 14:40:12 +0000 (UTC)
From: "ramkrishna.s.vasudevan (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-7103) Need to fail split if SPLIT znode is deleted even before the split is completed.

    [ https://issues.apache.org/jira/browse/HBASE-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494025#comment-13494025 ]

ramkrishna.s.vasudevan commented on HBASE-7103:
-----------------------------------------------

@Lars
HBASE-6088 added the new journal entry, because previously STARTED_SPLITTING was never added. What happened was this: if writing the RS_ZK_SPLITTING data failed after the node had been created, rollback took no action, and so subsequent splits never happened.
bq. can't we keep dictionary keyed by region of currently splitting regions in the RS?

But the dictionary must be cleared properly once the transition is done. There is a chance of a race between the time we remove an entry and the time we check whether it is already present. Maybe we need to cross-verify with the online regions list on the RS side.

> Need to fail split if SPLIT znode is deleted even before the split is completed.
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-7103
>                 URL: https://issues.apache.org/jira/browse/HBASE-7103
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: HBASE-7103_testcase.patch
>
>
> This came up after the following mail in dev list
> 'infinite loop of RS_ZK_REGION_SPLIT on .94.2'.
> The reason for the problem is the following sequence of steps:
> -> Initially the parent region P1 starts splitting.
> -> The split is going on normally.
> -> Another split starts at the same time for the same region P1. (Not sure why this started).
> -> Rollback happens on seeing an already existing node.
> -> This node gets deleted in rollback and the nodeDeleted event starts.
> -> In the nodeDeleted event the RIT for the region P1 gets deleted.
> -> Because of this there is no region in RIT.
> -> Now the first split finishes. Here the problem is that we try to transition the node from SPLITTING to SPLIT, but the node does not even exist.
> We don't take any action on this. We think it is successful.
> -> Because of this SplitRegionHandler never gets invoked.
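
The sequence in the description can be replayed with a minimal sketch. It is not HBase code: an in-memory map stands in for ZooKeeper, and the class `SplitZnodeSketch` with its methods `createSplitting`, `delete`, `transition`, and `completeSplit` are hypothetical names chosen for illustration. The one point it tries to show is that the SPLITTING to SPLIT transition reports failure when the znode is missing, so the split can be failed instead of being treated as successful.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not HBase code: a map stands in for ZooKeeper.
class SplitZnodeSketch {
    enum State { SPLITTING, SPLIT }

    // znode path -> current state
    private final Map<String, State> znodes = new ConcurrentHashMap<>();

    // Creates the znode in SPLITTING; returns false if it already exists.
    boolean createSplitting(String path) {
        return znodes.putIfAbsent(path, State.SPLITTING) == null;
    }

    // Deletes the znode, as the rollback of the second split does.
    void delete(String path) {
        znodes.remove(path);
    }

    // Atomically moves the znode from 'expected' to 'next'.
    // Returns false if the znode is missing or in another state.
    boolean transition(String path, State expected, State next) {
        return znodes.replace(path, expected, next);
    }

    // Completes the split, failing it when the transition did not happen.
    void completeSplit(String path) {
        if (!transition(path, State.SPLITTING, State.SPLIT)) {
            throw new IllegalStateException(
                "znode " + path + " is missing or not in SPLITTING; failing the split");
        }
    }

    public static void main(String[] args) {
        SplitZnodeSketch zk = new SplitZnodeSketch();
        String p1 = "/unassigned/P1"; // assumed path, for illustration only

        zk.createSplitting(p1);                  // first split starts
        boolean second = zk.createSplitting(p1); // second split sees the node
        if (!second) {
            zk.delete(p1);                       // its rollback deletes the node
        }
        try {
            zk.completeSplit(p1);                // first split tries SPLITTING -> SPLIT
            System.out.println("split completed");
        } catch (IllegalStateException e) {
            System.out.println("split failed: " + e.getMessage());
        }
    }
}
```

In this sketch the decision to throw lives in `completeSplit` rather than in `transition`, so a caller that only wants to probe the state can still do so without exception handling. Whether the real fix should raise an error or return a status is left open here.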