Date: Wed, 5 Oct 2011 18:06:30 +0000 (UTC)
From: "ramkrishna.s.vasudevan (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <357829663.82.1317837990826.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <605547947.11024.1317807454218.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4540) OpenedRegionHandler is not enforcing atomicity of the operation it is performing

[ 
https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121299#comment-13121299 ]

ramkrishna.s.vasudevan commented on HBASE-4540:
-----------------------------------------------

The znode version is needed to handle the following scenario:
-> RS1 opens a region and transitions its znode to OPENED.
-> OpenedRegionHandler has still not processed that transition.
-> RS1 goes down and the region is assigned to RS2.
-> RS2 transitions the znode to OPENED.
-> Now the OpenedRegionHandler for RS1 tries to delete the znode, and the delete succeeds because the node is in the OPENED state, so the handler assumes the region is on RS1.
-> To avoid this scenario, I have tried to use the znode version that comes with the callback we get after the node is transitioned to the OPENED state.

> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4540
>                 URL: https://issues.apache.org/jira/browse/HBASE-4540
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to the OFFLINE state by the master, or to the OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it is in the OPENED state, but fails.
> -> Though the delete fails, the handler removes the region from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open the region because the region has already been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in the state null and not in expected PENDING_OPEN or OPENING states
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
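
The version check described in the comment above can be illustrated with a small, self-contained sketch. This is not the code in HBASE-4540_1.patch and it does not use the HBase or ZooKeeper classes: `Store`, `Znode`, `setState` and `deleteIfVersion` are hypothetical in-memory stand-ins, assumed only for illustration. The idea it models is the one ZooKeeper's own `delete(path, version)` provides, where a delete with a stale version is rejected instead of removing a node that another server has since updated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of versioned znodes; not the HBase or ZooKeeper API.
class VersionedZnodeSketch {

    /** A node payload plus a version that is bumped on every write. */
    static final class Znode {
        final String state;
        final int version;

        Znode(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    static final class Store {
        private final Map<String, Znode> nodes = new HashMap<>();

        /** Creates the node at version 0 and returns that version. */
        synchronized int create(String path, String state) {
            nodes.put(path, new Znode(state, 0));
            return 0;
        }

        /** Overwrites the state and returns the new version. */
        synchronized int setState(String path, String state) {
            Znode old = nodes.get(path);
            if (old == null) {
                throw new IllegalStateException("no node: " + path);
            }
            int next = old.version + 1;
            nodes.put(path, new Znode(state, next));
            return next;
        }

        /** Deletes only if the node still has the version the caller saw. */
        synchronized boolean deleteIfVersion(String path, int expectedVersion) {
            Znode cur = nodes.get(path);
            if (cur == null || cur.version != expectedVersion) {
                return false; // stale caller: leave the node untouched
            }
            nodes.remove(path);
            return true;
        }

        synchronized boolean exists(String path) {
            return nodes.containsKey(path);
        }
    }

    public static void main(String[] args) {
        Store zk = new Store();
        zk.create("R1", "M_ZK_REGION_OFFLINE");
        zk.setState("R1", "RS_ZK_REGION_OPENING");                 // RS1 starts opening
        int seenForRs1 = zk.setState("R1", "RS_ZK_REGION_OPENED"); // version in RS1's callback

        // RS1 dies before its handler runs; the region is reassigned to RS2.
        zk.setState("R1", "M_ZK_REGION_OFFLINE");
        zk.setState("R1", "RS_ZK_REGION_OPENING");
        int seenForRs2 = zk.setState("R1", "RS_ZK_REGION_OPENED");

        // The late handler for RS1 sees OPENED, but its version no longer matches.
        System.out.println("stale delete applied: " + zk.deleteIfVersion("R1", seenForRs1));
        // The handler for RS2 holds the current version, so its delete goes through.
        System.out.println("current delete applied: " + zk.deleteIfVersion("R1", seenForRs2));
    }
}
```

In this sketch a handler would update the in-memory region state only when `deleteIfVersion` returns true, so a late handler for RS1 neither removes RS2's node nor records RS1 as the owner.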