Date: Thu, 5 Jul 2012 04:48:34 +0000 (UTC)
From: "Zhihong Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-6329) Stop META regionserver when splitting region could cause daughter region assign twice

     [ https://issues.apache.org/jira/browse/HBASE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6329:
----------------------------------

    Status: Patch Available  (was: Open)

> Stop META regionserver when splitting region could cause daughter region assign twice
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-6329
>                 URL: https://issues.apache.org/jira/browse/HBASE-6329
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6329v1.patch
>
>
> We found this issue in 0.94. First, let me describe the case:
> Stop the META rs while a split is in progress.
> 1. The META rs (Server A) is stopping.
> 2. The main thread of the rs closes ZK and deletes the rs's ephemeral node.
> 3. SplitTransaction is retrying MetaEditor.addDaughter.
> 4. Master's ServerShutdownHandler processes the dead META server.
> 5. Master fixes up the daughter and assigns it.
> 6. The daughter is opened on another server (Server B).
> 7. Server A's SplitTransaction successfully adds the daughter to .META. with serverName=Server A.
> 8. Now, in .META., the daughter's region location is Server A, but it is online on Server B.
> 9. Restart Master, and the master will assign the daughter again.
> Attaching the logs for daughter region 80f999ea84cb259e20e9a228546f6c8a.
> Master log:
> 2012-07-04 13:45:56,493 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for dw93.kgb.sqa.cm4,60020,1341378224464
> 2012-07-04 13:45:58,983 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Fixup; missing daughter writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a.
> 2012-07-04 13:45:58,985 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a., serverName=null
> 2012-07-04 13:45:58,988 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a. to dw88.kgb.sqa.cm4,60020,1341379188777
> 2012-07-04 13:46:00,201 INFO org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the region writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a. that was online on dw88.kgb.sqa.cm4,60020,1341379188777
> Master log after restart:
> 2012-07-04 14:27:05,824 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x136187d60e34644 Creating (or updating) unassigned node for 80f999ea84cb259e20e9a228546f6c8a with OFFLINE state
> 2012-07-04 14:27:05,851 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing region writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a. in state M_ZK_REGION_OFFLINE
> 2012-07-04 14:27:05,854 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a. to dw93.kgb.sqa.cm4,60020,1341380812020
> 2012-07-04 14:27:06,051 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=dw93.kgb.sqa.cm4,60020,1341380812020, region=80f999ea84cb259e20e9a228546f6c8a
> Regionserver (META rs) log:
> 2012-07-04 13:45:56,491 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server dw93.kgb.sqa.cm4,60020,1341378224464; zookeeper connection closed.
> 2012-07-04 13:46:11,951 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a., serverName=dw93.kgb.sqa.cm4,60020,1341378224464
> 2012-07-04 13:46:11,952 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Done with post open deploy task for region=writetest,JC\xCA\xC8\xCFOH\xCEV\xCC\xC2\xB5\xC2@\xD4,1341380730558.80f999ea84cb259e20e9a228546f6c8a., daughter=true
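
The interleaving in steps 5 to 8 of the quoted description can be replayed with a small toy model. This is only a sketch under stated assumptions: `SplitRaceSketch`, its maps, and `addDaughterIfAbsent` are hypothetical names invented for illustration, not HBase classes or methods, and the guarded write is not a description of what HBASE-6329v1.patch does. The sketch also assumes that the server opening the region records itself in .META., which the report does not state explicitly.

```java
import java.util.HashMap;
import java.util.Map;

// Toy replay of steps 5-8; everything here is an illustrative stand-in, not an HBase API.
class SplitRaceSketch {
    static final String REGION = "80f999ea84cb259e20e9a228546f6c8a";
    static final String SERVER_A = "dw93.kgb.sqa.cm4,60020,1341378224464"; // stopping META rs
    static final String SERVER_B = "dw88.kgb.sqa.cm4,60020,1341379188777"; // server that opens the daughter

    // region -> serverName recorded in .META. (a null value means "row exists, no location yet")
    final Map<String, String> meta = new HashMap<>();
    // region -> server on which the region is actually online
    final Map<String, String> online = new HashMap<>();

    // Unconditional write, in the spirit of the "Added daughter ..., serverName=..." log lines.
    void addDaughter(String region, String serverName) {
        meta.put(region, serverName);
    }

    // Hypothetical guard: refuse the write when a row for the region already exists.
    boolean addDaughterIfAbsent(String region, String serverName) {
        if (meta.containsKey(region)) {
            return false;
        }
        meta.put(region, serverName);
        return true;
    }

    // Assumption of this sketch: the opening server records itself in .META.
    void open(String region, String serverName) {
        online.put(region, serverName);
        meta.put(region, serverName);
    }

    // True when .META. and the real hosting server disagree (the state described in step 8).
    boolean inconsistent(String region) {
        String actual = online.get(region);
        return actual != null && !actual.equals(meta.get(region));
    }

    // Replays the interleaving; 'guarded' selects the hypothetical conditional write for step 7.
    static SplitRaceSketch replay(boolean guarded) {
        SplitRaceSketch s = new SplitRaceSketch();
        s.addDaughter(REGION, null);   // step 5: master fixup, serverName=null
        s.open(REGION, SERVER_B);      // step 6: daughter opened on Server B
        if (guarded) {                 // step 7: Server A's delayed retry finally lands
            s.addDaughterIfAbsent(REGION, SERVER_A);
        } else {
            s.addDaughter(REGION, SERVER_A);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("unguarded inconsistent: " + replay(false).inconsistent(REGION));
        System.out.println("guarded inconsistent: " + replay(true).inconsistent(REGION));
    }
}
```

In this model the unguarded replay ends with .META. pointing at Server A while the region is online on Server B, which is the precondition for the second assignment after the master restart in step 9. The conditional write is only one conceivable way to reject the stale update.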