Date: Thu, 16 Aug 2012 11:48:38 +1100 (NCT)
From: "Zhihong Ted Yu (JIRA)"
To: issues@hbase.apache.org
Message-ID: <451395997.17104.1345078118096.JavaMail.jiratomcat@arcas>
In-Reply-To: <439026536.12010.1345013917960.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (HBASE-6587) Region would be assigned twice in the case of all RS offline

     [ https://issues.apache.org/jira/browse/HBASE-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6587:
----------------------------------

    Attachment: 6587.patch

Patch with minor reformatting.
Going to integrate tomorrow if there is no objection.

> Region would be assigned twice in the case of all RS offline
> ------------------------------------------------------------
>
>                 Key: HBASE-6587
>                 URL: https://issues.apache.org/jira/browse/HBASE-6587
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6587.patch, HBASE-6587.patch
>
>
> In the TimeoutMonitor, we act on timeout for regions if (this.allRegionServersOffline && !noRSAvailable).
> The code is as follows:
> {code}
>       if (regionState.getStamp() + timeout <= now ||
>           (this.allRegionServersOffline && !noRSAvailable)) {
>         // decide on action upon timeout or, if some RSs just came back online, we can start
>         // the assignment
>         actOnTimeOut(regionState);
>       }
> {code}
> However, there is a bug: it can act on timeout for a region that was assigned only a moment ago, causing the region to be assigned twice.
> Master log for the region 277b9b6df6de2b9be1353b4fa25f4222:
> {code}
> 2012-08-14 20:42:54,367 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Unable to determine a plan to assign .META.,,1.1028785192 state=OFFLINE, ts=1344948174367, server=null
> 2012-08-14 20:44:31,640 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for writetest,VHXYHJN0BL48HMR4DI1L,1344925649429.277b9b6df6de2b9be1353b4fa25f4222. so generated a random one; hri=writetest,VHXYHJN0BL48HMR4DI1L,1344925649429.277b9b6df6de2b9be1353b4fa25f4222., src=, dest=dw92.kgb.sqa.cm4,60020,1344948267642; 1 (online=1, available=1) available servers
> 2012-08-14 20:44:31,640 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x438f53bbf9b0acd Creating (or updating) unassigned node for 277b9b6df6de2b9be1353b4fa25f4222 with OFFLINE state
> 2012-08-14 20:44:31,643 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,VHXYHJN0BL48HMR4DI1L,1344925649429.277b9b6df6de2b9be1353b4fa25f4222. to dw92.kgb.sqa.cm4,60020,1344948267642
> 2012-08-14 20:44:32,291 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=dw92.kgb.sqa.cm4,60020,1344948267642, region=277b9b6df6de2b9be1353b4fa25f4222
> // abnormal timeout
> 2012-08-14 20:44:32,518 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  writetest,VHXYHJN0BL48HMR4DI1L,1344925649429.277b9b6df6de2b9be1353b4fa25f4222. state=OPENING, ts=1344948272279, server=dw92.kgb.sqa.cm4,60020,1344948267642
> 2012-08-14 20:44:32,518 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPENING for too long, reassigning region=writetest,VHXYHJN0BL48HMR4DI1L,1344925649429.277b9b6df6de2b9be1353b4fa25f4222.
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
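
The condition quoted in the issue description can be illustrated with a minimal Java sketch. This is not HBase code: the class name TimeoutCheckSketch, the method shouldActOnTimeout, and the 180000 ms timeout are assumptions made up for illustration. Only the boolean expression and the OPENING timestamp are taken from the message above. The sketch shows why the second half of the condition can fire for a region whose state stamp is only a few hundred milliseconds old: that half never looks at the stamp.

```java
// Hypothetical, simplified model of the check quoted in the issue.
// Class and method names are illustrative and do not exist in HBase.
public class TimeoutCheckSketch {

    /** Same boolean expression as the quoted TimeoutMonitor snippet. */
    public static boolean shouldActOnTimeout(long stamp, long timeout, long now,
                                             boolean allRegionServersOffline,
                                             boolean noRSAvailable) {
        return stamp + timeout <= now
                || (allRegionServersOffline && !noRSAvailable);
    }

    public static void main(String[] args) {
        long timeout = 180_000L;      // assumed value, chosen only for the example
        long stamp = 1344948272279L;  // ts of the OPENING state in the master log
        long now = stamp + 239;       // 20:44:32,518 in the log is 239 ms after the stamp

        // The flag is still set although one server has come back online:
        // the second disjunct is true regardless of how fresh the stamp is.
        System.out.println(shouldActOnTimeout(stamp, timeout, now, true, false));

        // With the flag cleared, only a genuinely old stamp triggers the action.
        System.out.println(shouldActOnTimeout(stamp, timeout, now, false, false));
        System.out.println(shouldActOnTimeout(stamp, timeout, stamp + timeout, false, false));
    }
}
```

Under these assumptions, the first call returns true even though the region entered OPENING 239 ms earlier, which matches the "abnormal timeout" marked in the log. The sketch only illustrates the reported symptom and says nothing about how the attached patch resolves it.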