Date: Tue, 5 Jul 2011 08:03:19 +0800
Subject: Re: Region not online after split by a closing RS
From: Weihua JIANG
To: dev@hbase.apache.org

Neither daughter region is online.

Thanks
Weihua

2011/7/4 Ted Yu:
> In the future, please direct questions on CDH releases to
> cdh-dev@cloudera.org
> You may cc dev@hbase.apache.org
>
> There is more than one minute difference between master and RS logs.
> Which one of the daughter regions didn't come online?
>
> Cheers
>
> On Mon, Jul 4, 2011 at 5:30 AM, Weihua JIANG wrote:
>
>> The HBase version we are using is CDH3U0.
>>
>> Thanks
>> Weihua
>>
>> 2011/7/4 Weihua JIANG:
>> > Hi all,
>> >
>> > We encountered a problem with a region not coming online. A region was
>> > split by a closing RS, and then this RS went down. It seems the master
>> > knew about this split but did not try to bring it online.
>> >
>> > Log from master:
>> > 2011-06-30 22:58:52,945 DEBUG org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Offlined and split region CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309422002877.de5cb72653d016804cbd16f4a71470cd.; checking daughter presence
>> > 2011-06-30 22:58:52,946 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=hadoop01.sh.intel.com,50820,1309421825940, region=ed60ec735e30db1d99290995eb1cd2d7
>> > 2011-06-30 22:58:53,005 DEBUG org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309445753679.e8054c8476b50e7648af747011d0c77e.
>> > present
>> > 2011-06-30 22:58:53,065 DEBUG org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,40277780931201101,1309445753679.64d28c449c062d5ac569f8619a75c294. present
>> >
>> > Log from RS is:
>> > 2011-06-30 22:57:05,207 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 73 on 50820 caught: java.nio.channels.ClosedChannelException
>> >         at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>> >         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>> >         at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1342)
>> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
>> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
>> >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)
>> >
>> > 2011-06-30 22:57:05,207 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 73 on 50820: exiting
>> > 2011-06-30 22:57:05,767 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver50820 closing leases
>> > 2011-06-30 22:57:05,768 INFO org.apache.hadoop.hbase.regionserver.Leases: regionserver50820 closed leases
>> > 2011-06-30 22:57:05,768 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x130ba69074900b4
>> > 2011-06-30 22:57:05,781 INFO org.apache.zookeeper.ZooKeeper: Session: 0x130ba69074900b4 closed
>> > 2011-06-30 22:57:05,781 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>> > 2011-06-30 22:57:05,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated
>> > CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309445753679.e8054c8476b50e7648af747011d0c77e.
>> > 2011-06-30 22:57:05,863 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated CMCC_Detail_ReversePhoneMonth__DateCat_NONE,40277780931201101,1309445753679.64d28c449c062d5ac569f8619a75c294.
>> > 2011-06-30 22:57:05,911 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309422002877.de5cb72653d016804cbd16f4a71470cd. in META
>> > 2011-06-30 22:57:05,942 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309445753679.e8054c8476b50e7648af747011d0c77e. in region .META.,,1, serverInfo=null
>> > 2011-06-30 22:57:05,943 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Not opening daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309445753679.e8054c8476b50e7648af747011d0c77e. because stopping=false, stopped=true
>> > 2011-06-30 22:57:05,950 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,40277780931201101,1309445753679.64d28c449c062d5ac569f8619a75c294. in region .META.,,1, serverInfo=null
>> > 2011-06-30 22:57:05,950 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Not opening daughter CMCC_Detail_ReversePhoneMonth__DateCat_NONE,40277780931201101,1309445753679.64d28c449c062d5ac569f8619a75c294. because stopping=false, stopped=true
>> > 2011-06-30 22:57:06,004 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Region split, META updated, and report to master.
>> > Parent=CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309422002877.de5cb72653d016804cbd16f4a71470cd., new regions: CMCC_Detail_ReversePhoneMonth__DateCat_NONE,39999999999999960,1309445753679.e8054c8476b50e7648af747011d0c77e., CMCC_Detail_ReversePhoneMonth__DateCat_NONE,40277780931201101,1309445753679.64d28c449c062d5ac569f8619a75c294.. Split took 1mins, 12sec
>> > 2011-06-30 22:57:06,004 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Waiting for Split Thread to finish...
>> > 2011-06-30 22:57:06,004 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Waiting for Large Compaction Thread to finish...
>> > 2011-06-30 22:57:06,004 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Waiting for Small Compaction Thread to finish...
>> > 2011-06-30 22:57:06,004 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver50820 exiting
>> > 2011-06-30 22:57:06,090 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-15,5,main]
>> > 2011-06-30 22:57:06,090 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
>> > 2011-06-30 22:57:06,090 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
>> > 2011-06-30 22:57:06,196 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
>> >
>> >
>> > Thanks
>> > Weihua
>> >
>> >
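
For anyone trying to follow the sequence in the quoted logs, here is a rough toy model of what they appear to show: the stopped RS adds both daughters to META without a server and skips opening them, and the master later finds the daughter rows "present". This is only a sketch under assumptions. None of the names below (SplitRaceSketch, splitOnRegionServer, daughterPresent, unassigned) are HBase classes or APIs, the map merely stands in for .META., and whether the master does anything beyond the presence check is not visible in these logs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not HBase code.
public class SplitRaceSketch {

    // Region server side: write each daughter to the stand-in catalog, then
    // open it only if the server is not stopped (compare the
    // "Not opening daughter ... stopped=true" log lines).
    public static void splitOnRegionServer(Map<String, String> meta, String server,
                                           boolean stopped, String... daughters) {
        for (String daughter : daughters) {
            meta.put(daughter, null);        // row added with no server assigned
            if (!stopped) {
                meta.put(daughter, server);  // daughter opened and hosted here
            }
        }
    }

    // Master side as the "Daughter ... present" lines suggest (an assumption):
    // only the existence of the catalog row is checked.
    public static boolean daughterPresent(Map<String, String> meta, String daughter) {
        return meta.containsKey(daughter);
    }

    // Daughters that have a catalog row but no hosting server.
    public static List<String> unassigned(Map<String, String> meta) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : meta.entrySet()) {
            if (entry.getValue() == null) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        // The RS is already stopped when the split completes.
        splitOnRegionServer(meta, "rs-1", true, "daughterA", "daughterB");
        System.out.println("present: " + daughterPresent(meta, "daughterA"));
        System.out.println("unassigned: " + unassigned(meta));
    }
}
```

In this model both daughters pass the presence check yet have no hosting server, which matches the symptom reported above. It says nothing about where the actual fix belongs.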