Date: Tue, 23 Nov 2010 10:52:41 -0800
Subject: Re: problem starting HBase
From: Jean-Daniel Cryans
To: dev@hbase.apache.org

So before that first line... everything was fine? What happened there?

On Tue, Nov 23, 2010 at 10:48 AM, Ted Yu wrote:
> Here is relevant portion of master log:
> http://pastebin.com/HiembAxc
>
> On Tue, Nov 23, 2010 at 10:37 AM, Jean-Daniel Cryans wrote:
>
>> Can you dig up to where it started doing that?
>>
>> On Tue, Nov 23, 2010 at 10:27 AM, Ted Yu wrote:
>> > http://pastebin.com/E86iPnK4
>> >
>> > On Tue, Nov 23, 2010 at 10:23 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >
>> >> Is that really from the master log? Can we get the full log in a pastebin?
>> >>
>> >> J-D
>> >>
>> >> On Tue, Nov 23, 2010 at 7:40 AM, Ted Yu wrote:
>> >> > I backed up zookeeper dataDir to another location.
>> >> > After clearing zookeeper dataDir, HMaster still couldn't start:
>> >> >
>> >> > 2010-11-23 15:30:56,095 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 10.202.50.100:60000
>> >> > 2010-11-23 15:30:56,119 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/root-region-server
>> >> > 2010-11-23 15:30:56,120 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Sleeping 5000ms, waiting for root region.
>> >> > 2010-11-23 15:31:01,125 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/root-region-server
>> >> > 2010-11-23 15:31:01,125 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Sleeping 5000ms, waiting for root region.
>> >> >
>> >> > Disk isn't full:
>> >> > /dev/md2             2786058952 186234928 2456017408   8% /
>> >> >
>> >> > Comment is appreciated.
>> >> >
>> >> > On Tue, Nov 23, 2010 at 5:34 AM, Ted Yu wrote:
>> >> >
>> >> >> I tried to restart hbase.
>> >> >> But the region server identified by ZNode /hbase/root-region-server declared that it is not serving root region:
>> >> >>
>> >> >> 2010-11-23 13:26:49,617 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: locateRegionInMeta attempt 1 of 3 failed; retrying after sleep of 5000 because: Timed out trying to locate root region because: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> >> >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2274)
>> >> >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:1711)
>> >> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
>> >> >>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:998)
>> >> >>
>> >> >> 2010-11-23 13:26:54,622 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
>> >> >> 2010-11-23 13:26:54,624 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Root region location changed. Sleeping.
>> >> >> 2010-11-23 13:26:59,626 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Wake. Retry finding root region.
>> >> >> 2010-11-23 13:26:59,629 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
>> >> >> 2010-11-23 13:26:59,630 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Root region location changed. Sleeping.
>> >> >> 2010-11-23 13:27:04,632 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Wake. Retry finding root region.
>> >> >> 2010-11-23 13:27:04,635 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.202.50.111:60020
>> >> >>
>> >> >> What should I do next ?
>> >> >>
>> >> >> Thanks
>> >> >>
>> >> >> On Tue, Nov 23, 2010 at 1:37 AM, Lars George wrote:
>> >> >>
>> >> >>> Hi Ted,
>> >> >>>
>> >> >>> So one of the regions is not being released? Could you try and see from .META. which is still deployed and use the shell's "close_region" to close it while looking at the master and region server logs to see what is going on? Maybe best if you switch the RS to DEBUG level logging first to get some info?
>> >> >>>
>> >> >>> Lars
>> >> >>>
>> >> >>> On Tue, Nov 23, 2010 at 8:25 AM, Ted Yu wrote:
>> >> >>> > Hi
>> >> >>> > We use 0.20.6
>> >> >>> >
>> >> >>> > I tried to disable packageindex table.
>> >> >>> > From master log:
>> >> >>> >
>> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,CC7E6FEA4CDCF19C6F4AC9BB51EF6A33,1290230596786 to setClosing list for us01-ciqps1-grid10.carrieriq.com,60020,1290493641949
>> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,F2A18967F48C9FDA9C23BF9A8210ED17,1290230394345 to setClosing list for us01-ciqps1-grid11.carrieriq.com,60020,1290493641228
>> >> >>> > 2010-11-23 07:21:06,326 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,E8FA713B2F030EF012E5AB0A641CB1DB,1290230356969 to setClosing list for us01-ciqps1-grid11.carrieriq.com,60020,1290493641228
>> >> >>> > 2010-11-23 07:21:06,327 DEBUG org.apache.hadoop.hbase.master.ChangeTableState: Adding region packageindex,5B10CA26DCAEFBFF4A63DB7D0432D628,1290229869191 to setClosing list for us01-ciqps1-grid12.carrieriq.com,60020,1290493641232
>> >> >>> > 2010-11-23 07:21:20,178 INFO org.apache.hadoop.hbase.master.ServerManager: 15 region servers, 0 dead, average load 123.66666666666667
>> >> >>> > 2010-11-23 07:21:20,252 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 10.202.50.111:60020, regionname: -ROOT-,,0, startKey: <>}
>> >> >>> > 2010-11-23 07:21:20,257 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 10.202.50.111:60020, regionname: -ROOT-,,0, startKey: <>} complete
>> >> >>> > 2010-11-23 07:21:22,838 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 10.202.50.101:60020, regionname: .META.,,1, startKey: <>}
>> >> >>> > 2010-11-23 07:21:24,731 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 2086 row(s) of meta region {server: 10.202.50.101:60020, regionname: .META.,,1, startKey: <>} complete
>> >> >>> > 2010-11-23 07:21:24,731 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
>> >> >>> >
>> >> >>> > But I always got:
>> >> >>> > hbase(main):004:0> disable 'packageindex'
>> >> >>> > NativeException: org.apache.hadoop.hbase.RegionException: Retries exhausted, it took too long to wait for the table packageindex to be disabled.
>> >> >>> >
>> >> >>> > What should I do to disable the table ?
>> >> >>> >
>> >> >>> > Thanks
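
A minimal sketch of one way to find where a repeating message first shows up in a log, assuming the line layout seen in the excerpts above (timestamp, level, logger, message). The helper name `first_occurrence` and the pattern `LOG_LINE` are illustrative and not part of HBase; the sample lines are copied from the log quoted in this thread.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed layout, inferred from the excerpts in this thread:
# "2010-11-23 15:30:56,119 DEBUG some.logger.Name: message text"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+): (?P<msg>.*)$"
)


def first_occurrence(lines: Iterable[str], needle: str) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, timestamp) of the first entry whose
    message contains `needle`, or None if no entry matches."""
    for number, line in enumerate(lines, start=1):
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the layout (e.g. stack trace frames) are skipped.
        if match and needle in match.group("msg"):
            return number, match.group("ts")
    return None


if __name__ == "__main__":
    sample = [
        "2010-11-23 15:30:56,095 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: "
        "Read ZNode /hbase/master got 10.202.50.100:60000",
        "2010-11-23 15:30:56,119 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: "
        "Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: "
        "KeeperErrorCode = NoNode for /hbase/root-region-server",
        "2010-11-23 15:31:01,125 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: "
        "Failed to read: org.apache.zookeeper.KeeperException$NoNodeException: "
        "KeeperErrorCode = NoNode for /hbase/root-region-server",
    ]
    print(first_occurrence(sample, "NoNode for /hbase/root-region-server"))
```

On a real log file, passing the open file object as `lines` should work the same way; the entries just before the reported line number are usually the interesting ones.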