In-Reply-To: <396674.10316.qm@web59902.mail.ac4.yahoo.com>
Date: Tue, 14 Jul 2009 16:09:34 -0700
Message-ID: <78568af10907141609g6b0de0e8k9ccaebc0e6babdf6@mail.gmail.com>
Subject: Re: java.io.IOException: HRegionInfo was null or empty in .META.
From: Ryan Rawson
To: hbase-dev@hadoop.apache.org

It's a known problem with the .META. table involving old deleted rows and new rows. There is a brief summary here: http://issues.apache.org/jira/browse/HBASE-1383

I believe that we are making truncate in the shell actually issue the major_compact itself too now.

On Tue, Jul 14, 2009 at 3:40 PM, Haijun Cao wrote:
> Ryan,
>
> It worked. Thank you!
>
> I have used the truncate command on TestTable before, but I then decided to drop the table completely and start the test from scratch. This problem seems to happen after the hbase server crashed. I experienced both hbase-1634/HBASE-1638 and this count issue. Running the script in 1638 resolved the NPE problem (Thanks stack), and running flush/major_compact on the meta table resolved the count issue.
>
> For my education, do you mind explaining the cause/fix for this count issue I am seeing?
>
> Thanks
> Haijun
>
> ________________________________
> From: Ryan Rawson
> To: hbase-dev@hadoop.apache.org
> Sent: Tuesday, July 14, 2009 3:14:33 PM
> Subject: Re: java.io.IOException: HRegionInfo was null or empty in .META.
>
> Try doing this on the shell:
>
> flush '.META.'
> major_compact '.META.'
>
> Did you use the 'truncate' command at any point?
>
> On Tue, Jul 14, 2009 at 3:10 PM, Haijun Cao wrote:
>>
>> Hi
>>
>> I am running the hbase PE test and loaded TestTable with 10M records without any problem. Later hbase crashed (during another sequentialWrite test); after restart, I can't count TestTable: it always gets stuck at 1.1 mil records.
>>
>> I am wondering if anybody has encountered the same problem (data got corrupted after server crash)? How do you recover? At this point, it is just test data, so it is ok for me to lose some data. Is there a way to drop/repair the bad region (0001192603)? Does hbase have an auto-repair tool like fsck?
>>
>> Your help is greatly appreciated.
>>
>> Haijun
>>
>>
>> Error stack:
>>
>> Current count: 1100000, row: 0001099999
>> NativeException: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '0001192603', but failed after 5 attempts.
>> Exceptions:
>> java.io.IOException: HRegionInfo was null or empty in .META.
>> java.io.IOException: HRegionInfo was null or empty in .META.
>> java.io.IOException: HRegionInfo was null or empty in .META.
>> java.io.IOException: HRegionInfo was null or empty in .META.
>> java.io.IOException: HRegionInfo was null or empty in .META.
>>
>>        from org/apache/hadoop/hbase/client/HTable.java:2083:in `hasNext'
>>        from sun.reflect.GeneratedMethodAccessor9:-1:in `invoke'
>>        from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>>        from java/lang/reflect/Method.java:597:in `invoke'
>>        from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>>        from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>>        from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>>        from org/jruby/runtime/callsite/CachingCallSite.java:70:in `call'
>>        from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>>        from org/jruby/ast/WhileNode.java:127:in `interpret'
>>        from org/jruby/ast/NewlineNode.java:104:in `interpret'
>>        from org/jruby/ast/BlockNode.java:71:in `interpret'
>>        from org/jruby/internal/runtime/methods/InterpretedMethod.java:163:in `call'
>>        from org/jruby/internal/runtime/methods/DefaultMethod.java:144:in `call'
>>        from org/jruby/runtime/callsite/CachingCallSite.java:273:in `cacheAndCall'
>>        from org/jruby/runtime/callsite/CachingCallSite.java:112:in `call'
>>
>>
>> I scanned the .META. table and noticed that region 0001192603 has no info columns, only historian columns. I also checked the hadoop file system: region 0001192603 has an oldlogfile.log file in its directory (other regions don't):
>>
>> Found 3 items
>> -rw-r--r--   3 bamboo supergroup        619 2009-07-13 16:28 /user/bamboo/hbase/TestTable/954997373/.regioninfo
>> drwxr-xr-x   - bamboo supergroup          0 2009-07-13 16:28 /user/bamboo/hbase/TestTable/954997373/info
>> -rw-r--r--   3 bamboo supergroup   39417817 2009-07-13 16:52 /user/bamboo/hbase/TestTable/954997373/oldlogfile.log
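
The symptom in the original report (a .META. row that carries only historian columns and no info columns) can be illustrated with a minimal Python sketch. It is not the HBase client API: the dictionary layout, the function name rows_missing_regioninfo, the region timestamps, and the historian column names are all hypothetical, chosen only to mirror the description in the report.

```python
# Hypothetical, simplified model of a '.META.' scan result:
# each row key maps to the set of column names present in that row.
# This is an illustration only, not the HBase client API.

def rows_missing_regioninfo(meta_rows):
    """Return, in sorted order, the row keys that lack an 'info:regioninfo' column."""
    return sorted(
        row for row, columns in meta_rows.items()
        if "info:regioninfo" not in columns
    )


# Made-up sample data: the second row mimics the reported region, which
# had historian columns but no info columns.
sample = {
    "TestTable,0001099999,1247520000000": {
        "info:regioninfo", "info:server", "historian:open",
    },
    "TestTable,0001192603,1247527680000": {
        "historian:open", "historian:split",
    },
}

for row in rows_missing_regioninfo(sample):
    print("suspect row:", row)
```

A row flagged this way corresponds to the "HRegionInfo was null or empty in .META." error in the trace; in this thread the fix was the flush and major_compact of '.META.' suggested in the reply, not a manual edit of the row.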