From: schubert zhang
To: hbase-user@hadoop.apache.org
Subject: Re: Metadata and region mismatch
Date: Fri, 13 Mar 2009 01:10:35 +0800
Mailing-List: contact hbase-user-help@hadoop.apache.org; run by ezmlm
In-Reply-To: <7c962aed0903120929v58b66f45p600d0d7e93940eb6@mail.gmail.com>
References: <7c962aed0903120929v58b66f45p600d0d7e93940eb6@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e64b076afdf0b00464ef0f44

--0016e64b076afdf0b00464ef0f44
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Thank you, stack. It seems to be HBASE-1121; I will continue to track it.
Sorry, the log files have already been removed.

On Fri, Mar 13, 2009 at 12:29 AM, stack wrote:

> Hey Schubert:
>
> Just FYI, after noticing the mismatch, rather than restart the whole
> cluster, you might try closing the single region. That can jog the master
> into noticing it has a bad assignment. To do this, in the shell type
> 'tools' and you'll see some admin facility.
>
> The root problem seems to be an issue fixed in the new hbase 0.19.1 release
> candidate: See HBASE-1121 'Cluster confused about where -ROOT- is'.
>
> Worrying is that even after a restart, you cannot get to the troublesome
> region. Is it deployed on a regionserver? If so, anything pertinent in
> the logs regards this region?
>
> St.Ack
>
> On Thu, Mar 12, 2009 at 4:31 AM, schubert zhang wrote:
>
> > oh, it is not fine.
> > Now, I can find:
> >
> > TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
> > <http://nd0-rack0-cloud:60010/regionhistorian.jsp?regionname=WAPCDR,13575565132@2008-12-01%2017:16:55.117,1236847258901>
> > nd1-rack0-cloud:60020 916003194
> > 13575565132@2008-12-01 17:16:55.117 13576301358@2008-12-08 13:57:43.163
> >
> > but when I try to get 13575565132@2008-12-01 17:16:55.117, nothing is
> > returned. It seems this region is gone.
> >
> > On Thu, Mar 12, 2009 at 7:09 PM, schubert zhang wrote:
> >
> > > Hi all,
> > > Today I encountered a new issue: a failure to commit a batchUpdate.
> > >
> > > I am running a program that inserts rows into an HBase table, but after
> > > a long time of batchUpdating, the following exception occurs:
> > >
> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> > > region server Some server for region
> > > TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901,
> > > row '13575581009@2008-12-06 06:15:48.077', but failed after 10 attempts.
> > > Exceptions:
> > >     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:942)
> > >     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
> > >     at org.apache.hadoop.hbase.client.HTable.close(HTable.java:1385)
> > >     ......
> > >
> > > And after waiting for a long time, I still cannot insert new data.
> > >
> > > Then I checked the HBase status: the master and all regionservers are
> > > running.
> > >
> > > But I found a mismatch for region
> > > "TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901".
> > > The metadata says this region is served by 10.24.1.12, but when I
> > > checked 10.24.1.12, the region was not there.
> > > Then I stopped the whole HBase cluster and started it again. Region
> > > locations were reassigned and everything seems OK.
> > >
> > > In the log file of 10.24.1.12, I found following exceptions:
> > >
> > > 836118938_60020/hlog.dat.1236849158178, entries=100010. New log writer:
> > > /hbase/log_10.24.1.12_1236836118938_60020/hlog.dat.1236849168393
> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > compaction completed on region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901 in 48sec
> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > Starting split of region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
> > > 2009-03-12 17:12:50,648 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > Closed TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299/1762744366 available
> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > Closed TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299
> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > region TESTTABLE,13575590622@2008-12-16 15:49:40.143,1236849169299/1344805089 available
> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> > > Closed TESTTABLE,13575590622@2008-12-16 15:49:40.143,1236849169299
> > > 2009-03-12 17:29:15,495 WARN org.apache.hadoop.hbase.RegionHistorian:
> > > Unable to 'Region split from: WAPCDR,13575565132@2008-12-01 17:16:55.117,1236847258901'
> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> > > region server Some server for region , row
> > > 'TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299', but failed
> > > after 11 attempts.
> > > Exceptions:
> > > org.apache.hadoop.hbase.NotServingRegionException:
> > > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> > >
> > > org.apache.hadoop.hbase.NotServingRegionException:
> > > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
> > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
> > >     at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> > >
> > > org.apache.hadoop.hbase.NotServingRegionException:
> > > org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0

--0016e64b076afdf0b00464ef0f44--