Date: Fri, 13 Mar 2009 18:09:56 +0800
Subject: Re: Metadata and region mismatch
From: schubert zhang
To: hbase-user@hadoop.apache.org
References: <7c962aed0903120929v58b66f45p600d0d7e93940eb6@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=00163646ce787dd1270464fd4da4

--00163646ce787dd1270464fd4da4
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

This time another region went missing, and I used close_region 'REGIONNAME'
to close it. But then all regions after this one went missing from the web
GUI, although I can still find them when I scan '.META.' :-(

Note: in this case, there is no log info from the -ROOT- table.

On Fri, Mar 13, 2009 at 1:10 AM, schubert zhang wrote:

> Thank you stack, it seems to be HBASE-1121. I will continue to track it.
> Sorry, the log files have been removed.
>
> On Fri, Mar 13, 2009 at 12:29 AM, stack wrote:
>
>> Hey Schubert:
>>
>> Just FYI, after noticing the mismatch, rather than restart the whole
>> cluster, you might try closing the single region. That can jog the master
>> into noticing it has a bad assignment. To do this, in the shell type
>> 'tools' and you'll see some admin facility.
>>
>> The root problem seems to be an issue fixed in the new hbase 0.19.1
>> release candidate: See HBASE-1121 'Cluster confused about where -ROOT- is'.
>>
>> Worrying is that even after a restart, you cannot get to the troublesome
>> region. Is it deployed on a regionserver? If so, anything pertinent in
>> the logs regards this region?
>>
>> St.Ack
>>
>> On Thu, Mar 12, 2009 at 4:31 AM, schubert zhang wrote:
>>
>> > oh, it is not fine.
>> > Now, I can find:
>> > TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
>> > <http://nd0-rack0-cloud:60010/regionhistorian.jsp?regionname=WAPCDR,13575565132@2008-12-01%2017:16:55.117,1236847258901>
>> > nd1-rack0-cloud:60020 916003194
>> > 13575565132@2008-12-01 17:16:55.117 13576301358@2008-12-08 13:57:43.163
>> >
>> > but when I try to get 13575565132@2008-12-01 17:16:55.117, nothing is
>> > returned. It seems this region is gone.
>> >
>> > On Thu, Mar 12, 2009 at 7:09 PM, schubert zhang wrote:
>> >
>> > > Hi all,
>> > > Today I encountered a new issue: a batchUpdate commit fails.
>> > >
>> > > I am running a program that inserts rows into an HBase table, but after
>> > > a long period of batchUpdating, the following exception occurs:
>> > >
>> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901, row '13575581009@2008-12-06 06:15:48.077', but failed after 10 attempts.
>> > > Exceptions:
>> > >         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:942)
>> > >         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
>> > >         at org.apache.hadoop.hbase.client.HTable.close(HTable.java:1385)
>> > >         ......
>> > >
>> > > And after waiting for a long time, I still cannot insert new data.
>> > >
>> > > Then I checked the HBase status: the master and all regionservers are
>> > > running.
>> > >
>> > > But I found a mismatch for region
>> > > "TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901".
>> > > The metadata says this region is served by 10.24.1.12, but when I
>> > > check 10.24.1.12, the region is not there.
>> > > I then stopped the whole HBase cluster and started it again. Region
>> > > locations were reassigned and everything seemed OK.
>> > >
>> > > In the log file of 10.24.1.12, I found following exceptions:
>> > >
>> > > 836118938_60020/hlog.dat.1236849158178, entries=100010. New log writer: /hbase/log_10.24.1.12_1236836118938_60020/hlog.dat.1236849168393
>> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901 in 48sec
>> > > 2009-03-12 17:12:49,298 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting split of region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
>> > > 2009-03-12 17:12:50,648 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901
>> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299/1762744366 available
>> > > 2009-03-12 17:12:50,809 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299
>> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TESTTABLE,13575590622@2008-12-16 15:49:40.143,1236849169299/1344805089 available
>> > > 2009-03-12 17:12:50,865 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TESTTABLE,13575590622@2008-12-16 15:49:40.143,1236849169299
>> > > 2009-03-12 17:29:15,495 WARN org.apache.hadoop.hbase.RegionHistorian: Unable to 'Region split from: WAPCDR,13575565132@2008-12-01 17:16:55.117,1236847258901'
>> > > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server for region , row 'TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299', but failed after 11 attempts.
>> > > Exceptions:
>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
>> > >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> > >         at java.lang.reflect.Method.invoke(Method.java:597)
>> > >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>> > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>> > >
>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2065)
>> > >         at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1546)
>> > >         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>> > >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> > >         at java.lang.reflect.Method.invoke(Method.java:597)
>> > >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>> > >         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>> > >
>> > > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: -ROOT-,,0
>> > >

--00163646ce787dd1270464fd4da4--
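
Note on reading the region names in this thread: they appear to follow the layout table,startkey,regionid (for example -ROOT-,,0 has an empty start key and id 0). When comparing what '.META.' lists against what a regionserver logs, it can help to split the names into those parts. The following is only a rough sketch in plain Python, not part of HBase: parse_region_name and RegionName are hypothetical helper names, and it assumes the table name contains no comma and the region id is the run of digits after the last comma.

```python
from typing import NamedTuple


class RegionName(NamedTuple):
    """Parts of a region name (hypothetical helper type, not an HBase class)."""
    table: str
    start_key: str
    region_id: int


def parse_region_name(name: str) -> RegionName:
    """Split 'table,startkey,regionid' into its three parts.

    Assumption: the table name ends at the first comma and the region id
    starts after the last comma, so the start key may itself contain commas.
    """
    table, first_sep, rest = name.partition(",")
    start_key, last_sep, region_id = rest.rpartition(",")
    if not first_sep or not last_sep or not region_id.isdigit():
        raise ValueError(f"not a table,startkey,regionid name: {name!r}")
    return RegionName(table, start_key, int(region_id))


# Region names copied from the log excerpt quoted in this thread.
parent = parse_region_name(
    "TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236847258901")
daughter = parse_region_name(
    "TESTTABLE,13575565132@2008-12-01 17:16:55.117,1236849169299")

# Same table and start key, different region id: the name that the client
# keeps retrying is the pre-split region, not the one opened after the split.
same_range_start = (parent.table, parent.start_key) == (daughter.table, daughter.start_key)
print(same_range_start, parent.region_id != daughter.region_id)
```

Seen this way, the client exception names region id 1236847258901, while the regionserver log shows that region being closed for a split and a new region with id 1236849169299 opening at the same start key.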