Date: Sun, 19 Jul 2009 13:29:37 -0700
Subject: Re: NSRE due to duplicate assignment (MSG_REGION_CLOSE_WITHOUT_REPORT)
From: Ryan Rawson
To: hbase-user@hadoop.apache.org

A quick recovery is to kill your master with 'kill' (not hbase-daemon.sh), then restart it.

If that doesn't work, you might have to manually delete the region server assignment in meta:

deleteall '.META.', 'TestTable,0089182778,1247979707102', 'info:server'

The master will reassign the region within 60 seconds.

Let us know!
-ryan

On Sun, Jul 19, 2009 at 1:24 PM, Haijun Cao wrote:
> Hi,
>
> I am experiencing the NSRE exception (though not all NSREs are created equal, it seems) while scanning TestTable. TestTable was previously populated with sequentialWrite 100x1M records (using the PerformanceEvaluation MapReduce job).
>
> I checked the region named in the exception and found that it is not being served because the region server is complaining about a duplicate assignment:
> MSG_REGION_CLOSE_WITHOUT_REPORT: TestTable,0089182778,1247979707102: Duplicate assignment
>
> I checked .META. for the region, and it indeed has two assignment records.
>
> Is this a bug? How can I recover the region? (I searched the archive for "duplicate assignment" and got no results.)
>
> I am on HBase trunk, hadoop-0.20.0 (plus 4681), zookeeper-3.2. The test environment has 3 machines (8 cores, 16G, 4x750G SATA disks, RAID 0). DataNode xceivers=4096, handler=50, ulimit 32768 (I followed the hbase-0.20.0-alpha overview_description religiously).
>
> Thanks in advance.
>
> Haijun
>
> 1. Exception while scanning:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.10.30.106:60020 for region TestTable,0089182778,1247979707102, row '0089182778', but failed after 10 attempts.
> Exceptions:
> org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: TestTable,0089182778,1247979707102
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2230)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1848)
>         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:643)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:913)
>
> 2. Duplicate assignments for the region in .META.:
>
> Timestamp                   Event        Description
> Sat, 18 Jul 2009 22:05:00   open         Region opened on server: snv-it-lin-012
> Sat, 18 Jul 2009 22:04:57   assignment   Region assigned to server snv-it-lin-012,60020,1247965643087
> Sat, 18 Jul 2009 22:04:54   assignment   Region assigned to server snv-it-lin-012,60020,1247965643087
> Sat, 18 Jul 2009 22:04:49   split        Region split from: TestTable,0089182778,1247904130413
>
> 3. Region server log file:
>
> [haijun@snv-it-lin-012 ~]$ grep TestTable,0089182778,1247979707102 /disk1/opt/kindsight/hbase/hbase/logs/hbase-haijun-regionserver-snv-it-lin-012.log.2009-07-18
> 2009-07-18 22:04:54,014 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
> 2009-07-18 22:04:54,015 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
> 2009-07-18 22:04:57,085 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
> 2009-07-18 22:05:00,077 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TestTable,0089182778,1247979707102/1884010304 available; sequence id is 57144455
> 2009-07-18 22:05:00,100 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
> 2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE_WITHOUT_REPORT: TestTable,0089182778,1247979707102: Duplicate assignment
> 2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE_WITHOUT_REPORT: TestTable,0089182778,1247979707102: Duplicate assignment
> 2009-07-18 22:05:03,243 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,0089182778,1247979707102
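
If more than one region is affected, the grep-and-delete steps above could be sketched roughly as follows. This is only an illustration in Python: it assumes the region server log lines look exactly like the ones quoted above, the function names (regions_closed_as_duplicates, shell_delete_command) are made up for this example, and it only prints the shell command from the reply for review rather than running anything against the cluster.

```python
import re

# Assumes the log layout quoted in this thread:
# "... MSG_REGION_CLOSE_WITHOUT_REPORT: <region name>: Duplicate assignment"
DUPLICATE_CLOSE = re.compile(
    r"MSG_REGION_CLOSE_WITHOUT_REPORT: (\S+): Duplicate assignment"
)


def regions_closed_as_duplicates(log_lines):
    """Return region names closed for duplicate assignment, in first-seen order."""
    seen = []
    for line in log_lines:
        match = DUPLICATE_CLOSE.search(line)
        # The plain and "Worker:" log lines both match, so skip repeats.
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def shell_delete_command(region_name):
    """Format the HBase shell command from the reply for one region."""
    return "deleteall '.META.', '%s', 'info:server'" % region_name


if __name__ == "__main__":
    sample = [
        "2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver."
        "HRegionServer: MSG_REGION_CLOSE_WITHOUT_REPORT: "
        "TestTable,0089182778,1247979707102: Duplicate assignment",
        "2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver."
        "HRegionServer: Worker: MSG_REGION_CLOSE_WITHOUT_REPORT: "
        "TestTable,0089182778,1247979707102: Duplicate assignment",
    ]
    for region in regions_closed_as_duplicates(sample):
        print(shell_delete_command(region))
```

Region names whose row key contains whitespace would not match this pattern; the sketch does not try to handle them.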