hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: NSRE due to duplicate assignment (MSG_REGION_CLOSE_WITHOUT_REPORT)
Date Sun, 19 Jul 2009 22:34:45 GMT
Thanks for the log, though it is not at DEBUG level.

This is a clean checkout?

What I see is a split, and then we assign out the lower half of the split
twice but never assign the top half.  We had a bug like this after
HBASE-1304 went in, but it was fixed a long time back.

The MSG_REGION_CLOSE_WITHOUT_REPORT is rare but we don't seem to be doing
the right thing when we get one.
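
If you want to check a regionserver log for the same pattern, something like
the Python sketch below might help. It is only a sketch: it assumes each log
entry sits on a single line in the format quoted further down, that region
names contain no whitespace, and the name find_duplicate_opens is made up for
illustration (it is not part of HBase). It counts the MSG_REGION_OPEN lines
logged directly by HRegionServer and skips the "Worker:" echoes.

```python
import re
from collections import Counter

# Matches the message line logged by HRegionServer itself. The "Worker:" echo
# has "Worker: " between the class name and the message, so it does not match.
OPEN_RE = re.compile(r"HRegionServer: MSG_REGION_OPEN: (\S+)$")


def find_duplicate_opens(lines):
    """Return {region_name: count} for regions told to open more than once."""
    counts = Counter()
    for line in lines:
        match = OPEN_RE.search(line.strip())
        if match:
            counts[match.group(1)] += 1
    return {region: n for region, n in counts.items() if n > 1}


# Sample entries copied from the regionserver log quoted in this thread.
SAMPLE = [
    "2009-07-18 22:04:54,014 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102",
    "2009-07-18 22:04:54,015 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0089182778,1247979707102",
    "2009-07-18 22:04:57,085 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102",
]

if __name__ == "__main__":
    print(find_duplicate_opens(SAMPLE))
    # prints {'TestTable,0089182778,1247979707102': 2}
```

To run it against a real log, pass the open file object (for example
find_duplicate_opens(open(path))) instead of SAMPLE.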

St.Ack


On Sun, Jul 19, 2009 at 3:16 PM, stack <stack@duboce.net> wrote:

> Then you would have missed this fix where edits to .META. were frozen out,
> making double assignment more likely:
>
> ------------------------------------------------------------------------
> r794867 | stack | 2009-07-16 14:29:05 -0700 (Thu, 16 Jul 2009) | 1 line
>
> HBASE-1664 Disable 1058 on catalog tables
>
> Thanks for your patience and for living on the edge/TRUNK.
>
> St.Ack
> P.S. Would be interested in your master log nonetheless
>
>
>
>
> On Sun, Jul 19, 2009 at 3:07 PM, Haijun Cao <haijuncao@ymail.com> wrote:
>
>>
>> Yes, as recent as: Jul 16 13:48
>>
>>
>> Haijun
>>
>>
>> ________________________________
>> From: stack <stack@duboce.net>
>> To: hbase-user@hadoop.apache.org
>> Sent: Sunday, July 19, 2009 2:35:48 PM
>> Subject: Re: NSRE due to duplicate assignment
>> (MSG_REGION_CLOSE_WITHOUT_REPORT)
>>
>> Are you on a recent TRUNK?  A few fixes went in end of last week that help
>> with this.
>>
>>
>> On Sun, Jul 19, 2009 at 1:24 PM, Haijun Cao <haijuncao@ymail.com> wrote:
>>
>> >
>> > I checked .META. for the region, and it indeed has two assignment
>> > records.
>> >
>> > I am wondering if this is a bug? How can I recover the region from this?
>> > (I searched the archive for "duplicate assignment" and got no results.)
>> >
>>
>> May I see the master log from around the double assignment (if you were
>> running DEBUG)?
>>
>> Yeah, it's a bug.
>>
>> Do as Ryan suggested or in shell do "close_region REGIONNAME".  It'll be
>> reassigned and then reopened elsewhere.
>>
>> St.Ack
>>
>>
>>
>> >
>> > I am on hbase trunk, hadoop-0.20.0 (plus 4681), zookeeper-3.2. The test env
>> > has 3 machines (8 cores, 16G, 4x750G SATA disks, RAID 0). DataNode
>> > xcievers=4096, handler=50, ulimit 32768 (followed the hbase-0.20.0-alpha
>> > overview_description religiously).
>> >
>> >
>> > Thanks in advance.
>> >
>> > Haijun
>> >
>> >
>> >
>> > 1. Exception while scanning:
>> >
>> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
>> > region server 10.10.30.106:60020 for region
>> > TestTable,0089182778,1247979707102, row '0089182778', but failed after 10
>> > attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: TestTable,0089182778,1247979707102
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2230)
>> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1848)
>> >        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >        at java.lang.reflect.Method.invoke(Method.java:597)
>> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:643)
>> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:913)
>> >
>> > 2. duplicate assignments for the region in .META.
>> >
>> > Timestamp                  Event       Description
>> > Sat, 18 Jul 2009 22:05:00  open        Region opened on server: snv-it-lin-012
>> > Sat, 18 Jul 2009 22:04:57  assignment  Region assigned to server snv-it-lin-012,60020,1247965643087
>> > Sat, 18 Jul 2009 22:04:54  assignment  Region assigned to server snv-it-lin-012,60020,1247965643087
>> > Sat, 18 Jul 2009 22:04:49  split       Region split from: TestTable,0089182778,1247904130413
>> >
>> > 3. Region server log file:
>> >
>> > [haijun@snv-it-lin-012 ~]$ grep TestTable,0089182778,1247979707102 /disk1/opt/kindsight/hbase/hbase/logs/hbase-haijun-regionserver-snv-it-lin-012.log.2009-07-18
>> > 2009-07-18 22:04:54,014 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
>> > 2009-07-18 22:04:54,015 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
>> > 2009-07-18 22:04:57,085 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
>> > 2009-07-18 22:05:00,077 INFO org.apache.hadoop.hbase.regionserver.HRegion: region TestTable,0089182778,1247979707102/1884010304 available; sequence id is 57144455
>> > 2009-07-18 22:05:00,100 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: TestTable,0089182778,1247979707102
>> > 2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE_WITHOUT_REPORT: TestTable,0089182778,1247979707102: Duplicate assignment
>> > 2009-07-18 22:05:03,242 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE_WITHOUT_REPORT: TestTable,0089182778,1247979707102: Duplicate assignment
>> > 2009-07-18 22:05:03,243 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,0089182778,1247979707102
>> >
>> >
>> >
>>
>>
>>
>>
>>
>
>
