hbase-user mailing list archives

From Raghu Angadi <ang...@gmail.com>
Subject Re: region goes missing on rs (may be during reassignment)
Date Fri, 13 May 2011 22:01:54 GMT
Thanks, Stack. I greatly appreciate the help.

hbase.regionserver.handler.count is set to 30.
We have not set hbase.master.assignment.timeoutmonitor.timeout. We will
increase it to 180 seconds, as HBASE-3846 does.

The load on the cluster is low to moderate and HBase holds up pretty well.
Most of the load consists of hourly random writes to the table and
sequential scans from MR jobs.

I will send another email with the location of the full master logs.
There are many "Regions in transition timed out" messages for this region
and many others spread over time.
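
To see how these timeouts are spread across regions, something like the
following could be run over the master log. It is only a rough sketch: the
regex is modelled on the single "Regions in transition timed out" line quoted
below, the function names are made up for illustration, and it assumes each
log entry is on one line, that the log timestamps are in UTC, and that ts is
epoch milliseconds.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Modelled on the one master log line quoted in this thread (an assumption;
# other HBase versions may format the message differently).
RIT_TIMEOUT = re.compile(
    r"^(?P<logged>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.hbase\.master\.AssignmentManager: "
    r"Regions in transition timed out:\s+(?P<region>\S+)\s+"
    r"state=(?P<state>\w+), ts=(?P<ts>\d+)"
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_rit_timeout(line):
    """Return (region, state, seconds_in_state), or None if the line does not match."""
    m = RIT_TIMEOUT.match(line)
    if m is None:
        return None
    # Assumption: the log timestamp is UTC.
    logged = datetime.strptime(
        m.group("logged"), "%Y-%m-%d %H:%M:%S,%f"
    ).replace(tzinfo=timezone.utc)
    # Assumption: ts is milliseconds since the epoch.
    entered = EPOCH + timedelta(milliseconds=int(m.group("ts")))
    return m.group("region"), m.group("state"), (logged - entered).total_seconds()


def count_timeouts(lines):
    """Count timed-out transitions per region name."""
    counts = Counter()
    for line in lines:
        parsed = parse_rit_timeout(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts
```

For example, count_timeouts(open(path)) over a master log (path being wherever
the log lives) returns a Counter keyed by region name, and
parse_rit_timeout(line) gives the time a region had spent in its state when
the master gave up on it.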

Raghu.

On Fri, May 13, 2011 at 11:33 AM, Stack <stack@duboce.net> wrote:

> I see that we are timing out region assignment then assigning
> elsewhere, but the region opened anyway on first server (What do you
> have hbase.regionserver.handler.count set to?  The default is 10 which
> could mean a bunch of requests hanging out in the rpc queue before
> getting into the server to be processed).  One thing you could do is
> up your region in transition timeout.  Default is 30 seconds which if
> there is a bunch of churn may not be enough time for region assignment
> to complete -- was there churn at this time? (We up the default
> timeout in 0.90.3, see  'HBASE-3846  Set RIT timeout higher').
>
> See below for more.
>
> On Fri, May 13, 2011 at 8:19 AM, Raghu Angadi <rangadi@apache.org> wrote:
> ...
> >> > 2011-05-12 12:05:20,987 DEBUG
> >> > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened
> >> > users,61364002,1297594642368.a0bf035ac417cdd0697464f1c48f387f.
>
> The region opened successfully.
>
> But looking at the master log, 12 seconds earlier it says:
>
> >>>> 2011-05-12 12:05:08,122 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition
> timed out:  users,61364002,1297594642368.a0bf035ac417cdd0697464f1c48f387f.
> state=OPENING, ts=1305201871850
> 2011-05-12 12:05:08,122 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPENING
> for too long, reassigning
>
>
> .... and then forces it to be reassigned elsewhere (Your log from master
> stops at this point.  I'd be interested in seeing more.  Send it to me
> offline?).
>
> Thanks Raghu,
> St.Ack
>
