hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Unresponsive master in Hbase 0.90.0
Date Mon, 31 Jan 2011 19:07:36 GMT
It's not in the manual yet, Vidhya.  Assignment has completely changed
in 0.90.  We no longer assign by adding payload to the heartbeat.
Now we assign by direct rpc from master to regionserver, with the master
and regionserver moving the region through state changes up in zk
until the region is successfully opened (OFFLINE->OPENING->OPENED -- with
OPENING re-written a few times so the master knows the open is still in
progress and doesn't intercede when operations are taking too long).

Let me explain more:

In the new master, on fresh startup (as opposed to a master joining an
already running cluster), after waiting on regionserver check-in and
having assigned root and meta, we take a distinct 'fresh startup'
code path.  We scan .META. for all entries.  We then give all entries
to the load balancer, which produces an assignment plan keyed per
regionserver.  We then fire up a little executor service that runs a
bounded number of threads concurrently.  Each running thread manages
the bulk assign of regions to a particular regionserver.
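
In outline it's something like this -- a sketch only, with made-up
class and method names, not what's actually in the master:

  import java.util.List;
  import java.util.Map;
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import java.util.concurrent.TimeUnit;

  public class BulkAssignSketch {
    // plan: regionserver -> the regions the balancer wants opened there.
    static void bulkAssign(Map<String, List<String>> plan)
        throws InterruptedException {
      ExecutorService pool = Executors.newFixedThreadPool(4);  // bounded concurrency
      for (Map.Entry<String, List<String>> e : plan.entrySet()) {
        String server = e.getKey();
        List<String> regions = e.getValue();
        pool.submit(() -> {
          sendOpenRegions(server, regions);  // one rpc carrying the whole list
          waitUntilAllOpened(regions);       // block until zk shows them OPENED
        });
      }
      pool.shutdown();
      pool.awaitTermination(30, TimeUnit.MINUTES);
    }

    // Stubs standing in for the rpc and the zk watching; not real HBase methods.
    static void sendOpenRegions(String server, List<String> regions) { }
    static void waitUntilAllOpened(List<String> regions) { }
  }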

Assignment can now be a little slower in general because all state
transitions are mediated via zookeeper rather than in-memory in the
master.  But in this special bulk assign startup mode, we make use of
zk's async ops and do bulk state transition changes up in zk rather
than managing individual changes, so it all runs faster.  There is a
new rpc where we can dump on the RS all the regions it's to open.  ZK
timeouts during this startup phase are all extended.
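
The async usage is along these lines -- again just a sketch with
made-up names, and the real payloads are serialized region transition
data, not strings:

  import java.util.List;
  import java.util.concurrent.CountDownLatch;
  import org.apache.zookeeper.AsyncCallback;
  import org.apache.zookeeper.CreateMode;
  import org.apache.zookeeper.ZooDefs;
  import org.apache.zookeeper.ZooKeeper;

  public class AsyncOfflineSketch {
    // Put a whole batch of regions OFFLINE in zk with the async api: fire all
    // the creates, then wait once for the acks, instead of a synchronous
    // round trip per region.
    static void offlineAll(ZooKeeper zk, List<String> regionZnodes)
        throws InterruptedException {
      final CountDownLatch acked = new CountDownLatch(regionZnodes.size());
      AsyncCallback.StringCallback cb =
          (rc, path, ctx, name) -> acked.countDown();
      for (String znode : regionZnodes) {
        zk.create(znode, "OFFLINE".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.PERSISTENT, cb, null);
      }
      acked.await();  // all creates acknowledged; now do the bulk open rpcs
    }
  }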

The bulk assign thread for a regionserver stays up until all regions
have opened on that regionserver; then its executor task finishes and
the next one runs (we could be better here -- especially on a cluster
of 700 nodes).  I spent time timing this stuff and I'd say bulk assign,
even with the async zk ops, is probably slower than how we used to do
it, but not by much.
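
The 'wait until everything is opened' piece, crudely (the real master
is driven off zk watcher events rather than polling like this; names
are illustrative):

  import java.util.List;
  import org.apache.zookeeper.ZooKeeper;

  public class WaitUntilOpenedSketch {
    // Crude stand-in for the per-regionserver wait: keep re-reading each
    // region's unassigned znode until it reads OPENED.
    static void waitUntilAllOpened(ZooKeeper zk, List<String> znodes)
        throws Exception {
      for (String znode : znodes) {
        while (!"OPENED".equals(new String(zk.getData(znode, false, null)))) {
          Thread.sleep(100);
        }
      }
    }
  }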

The new master logs are very different from the old, so it might take a
while to get your head around what's going on.  Hopefully you can
avoid having to do this.

What are you seeing?

St.Ack



On Mon, Jan 31, 2011 at 10:19 AM, Vidhyashankar Venkataraman
<vidhyash@yahoo-inc.com> wrote:
> Yes, I will file an issue after collecting the right logs.
>
> We will try finding the cause of the META server choke.
>
> Another question: the master still seems to be taking (a lot of) time to load the table
> during startup: I found that the regions percheckin config variable isn't used anymore. I haven't
> looked at that part of the code yet, but what is now the master's part in assigning regions
> in 0.90? (Can you let me know if they are explained in the Hbase docs in the release?)
>
> Thank you
> Vidhya
>
> On 1/31/11 10:06 AM, "Stack" <stack@duboce.net> wrote:
>
> On Mon, Jan 31, 2011 at 9:54 AM, Vidhyashankar Venkataraman
> <vidhyash@yahoo-inc.com> wrote:
>> The Hbase cluster doesn't have the master problems with hadoop-append turned on:
>> we will try finding out why it wasn't working with a non-append version of hadoop (with a
>> previous version of hadoop, it was getting stuck while splitting logs).
>>
>
> I'd say don't bother Vidhya.  You should run w/ append anyways.
> Meantime file an issue if you don't mind and dump in there your data.
> We need to take care of this so others don't trip over what you saw.
> I'm sure that plenty of users will innocently try to bring up 0.90.0
> on an Hadoop w/o append.
>
>> But there are other issues now (with append turned on) which we are trying to resolve.
>> The region server that's hosting the META region is getting choked after a table was loaded
>> with around 100 regions per server (this is likely the target load that we wanted; it worked
>> in 0.89 with the same number of nodes, and Hbase 0.90 worked fine with 40 nodes, which is why
>> I started straight with this number). The node can be pinged, but is not accessible through
>> ssh, and I am unable to perform most hbase operations on the cluster as a result.
>>
>>   Can the RS hosting META be a potential bottleneck in the system at all? (I will
>> try shutting down that particular node and see what happens).
>>
>
> At 700-node scale, it's quite possible we're doing something dumb.
> Any data you can glean to help us here would be appreciated.  I'd have
> thought that 0.90.0 would put less load on .META. since we've removed
> some of the reasons for .META. access.
>
> St.Ack
>
>
