hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Question on region server/data node restart
Date Tue, 24 Feb 2009 15:07:01 GMT
Correcting myself: "no waiting time" refers only to the time needed to figure
out that the node is dead. The client will still have to fetch the region
location from META.
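
To illustrate the lookup path described in this thread, here is a minimal
sketch in Python. It is not HBase client code: the names `Client`, `locate`,
`call` and `RegionServerDown` are made up for the example, a plain dict stands
in for the META table, and a set stands in for the reachable region servers.
It only shows the idea that a stale cached location is dropped and META is
read again.

```python
class RegionServerDown(Exception):
    """Raised when no reachable server is found for a region (hypothetical)."""


class Client:
    def __init__(self, meta, live_servers):
        self.meta = meta            # stand-in for META: region name -> server name
        self.live = live_servers    # stand-in for the set of reachable servers
        self.cache = {}             # client-side region location cache
        self.meta_lookups = 0       # counts how often META had to be consulted

    def locate(self, region):
        # Use the cached location if there is one; otherwise read META.
        if region not in self.cache:
            self.meta_lookups += 1
            self.cache[region] = self.meta[region]
        return self.cache[region]

    def call(self, region, max_retries=3):
        for _ in range(max_retries):
            server = self.locate(region)
            if server in self.live:
                return server       # the request is served by this server
            # Stale location: drop it so the next attempt re-reads META.
            del self.cache[region]
        # META still points at a dead server after all retries.
        raise RegionServerDown(region)
```

If META already points at the new server, the second attempt succeeds after a
single extra META lookup. If META was not updated yet, every attempt reads the
same dead location again, which is the "retry all of that" case.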

J-D


On Tue, Feb 24, 2009 at 10:02 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Well, if a region server dies instead of being cleanly shut down, it takes
> in the worst case 180 seconds (a region server lease length) before the
> Master reassigns the regions. Clients trying to connect to that server will
> take, IIRC, 10 seconds to figure out the node is down; then the time to
> communicate with ROOT and META is under 1 sec. If META wasn't updated yet,
> the client will retry all of that.
>
> In the next release (0.20.0), the Master is notified by ZooKeeper within
> seconds of a region server's death and will proceed to reassign the
> regions immediately.
>
> If the client doesn't have the region in cache and META is updated with the
> region server death, there will be no waiting time.
>
> J-D
>
>
> On Tue, Feb 24, 2009 at 9:49 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
>
>> Thanks, now it is clear.
>>
>> However, if a region server is down, it takes a lot of time to retry first,
>> then to rescan the META region when the retries fail, rescan ROOT, etc., to
>> eventually get to another region server, which will handle the request.
>> Is that correct?
>>
>> On Tue, Feb 24, 2009 at 4:36 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> > This is why we have a META table: it holds the location info. See
>> > http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture#client
>> >
>> > J-D
>> >
>> > On Tue, Feb 24, 2009 at 9:28 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
>> >
>> >> Thanks, Jean-Daniel.
>> >>
>> >> I did run hbase-daemon stop regionserver and start regionserver
>> >> and saw the client retrying to connect to the restarted region server.
>> >>
>> >> How does it know to connect to another region server? Maybe it stops
>> >> retrying, asks the master, and gets another region server to connect to.
>> >> Is that correct?
>> >>
>> >> Thank you for your cooperation,
>> >> M.
>> >>
>> >> On Tue, Feb 24, 2009 at 3:56 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >> > Michael,
>> >> >
>> >> > Regarding stopping those nodes, do it using hadoop-daemon/hbase-daemon to
>> >> > stop them cleanly. Requests from the clients will not "fail"; they will
>> >> > simply be told to look elsewhere for the regions they have in cache.
>> >> > Unless you only have 1 region server...
>> >> >
>> >> > Regarding starting the nodes: no, nothing is needed apart from the usual
>> >> > hadoop-daemon/hbase-daemon.
>> >> >
>> >> > J-D
>> >> >
>> >> > On Tue, Feb 24, 2009 at 8:50 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
>> >> >
>> >> >> Hi, all
>> >> >>
>> >> >> As I understand, I can stop a region server and a data node in a cluster
>> >> >> "semi-transparently" for clients, i.e. the requests handled by the
>> >> >> region server at that time will fail, but the cluster will keep working.
>> >> >>
>> >> >> If I start the data node and region server, I do not have to do anything
>> >> >> to make them work.
>> >> >>
>> >> >> Is that correct?
>> >> >>
>> >> >> Thank you for your cooperation,
>> >> >> M.
>> >> >>
>> >> >
>> >>
>> >
>>
>
>
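
The figures quoted in this thread (a 180 second lease, roughly 10 seconds for
a client to notice a dead server, under 1 second for the ROOT/META lookup) can
be combined into a rough estimate of how long a client keeps retrying after an
unclean region server death. The sketch below is only a simplified model, not
a description of the actual client: the function name and parameters are
hypothetical, it assumes the client starts retrying at the moment the server
dies, and it ignores any pause the client may insert between retries.

```python
import math


def rounds_until_relocated(reassign_after, detect=10.0, lookup=1.0):
    """Estimate retry rounds and elapsed seconds until a client finds the region.

    reassign_after: seconds until META points at the new server
                    (180 in the worst case quoted in the thread).
    detect:         seconds for the client to notice the server is down.
    lookup:         seconds to consult ROOT and META (upper bound).
    """
    round_cost = detect + lookup
    # The client succeeds on the first round that ends at or after the
    # moment META is updated; it always needs at least one round.
    rounds = max(1, math.ceil(reassign_after / round_cost))
    return rounds, rounds * round_cost


if __name__ == "__main__":
    # Worst case from the thread: lease expiry after 180 seconds.
    print(rounds_until_relocated(180))
    # Near-immediate reassignment, as described for 0.20.0.
    print(rounds_until_relocated(0))
```

Under these assumptions the worst case is dominated by the lease length, which
is why being notified of the death within seconds matters far more than the
cost of the META lookup itself.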
