hbase-user mailing list archives

From Michel Segel <michael_se...@hotmail.com>
Subject Re: When node is down
Date Mon, 25 Jun 2012 03:14:58 GMT
You can't make it notice any faster than the timeout allows; detection is governed by the timeout.
The timeout is configurable, so you can reduce it. The default is about 10 min.
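
As a rough illustration of where the ~10 min figure comes from: the sketch below assumes the HDFS NameNode marks a DataNode dead after 2 x the heartbeat recheck interval plus 10 x the heartbeat interval. The defaults used (5 min recheck, 3 s heartbeat) and the property names in the comments reflect Hadoop 1.x-era HDFS as best understood here; they are not stated in this thread, so check them against your release. The function name is hypothetical.

```python
# Sketch: estimate how long HDFS waits before declaring a DataNode dead.
# Assumption (not from this thread): timeout = 2 * recheck + 10 * heartbeat.
# Assumed Hadoop 1.x properties: heartbeat.recheck.interval (ms) and
# dfs.heartbeat.interval (seconds). Verify both against your distribution.

def datanode_dead_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Return the estimated dead-node detection timeout in milliseconds."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


if __name__ == "__main__":
    default_ms = datanode_dead_timeout_ms()
    print(f"assumed defaults: {default_ms / 60_000:.1f} min")

    # Lowering the recheck interval to 1 min shortens detection accordingly.
    tuned_ms = datanode_dead_timeout_ms(recheck_interval_ms=60_000)
    print(f"recheck = 60 s:   {tuned_ms / 60_000:.1f} min")
```

Under these assumptions the recheck interval dominates the result, so that is the setting to lower if faster detection matters. A shorter timeout also makes the cluster more likely to re-replicate blocks for a node that was only briefly unreachable.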

There shouldn't be downtime of the cluster, just the node.

Note this is for Apache. MapR is different and someone from MapR should be able to provide
details...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 22, 2012, at 8:41 AM, Tom Brown <tombrown52@gmail.com> wrote:

> Can it notice the node is down sooner? If that node is serving an active
> region (or if it's a datanode for an active region), that would be a
> potentially large amount of downtime.  With commodity hardware, and a large
> enough cluster, there will always be a machine or two being rebuilt...
> 
> Thanks!
> 
> -Tom
> 
> On Thursday, June 21, 2012, Michael Segel wrote:
> 
>> Assuming that you have an Apache release (Apache, HW, Cloudera) ...
>> (If MapR, replace the drive and you should be able to repair the cluster
>> from the console. Node doesn't go down. )
>> Node goes down.
>> 10 min later, cluster sees node down. Should then be able to replicate the
>> missing blocks.
>> 
>> Replace the disk with a new disk and rebuild the file system.
>> Bring node up.
>> Rebalance cluster.
>> 
>> That should be pretty much it.
>> 
>> 
>> On Jun 21, 2012, at 10:17 PM, David Charle wrote:
>> 
>>> What is the best practice to remove a node and add the same node back for
>>> hbase/hadoop ?
>>> 
>>> Currently, in our 10-node cluster, 2 nodes went down (bad disk, so the node is
>>> down as it's the root volume + data); we need to replace the disk and add them
>>> back. Any quick suggestions or pointers to docs for the right procedure?
>>> 
>>> --
>>> David
>> 
>> 
