hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Is it normal that a dead data node causes 3 region servers to go down?
Date Mon, 25 Jun 2012 16:06:02 GMT
On Mon, Jun 25, 2012 at 12:52 PM, Peter Naudus <pnaudus@dataraker.com> wrote:

> Is it normal that when a data node goes down it brings down 3 region
> servers with it? I was under the impression that the HBase region servers
> had some kind of failover mechanism that would prevent this. Since there
> are multiple copies of the data stored, it doesn't make sense that the
> inaccessibility of one copy causes all other copies to also become
> inaccessible.
>
>
You don't say which versions of hbase and hadoop you are running.

Usually the dfsclient just moves on to the next replica.
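
To picture what "moves on to the next replica" means on the read side, here
is a toy sketch. It is not the actual DFSClient code; the names read_block,
read_from, and ReplicaUnavailable are made up for illustration, and the
datanode addresses are placeholders. It only models reads, not the write
pipeline that the WAL uses.

```python
class ReplicaUnavailable(Exception):
    """Raised by the (hypothetical) reader when a datanode cannot be reached."""


def read_block(replicas, read_from):
    """Try each replica location in order and return the first successful read.

    replicas  -- list of datanode addresses that hold a copy of the block
    read_from -- callable(address) -> bytes; raises ReplicaUnavailable on failure

    Returns a tuple (data, skipped), where skipped lists the dead datanodes
    that were passed over before a live copy was found.
    """
    skipped = []
    for address in replicas:
        try:
            return read_from(address), skipped
        except ReplicaUnavailable:
            # This copy is unreachable: remember it and try the next one.
            skipped.append(address)
    # Only when every copy is unreachable does the read fail outright.
    raise IOError("all replicas failed: %s" % skipped)


def fake_read(address):
    """Stand-in for a datanode read: the first node is down, the rest answer."""
    if address == "dn1:50010":
        raise ReplicaUnavailable(address)
    return b"block-bytes"


data, skipped = read_block(["dn1:50010", "dn2:50010", "dn3:50010"], fake_read)
```

In this toy run the read succeeds from the second address even though the
first datanode is dead, which is why losing one datanode should not by
itself make the data unreadable.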

Looking at your logs, it's not clear what's going on.  It complains
with ClosedChannelException, which is to be expected... Then it
seems someone tried to shut down the server:

java.io.InterruptedIOException: Aborting compaction of store D in region
fpl,P.1596002_TS3600_D.1304226000,1334196560513.
4106274c5a8852493fc20d2e50a7e428. because user requested stop.

Then, we are trying to close the WAL but it fails:

12/06/21 12:39:24 ERROR hdfs.DFSClient: Exception closing file /hbase/.logs/
xxxxxxpnb003.dataraker.net,60020,1338412070372/xxxxxxpnb003.dataraker.net
%3A60020.1340272250127 : java.io.IOException: Error Recovery for block
blk_-8250761279849076686_833850 failed because recovery from primary
datanode 10.128.204.129:50010 failed 6 times. Pipeline was
xxx.xxx.xxx.xxx:50010. Aborting...

Are you running an old hadoop, perhaps one that does not support hbase?

Yours,
St.Ack
