hadoop-common-user mailing list archives

From Peeyush Bishnoi <peeyu...@yahoo-inc.com>
Subject Re: Detect Dead DataNode
Date Mon, 29 Dec 2008 13:49:42 GMT

On Mon, 2008-12-29 at 03:35 -0800, Sandeep Dhawan wrote:

> Hi,
> 
> I have a 2-node Hadoop cluster running on Windows under Cygwin.
> When I open the web GUI to view the number of Live Nodes, it shows 2.
> But when I kill the slave node and refresh the GUI, it still shows the
> number of Live Nodes as 2.
> 
> It's only after some 20-30 minutes that the master node detects the
> failure, which is then reflected in the GUI. It then shows:
> 
> Live Node : 1
> Dead Node : 1
> 
> Also, after killing the slave datanode, if I try to copy a file from the
> local file system, it fails.
> 
> 1. Is there a way to configure the time interval after which the
> master node declares a datanode dead?

Ans: I think this can be done on the basis of the heartbeat. If the
master node does not receive a heartbeat from the datanode within the
expected time interval, it considers that node problematic.
See the parameter "dfs.heartbeat.interval" in hadoop-default.xml, which
sets how often (in seconds) each datanode sends a heartbeat.
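
To estimate how long the master waits before marking a node dead, here is
a rough sketch. It assumes (from my recollection of the NameNode code of
this era, so please verify against your version) that the timeout is
2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval. The function
name and the default values (3 seconds and 300000 ms) are my own
assumptions for illustration, not something taken from your setup.

```python
def dead_node_timeout_seconds(heartbeat_interval_s=3, recheck_interval_ms=300_000):
    """Estimate the delay before a silent datanode is declared dead.

    Assumed formula (verify against your Hadoop version):
        2 * heartbeat.recheck.interval (ms) + 10 * dfs.heartbeat.interval (s)
    The default arguments are assumed defaults, not values read from a cluster.
    """
    # Convert the heartbeat interval to milliseconds so both terms share a unit.
    timeout_ms = 2 * recheck_interval_ms + 10 * heartbeat_interval_s * 1000
    return timeout_ms / 1000.0


if __name__ == "__main__":
    # With the assumed defaults this gives 630 seconds (10.5 minutes).
    print(dead_node_timeout_seconds())
    # Lowering the recheck interval to 15 seconds gives 60 seconds.
    print(dead_node_timeout_seconds(3, 15_000))
```

If that formula holds for your version, lowering the recheck interval
should shorten the detection delay far more than changing the heartbeat
interval alone.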


> 2. Why does the file transfer fail when one of the slave nodes is dead and
> the master node is alive?
>

Ans: If one of the slave nodes is dead, the data should still be
stored as long as another slave node is alive. It would help if you
pasted the error message you got while copying the data.
