hadoop-hdfs-user mailing list archives

From "Rottinghuis, Joep" <jrottingh...@ebay.com>
Subject RE: Dananode not sending the Hearbeat messages to Namenode
Date Wed, 03 Aug 2011 13:50:02 GMT
When nodes are not reporting heartbeats, can you ssh into them?
Can they see the JT machine?
What does netstat -a show?
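If it helps, the "can they see the other machine" check can be scripted and run from an affected node. This is only a minimal sketch: the host name and port in the usage comment are placeholders I made up, not values from this cluster, so substitute whatever your configuration actually points at.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example usage (hypothetical host and port, replace with your own):
# print(can_reach("namenode.example.com", 8020))
```

A False result during one of the silent periods would point toward a connectivity or name-resolution problem; a True result suggests looking at the node itself.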


From: Rahul Das [rahul.hdpq@gmail.com]
Sent: Tuesday, August 02, 2011 11:21 PM
To: hdfs-user@hadoop.apache.org
Subject: Dananode not sending the Hearbeat messages to Namenode


I found strange behavior in my cluster. The datanodes randomly stop sending any information
(no logs are written), so the namenode thinks they are down. After some time (approx. 30 minutes)
the datanodes come back up and start behaving properly. I looked for error logs, but
the datanodes do not write any error messages during this time.

The namenode shows warnings similar to:

2011-07-28 20:59:35,275 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: PendingReplicationMonitor timed out block blk_8370263993564715002_23947922
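
To see whether these warnings line up with the silent periods, one option is to count them per hour. The following is a rough sketch only, assuming each warning sits on a single line in the namenode log and has the same shape as the line above; the function name and the log path in the usage comment are made up for illustration.

```python
import re
from collections import Counter

# Captures "YYYY-MM-DD HH" from the timestamp and the block id of the warning.
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d{3} WARN "
    r".*PendingReplicationMonitor timed out block (blk_\S+)"
)


def timeouts_per_hour(lines):
    """Count PendingReplicationMonitor timeout warnings per hour bucket."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example usage (hypothetical path):
# with open("/path/to/namenode.log") as f:
#     for hour, n in sorted(timeouts_per_hour(f).items()):
#         print(hour, n)
```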

I checked that this is not happening due to a network outage or some other process eating up the

Please help me with this.
