hadoop-mapreduce-user mailing list archives

From Leo Leung <lle...@ddn.com>
Subject RE: Machine hangs from time to time
Date Thu, 15 Aug 2013 02:14:56 GMT
I doubt this is related to Hadoop or Java code, since you mention there is no
keyboard or console response and it happens only on specific DNs.

You may want to enable or check Linux abrtd (a base Linux tool) to help troubleshoot
system-level crashes (if any).

Find out if this is related to hardware, such as a thermal dissipation problem (running too
hot).
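
If it helps, one way to look for a trace is to scan the system logs from just before a machine stopped responding. Below is a minimal sketch in Python, not a definitive tool. The function name `find_suspect_lines` and the pattern list are my own illustrative assumptions, based on messages the Linux kernel commonly prints for panics, lockups, machine-check (hardware) errors, thermal throttling and out-of-memory kills. A hard hang may leave nothing in the logs at all, so an empty result does not rule out hardware.

```python
import re

# Illustrative patterns only (an assumption, not an exhaustive list).
# Order matters: the first matching label wins for each line.
SUSPECT_PATTERNS = {
    "kernel panic": re.compile(r"kernel panic", re.IGNORECASE),
    "soft lockup": re.compile(r"soft lockup", re.IGNORECASE),
    "kernel oops/bug": re.compile(r"\b(Oops|BUG):"),
    "machine check (hardware)": re.compile(r"machine check|\bmce:", re.IGNORECASE),
    "thermal": re.compile(r"temperature above threshold|thermal", re.IGNORECASE),
    "out of memory": re.compile(r"out of memory", re.IGNORECASE),
}


def find_suspect_lines(lines):
    """Return (line_number, label, text) for each log line matching a pattern."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label, line.rstrip("\n")))
                break  # report each line once, under its first matching label
    return hits
```

On Ubuntu the kernel messages normally end up in /var/log/syslog and /var/log/kern.log, so you could pass an open file to the function, e.g. `find_suspect_lines(open("/var/log/kern.log", errors="replace"))`, and look at the last hits before the timestamp gap that marks the hang.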

Hope this helps.
Good luck.


From: Chun-fan Ivan Liao [mailto:ivan@ivangelion.tw]
Sent: Wednesday, August 14, 2013 6:35 PM
To: user@hadoop.apache.org
Subject: Machine hangs from time to time

Hi,

We are using Hadoop 1.0.3 on Ubuntu 12.04.2 LTS. Hadoop servers include 1 NN/JT, 1 SNN/DN
& several DNs.

From time to time, some of the servers just hang: they cannot be pinged, the screen goes
black, they do not respond to keyboard input and they lose connection with the NN. Lately,
one DN hung even when there was no job to run. The unresponsiveness does not happen on all
machines. It usually happens on several specific DNs.

How can we tackle this problem? Does the system leave a trace when it crashes/hangs?

Any help would be greatly appreciated.

