hadoop-mapreduce-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: Time out after 600 for YARN mapreduce application
Date Wed, 11 Feb 2015 10:31:57 GMT
Looking at the attempt ID, this is a mapper task timing out in a MapReduce job.  The configuration
that can be used to increase the timeout is 'mapreduce.task.timeout'.
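
For example, to raise the timeout to 20 minutes when submitting a job (a minimal sketch; the
job name and the 20-minute value are illustrative, and the value is in milliseconds):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Sketch: raise mapreduce.task.timeout for a single job.
    public class TimeoutExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Value is in milliseconds; the default 600000 ms is the
            // "600 secs" that appears in the error message.
            conf.setLong("mapreduce.task.timeout", 1200000L);
            Job job = Job.getInstance(conf, "timeout-example");
            // ... set mapper/reducer/input/output as usual, then job.submit() ...
        }
    }

The same setting can also be passed per job on the command line, e.g.
-D mapreduce.task.timeout=1200000, if the driver uses ToolRunner/GenericOptionsParser.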

The task times out because there is no heartbeat from the MapperTask (YarnChild) to the MRAppMaster
for 10 minutes.  Is your MR job a custom job?  If so, are you doing any operations in cleanup() of
the Mapper?  If cleanup() of the Mapper takes longer than the configured timeout, the task can be
marked as timed out.
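
For example (a minimal sketch; the class name and batch loop are illustrative), a Mapper that
does long-running work in cleanup() can call context.progress() periodically so the MRAppMaster
keeps receiving heartbeats:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Illustrative mapper whose cleanup() does long-running work but
    // reports progress so the task is not marked as timed out.
    public class SlowCleanupMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        @Override
        protected void cleanup(Context context)
                throws IOException, InterruptedException {
            for (int batch = 0; batch < 100; batch++) {
                // ... flush one batch of buffered work here ...
                context.progress(); // heartbeat: resets the timeout clock
            }
        }
    }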


Thanks & Regards
Rohith Sharma K S
From: Alexandru Pacurar [mailto:Alexandru.Pacurar@PropertyShark.com]
Sent: 11 February 2015 15:34
To: user@hadoop.apache.org
Subject: Time out after 600 for YARN mapreduce application

Hello,

I keep encountering an error when running Nutch on Hadoop YARN:

AttemptID:attempt_1423062241884_9970_m_000009_0 Timed out after 600 secs

Some info on my setup: I'm running a 64-node cluster with Hadoop 2.4.1. Each node has 4 cores,
1 disk, and 24 GB of RAM; the namenode/resourcemanager has the same specs, only with 8 cores.

I am pretty sure one of these parameters sets the threshold I'm hitting:

yarn.am.liveness-monitor.expiry-interval-ms
yarn.nm.liveness-monitor.expiry-interval-ms
yarn.resourcemanager.nm.liveness-monitor.interval-ms

but I would like to understand why.
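
To see what the cluster actually resolves these settings to, a small sketch like the following
(the class name is mine; it assumes yarn-site.xml is on the classpath, so unset keys fall back
to the yarn-default.xml values) can print the effective values on a node:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    // Print the liveness-related intervals as the node resolves them.
    public class ShowLivenessSettings {
        public static void main(String[] args) {
            Configuration conf = new YarnConfiguration();
            String[] keys = {
                "yarn.am.liveness-monitor.expiry-interval-ms",
                "yarn.nm.liveness-monitor.expiry-interval-ms",
                "yarn.resourcemanager.nm.liveness-monitor.interval-ms"
            };
            for (String key : keys) {
                System.out.println(key + " = " + conf.get(key));
            }
        }
    }

(The default for yarn.am.liveness-monitor.expiry-interval-ms is 600000 ms, i.e. the same
600 seconds as in the error message, which is why these look like candidates.)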

The issue usually appears under heavier load, and most of the time the next attempts
succeed. Also, if I restart the Hadoop cluster, the error goes away for some time.

Thanks,
Alex
