hadoop-common-user mailing list archives

From Иван <...@mail.ru>
Subject Timeouts at reduce stage
Date Fri, 29 Aug 2008 10:06:06 GMT
From time to time I see a large drop in performance while running some MR jobs.
The cause is easy to spot: according to the JobTracker's web interface, some tasks have
failed.
A record reporting such a failure usually looks like this (it typically appears during the
reduce stage):
"Task task_200808270610_0085_m_000242_0 failed to report status for 600 seconds. Killing!"

It does not seem to be tied to any particular type of job; it shows up from time to time
with different ones. When it does happen, the job's execution time becomes several times
longer and the job usually ends up failing. Changing configuration options such as
mapred.task.timeout generally only makes the job die faster and does not actually cure
the problem.
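
For reference, the 600 seconds in the message quoted above corresponds to the default value of mapred.task.timeout, which is specified in milliseconds. A minimal sketch of an override is shown here; the file it goes into (hadoop-site.xml on a cluster of this vintage) and the value of 1200000 are assumptions chosen for illustration, not settings taken from the cluster in question.

```xml
<!-- Sketch only: assumed to live in hadoop-site.xml; the value is illustrative. -->
<property>
  <name>mapred.task.timeout</name>
  <!-- Milliseconds a task may go without reporting status before it is killed.
       The default is 600000 (600 seconds); 1200000 doubles it. -->
  <value>1200000</value>
</property>
```

Raising the value only gives a silent task more time before it is killed, and lowering it shortens that time; neither addresses why the task stopped reporting status in the first place.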

Does anyone have suggestions about what could cause this behavior in the MapReduce framework,
or has anyone already run into the same problem?

Thanks!

Ivan Blinkov

