hadoop-mapreduce-user mailing list archives

From Raghava Mutharaju <m.vijayaragh...@gmail.com>
Subject Re: log files not found
Date Sat, 03 Apr 2010 04:15:17 GMT
Hi all,

      I have found the log files on the DataNodes. I checked the userlogs,
but they do not contain any exception related to the error I mentioned in my
previous email (repeated here):

10/04/01 01:04:15 INFO mapred.JobClient: Task Id :
attempt_201003240138_0110_r_000018_1, Status : FAILED
Task attempt_201003240138_0110_r_000018_1 failed to report status for 602
seconds. Killing!
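
For anyone trying to match this message to a userlogs directory, here is a minimal sketch that splits the attempt ID into its parts and builds the directory name to look for. The helper names (parse_attempt_id, userlog_dir) are made up for illustration, and the <log root>/userlogs/<attempt id> layout is an assumption about a default 0.20 setup, to be checked on the node that actually ran the attempt.

```python
import posixpath
import re
from typing import NamedTuple

# Task attempt IDs as printed by JobClient, e.g.
# attempt_201003240138_0110_r_000018_1
ATTEMPT_RE = re.compile(r"attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)")


class TaskAttempt(NamedTuple):
    jobtracker_id: str    # JobTracker start timestamp, e.g. "201003240138"
    job_number: int       # sequence number of the job on that JobTracker
    task_type: str        # "map" or "reduce"
    task_number: int      # index of the task within the job
    attempt_number: int   # 0 for the first attempt, 1 for the first retry, ...

    @property
    def job_id(self) -> str:
        # Job IDs share the first two fields of the attempt ID.
        return f"job_{self.jobtracker_id}_{self.job_number:04d}"


def parse_attempt_id(text: str) -> TaskAttempt:
    """Extract the first task attempt ID found in a log line."""
    match = ATTEMPT_RE.search(text)
    if match is None:
        raise ValueError("no task attempt ID found in: " + text)
    jt, job, kind, task, attempt = match.groups()
    task_type = "map" if kind == "m" else "reduce"
    return TaskAttempt(jt, int(job), task_type, int(task), int(attempt))


def userlog_dir(attempt_id: str, log_root: str) -> str:
    """Build the per-attempt log directory (assumed layout, see above)."""
    return posixpath.join(log_root, "userlogs", attempt_id)


if __name__ == "__main__":
    line = ("Task attempt_201003240138_0110_r_000018_1 failed to report "
            "status for 602 seconds. Killing!")
    attempt = parse_attempt_id(line)
    print(attempt.job_id, attempt.task_type, attempt.task_number,
          attempt.attempt_number)
    # "/path/to/hadoop/logs" is a placeholder, not a real path.
    print(userlog_dir("attempt_201003240138_0110_r_000018_1",
                      "/path/to/hadoop/logs"))
```

Read this way, the failing attempt is a retry of reduce task 18 of job 0110, so the directory of interest is the one named after this exact attempt ID.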

I have also run some tests with the jobs in a different order. Whichever job
runs after the 3rd job fails at reduce 99% with the above message (with
different attempt IDs, of course). I guess the number of input files does not
matter; at that point there are 130 input files (which are taken as input
for the 4th job).

I am at a loss on how to proceed with this. Happy to get any pointers :)

Thank you.

Regards,
Raghava.

On Thu, Apr 1, 2010 at 2:24 AM, Raghava Mutharaju <m.vijayaraghava@gmail.com> wrote:

> Hi all,
>
>        I am running a series of jobs one after another. While executing the
> 4th job, the job fails. It fails in the reducer --- the progress percentage
> would be map 100%, reduce 99%. It gives out the following message
>
>
> 10/04/01 01:04:15 INFO mapred.JobClient: Task Id :
> attempt_201003240138_0110_r_000018_1, Status : FAILED
> Task attempt_201003240138_0110_r_000018_1 failed to report status for 602
> seconds. Killing!
>
> It makes several attempts again to execute it but fails with similar
> message. I couldn't get anything from this error message and wanted to look
> at logs (located in the default dir of ${HADOOP_HOME}/logs). But I don't
> find any files which match the timestamp of the job. Also I did not find
> history and userlogs in the logs folder. Should I look at some other place
> for the logs? What could be the possible causes for the above error?
>
>        I am using Hadoop 0.20.2 and I am running it on a cluster with 14
> nodes.
>
> Thank you.
>
> Regards,
> Raghava.
>
