hadoop-common-user mailing list archives

From Hari Sreekumar <hsreeku...@clickable.com>
Subject Re: Tasks seem to fail randomly with nonzero status of 1
Date Wed, 02 Mar 2011 10:23:03 GMT
Did this happen just once, or does it happen every time? This error usually
appears when the Child processes are forcibly killed. If it was a one-off, it
is possible that someone else working on your machine at the same time killed
the processes. If it happens every time, it could be due to a lack of system
resources. Maybe the operating system (for example, the Linux out-of-memory
killer) is killing these processes because they are using too much RAM?
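
As a rough sanity check on the RAM theory, you can compare the worst-case heap
demand of the task JVMs with the physical memory on a node. The sketch below is
only an illustration: it assumes the slot counts come from
mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
and the heap size from the -Xmx flag in mapred.child.java.opts. The helper names
and the numbers in the usage part are hypothetical, not taken from the cluster
in question.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx value in MB from a JVM options string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    if unit == "g":
        return value * 1024
    if unit == "m":
        return value
    if unit == "k":
        return value / 1024
    # No suffix: the JVM interprets the number as bytes.
    return value / (1024 * 1024)


def worst_case_heap_mb(map_slots, reduce_slots, child_java_opts):
    """Heap demand in MB if every map and reduce slot runs a child JVM at full heap."""
    xmx_mb = parse_xmx_mb(child_java_opts)
    if xmx_mb is None:
        raise ValueError("no -Xmx setting found in the child JVM options")
    return (map_slots + reduce_slots) * xmx_mb


if __name__ == "__main__":
    # Hypothetical node: 8 map slots, 4 reduce slots, 1 GB heap per child, 8 GB RAM.
    demand_mb = worst_case_heap_mb(8, 4, "-Xmx1024m")
    physical_mb = 8192
    print("worst-case child heap: %d MB of %d MB physical" % (demand_mb, physical_mb))
    if demand_mb > physical_mb:
        print("task JVMs alone can exceed physical memory on this node")
```

This is a lower bound, since each JVM also uses memory outside the heap and the
TaskTracker and DataNode daemons need their own share. If the estimate is
already close to or above physical RAM, reducing the slot counts or the child
heap size would be the first thing to try.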

On Wed, Mar 2, 2011 at 3:45 PM, Marc Sturlese <marc.sturlese@gmail.com> wrote:

> Hey there,
> My cluster was working fine, but suddenly lots and lots of tasks started
> failing like this:
>
> java.lang.Throwable: Child Error
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:472)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:459)
>
> I restarted the whole cluster, but ever since it first happened, it breaks
> every time I run a job.
> Any clue or advice?
> Thanks in advance.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Tasks-seem-to-fail-randomly-with-nonzero-status-of-1-tp2612433p2612433.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>
