hadoop-common-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject reduce task failing after 24 hours waiting
Date Thu, 26 Mar 2009 02:23:29 GMT
On one of my long-running jobs (about 50-60 hours in total), all active
reduce tasks fail after 24 hours with the error message:

java.io.IOException: Task process exit with nonzero status of 255.
 at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)

Is there something in the config that I can change to stop this?

Every time, they all fail at the same moment, within 1 minute of the 24-hour mark.
This wastes a lot of resources, because the map outputs have to be downloaded and merged again.

