hadoop-common-dev mailing list archives

From "Sanjay Dahiya (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-156) Reducer threw IOEOFException
Date Mon, 18 Sep 2006 12:12:23 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-156?page=comments#action_12435464 ] 
            
Sanjay Dahiya commented on HADOOP-156:
--------------------------------------

OK, the EOFException in the case I posted earlier is not a critical issue. It happened after
the job was aborted on the job tracker: some task trackers cleaned up their map outputs while
reduce tasks were still running on others.

However, there are also genuine EOFExceptions that occur in the sort phase of reduce tasks,
which may be due to malformed map output. For example:

2006-09-15 08:04:13,900 INFO org.apache.hadoop.mapred.TaskTracker: task_0001_r_000511_0 0.33333334% reduce > sort

2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0 06/09/15 08:04:13 WARN mapred.TaskTracker: Error running child
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0 java.io.EOFException
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at java.io.DataInputStream.readFully(DataInputStream.java:178)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at java.io.DataInputStream.readFully(DataInputStream.java:152)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:952)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:937)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:928)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Sorter$SortPass.run(SequenceFile.java:1594)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:1523)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:1496)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:240)
2006-09-15 08:04:13,996 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000511_0  at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1165)


> Reducer  threw IOEOFException
> -----------------------------
>
>                 Key: HADOOP-156
>                 URL: http://issues.apache.org/jira/browse/HADOOP-156
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.3.0
>            Reporter: Runping Qi
>
> A job was running with all the map tasks completed.
> The reducers were appending the intermediate files into the large intermediate file.
> java.io.EOFException was thrown when the record reader tried to read the version number
> during initialization. Here is the stack trace:
> java.io.EOFException 
>     at java.io.DataInputStream.readFully(DataInputStream.java:178) 
>     at java.io.DataInputStream.readFully(DataInputStream.java:152) 
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:251) 
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:236) 
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:226) 
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:205) 
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709) 
> Apparently, the intermediate file was empty. I suspect that one map task
> generated empty intermediate files for all the reducers, since all the reducers
> failed at the same place, and failed at the same place during retries.
> Unfortunately, we cannot know which map task generated the empty files,
> since the exception does not offer any clue.
> One simple enhancement is that the record reader should catch IOException and re-throw it
> with additional information, such as the file name.
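
The enhancement suggested in the description could look roughly like the sketch below. It is not the SequenceFile code and not a proposed patch: the class HeaderReadSketch, the method readHeader and the 4-byte header length are hypothetical names and values chosen for illustration, and only java.io is used so the example runs on its own. It shows the idea of catching the IOException raised while reading a file header and re-throwing it with the file name and length attached, keeping the original exception as the cause.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class HeaderReadSketch {
    // Hypothetical header size, for illustration only; not the real SequenceFile layout.
    static final int HEADER_LENGTH = 4;

    // Reads the first HEADER_LENGTH bytes of the file. Any IOException
    // (including the EOFException thrown for an empty or truncated file)
    // is re-thrown with the file path and length in the message.
    static byte[] readHeader(File file) throws IOException {
        byte[] header = new byte[HEADER_LENGTH];
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            in.readFully(header);
            return header;
        } catch (IOException e) {
            throw new IOException("Failed to read header of " + file.getAbsolutePath()
                    + " (length " + file.length() + " bytes)", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // An empty temporary file stands in for an empty intermediate file.
        File empty = File.createTempFile("empty-map-output", ".seq");
        empty.deleteOnExit();
        try {
            readHeader(empty);
        } catch (IOException e) {
            // The message now names the offending file; the cause is the EOFException.
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause());
        }
    }
}
```

With a message like this in the reduce task log, the empty file could be traced back to the map task that produced it.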

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
