hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-156) Reducer threw IOEOFException
Date Tue, 30 May 2006 23:35:30 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-156?page=all ]

Sameer Paranjpye updated HADOOP-156:
------------------------------------

    Fix Version: 0.4
        Version: 0.3
    Description: 
A job was running with all the map tasks completed.
The reducers were appending the intermediate files into the large intermediate file.
java.io.EOFException was thrown when the record reader tried to read the version number
during initialization. Here is the stack trace:

java.io.EOFException 
    at java.io.DataInputStream.readFully(DataInputStream.java:178) 
    at java.io.DataInputStream.readFully(DataInputStream.java:152) 
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:251) 
    at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:236) 
    at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:226) 
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:205) 
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709) 

Apparently, the intermediate file was empty. I suspect that one map task
generated empty intermediate files for all the reducers, since all the reducers
failed at the same place, and failed at the same place again during retries.

Unfortunately, we cannot know which map task generated the empty files,
since the exception does not offer any clue.

One simple enhancement would be for the record reader to catch the IOException
and re-throw it with additional information, such as the file name.
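The enhancement above could look roughly like the following. This is only an illustrative sketch, not the actual SequenceFile code: the class name, method name, and the plain String file-name parameter are all hypothetical (the real reader works with a Path and FileSystem).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch: wrap the low-level version-header read so that an
// EOFException from an empty or truncated intermediate file is re-thrown
// with the file name attached, letting us trace it back to a map task.
public class NamedVersionReader {

    // 'fileName' is illustrative context only; it is not part of the read.
    public static byte[] readVersion(String fileName, DataInputStream in)
            throws IOException {
        byte[] version = new byte[4];
        try {
            // readFully throws EOFException (an IOException) if the stream
            // ends before 4 bytes are read -- e.g. an empty file.
            in.readFully(version);
        } catch (IOException e) {
            // Re-throw with the file name so the failing input is identifiable.
            throw new IOException("could not read version header of "
                                  + fileName + ": " + e);
        }
        return version;
    }
}
```

With this, an empty intermediate file would fail with a message naming the file (e.g. `could not read version header of part-00000: java.io.EOFException`) instead of a bare EOFException.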



> Reducer threw IOEOFException
> ----------------------------
>
>          Key: HADOOP-156
>          URL: http://issues.apache.org/jira/browse/HADOOP-156
>      Project: Hadoop
>         Type: Bug

>   Components: mapred
>     Versions: 0.3
>     Reporter: Runping Qi
>      Fix For: 0.4

>
> A job was running with all the map tasks completed.
> The reducers were appending the intermediate files into the large intermediate file.
> java.io.EOFException was thrown when the record reader tried to read the version number
> during initialization. Here is the stack trace:
> java.io.EOFException 
>     at java.io.DataInputStream.readFully(DataInputStream.java:178) 
>     at java.io.DataInputStream.readFully(DataInputStream.java:152) 
>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:251) 
>     at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:236) 
>     at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:226) 
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:205) 
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709) 
> Apparently, the intermediate file was empty. I suspect that one map task
> generated empty intermediate files for all the reducers, since all the reducers
> failed at the same place, and failed at the same place again during retries.
> Unfortunately, we cannot know which map task generated the empty files,
> since the exception does not offer any clue.
> One simple enhancement would be for the record reader to catch the IOException
> and re-throw it with additional information, such as the file name.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

