hadoop-mapreduce-user mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: MapRed Job Completes; Output Ceases Mid-Job
Date Thu, 08 Oct 2009 14:01:18 GMT
Are you perhaps creating a large number of files and running out of file
descriptors in your tasks?
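
One rough way to check this from inside a task JVM is sketched below. It is
only an illustration, not something taken from your job: the class name
FdCheck and the method descriptorCounts are made up for this example, and it
assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the
com.sun.management.UnixOperatingSystemMXBean interface is available. On other
JVMs the method simply returns null.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCheck {
    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or null when the JVM does not expose them (assumption: only
     * Unix-like HotSpot/OpenJDK JVMs implement UnixOperatingSystemMXBean).
     */
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

If you log these two numbers periodically from the reducer, an open count
climbing toward the max around the halfway point would support the theory. A
count that stays low would point elsewhere.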

On Wed, Oct 7, 2009 at 1:52 PM, Geoffry Roberts
<geoffry.roberts@gmail.com> wrote:

> All,
>
> I have a MapRed job that ceases to produce output about halfway through.
> The obvious question is why?
>
> This job reads a file and uses MultipleTextOutputFormat to generate output
> files named with the output key.  At about the halfway point, the job
> continues to create files, but they are all of zero length.  I've worked
> with this input file extensively and I know it actually contains the
> required data and that it is clean, or at least it was when I copied it in.
>
> My first impulse was to check for a full disk, but there seems to be ample
> free space.
>
> This doesn't appear to have anything to do with my code.
>
> stderr is full of the following entry:
>
> java.io.EOFException
> 	at java.io.DataInputStream.readByte(DataInputStream.java:250)
> 	at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
> 	at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
> 	at org.apache.hadoop.io.Text.readString(Text.java:400)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2762)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
>
> syslog for the reducer starts filling up with the following at what could
> indeed be the halfway point:
>
> 2009-10-07 11:27:50,874 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:27:50,916 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1693260904457793456_3495
> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:27:56,919 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7536254999085848659_3495
> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:28:02,921 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7513223558440754487_3495
> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2009-10-07 11:28:08,924 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_2580888829875117043_3495
> 2009-10-07 11:28:14,965 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2781)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
> 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)
>


-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
