hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: MapReduce combiner issue : EOFException while reading Value
Date Mon, 11 Jun 2012 02:33:06 GMT
Hi Arpit,

Can you send across a reproducible test case we can look at? I believe
this to be a user code issue rather than a framework issue, but having
a test case helps confirm/deny that.
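
A minimal sketch of what such a test case might look like, assuming the problem sits in the value class's serialization. SampleValue and RoundTripCheck are hypothetical names, not the classes from this job, and the sketch uses only java.io so it runs without Hadoop on the classpath. It mirrors the write/readFields contract of a Writable and checks that readFields consumes exactly the bytes that write produced, even when the target instance already holds stale data (Hadoop reuses value objects while deserializing). A readFields that reads more than write emitted fails with the same EOFException seen in the trace below.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical value type, NOT the poster's ValueCollection. It follows the
// same write/readFields contract as a Hadoop Writable but uses only java.io.
class SampleValue {
    final List<String> names = new ArrayList<>();

    void write(DataOutput out) throws IOException {
        out.writeInt(names.size());      // length prefix goes first
        for (String n : names) {
            out.writeUTF(n);
        }
    }

    void readFields(DataInput in) throws IOException {
        names.clear();                   // the instance may be reused, so reset state
        int size = in.readInt();         // must mirror write() field for field
        for (int i = 0; i < size; i++) {
            names.add(in.readUTF());
        }
    }
}

class RoundTripCheck {
    // Serializes 'original', reads it back into 'target' (which may hold stale
    // data), and returns true only if every byte was consumed and the contents
    // match. An IOException such as EOFException counts as a failure.
    static boolean roundTrips(SampleValue original, SampleValue target) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));

            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            target.readFields(in);
            return in.available() == 0 && target.names.equals(original.names);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SampleValue v = new SampleValue();
        v.names.add("a");
        v.names.add("b");

        SampleValue reused = new SampleValue();
        reused.names.add("stale");       // simulate an instance reused by the framework

        System.out.println(roundTrips(v, reused)); // prints "true"
    }
}
```

Running the same check against the real value classes, one per key type, should show quickly whether any of them writes and reads a different number of fields.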

On Mon, Jun 4, 2012 at 5:35 PM, Arpit Wanchoo <Arpit.Wanchoo@guavus.com> wrote:
> Hi
> I have been trying to set up a MapReduce job with Hadoop.
> Scenario :
> My mapper writes key-value pairs, where I have 13 types of keys in total and
> corresponding value classes.
> For each input record I write all of these, i.e. 13 key-value pairs, to the context.
> My combiner and reducer do the same thing.
> Issue :
> My job runs fine when I don't use a combiner.
> But when I run it with a combiner, I get an EOFException.
> java.io.EOFException
>        at java.io.DataInputStream.readUnsignedShort(Unknown Source)
>        at java.io.DataInputStream.readUTF(Unknown Source)
>        at java.io.DataInputStream.readUTF(Unknown Source)
>        at com.guavus.mapred.common.collection.ValueCollection.readFieldsLong(ValueCollection.java:40)
>        at com.guavus.mapred.common.collection.ValueCollection.readFields(ValueCollection.java:21)
>        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>        at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
>        at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1420)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1435)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:852)
>        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1343)
> My Finding :
> On checking and debugging, I found that the combiner reads the key successfully,
> but while trying to read the values it throws EOFException because it finds nothing
> in the DataInput stream. Also, this occurs when the data is large and the combiner runs more than
> I have noticed that the combiner fails to get the value for this key when running
> for the 2nd time. (I read somewhere that the combiner begins once some amount of data has
> been written by the mapper, even though the mapper is still writing data to the context.)
> I verified many times that my mapper writes no null values. The issue looks really
> strange because the combiner is able to read the key but doesn't get any value in the data stream.
> The issue seems to be with the combiner, as the job runs fine when I don't use a combiner.
> I also tried setting the combiner class to the same class as my reducer, but
> the issue still occurred.
> Please suggest what could be the root cause for this or what can I do to track the root
> Regards,
> Arpit Wanchoo

Harsh J
