hadoop-mapreduce-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
Date Thu, 04 Apr 2013 23:34:16 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated MAPREDUCE-5028:

    Attachment: repro-mr-5028.patch

Arun, I uploaded a patch (repro-mr-5028.patch) with unit tests that reproduce the problem. Unfortunately,
the methods causing the integer overflow do not have the integer arithmetic stubbed out, so
the tests actually end up allocating large buffers.
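For context, this is the standard Java int-overflow pattern: offsets into the spill buffer are computed with 32-bit arithmetic, so sums of positions can silently wrap negative once io.sort.mb pushes the buffer toward 2^31 bytes. A minimal sketch of the arithmetic pattern (illustrative only, not the actual MapTask code):

{code:java}
// Illustrative only -- mirrors the kind of index arithmetic that
// overflows in MapOutputBuffer, not the actual Hadoop source.
public class IntOverflowDemo {
  public static void main(String[] args) {
    int sortmb = 1280;               // io.sort.mb
    int bufbytes = sortmb << 20;     // 1280 MB still fits in an int...
    int recordStart = bufbytes - 100;
    int recordLen = 200;
    // ...but wrapping an offset past the buffer end with int math overflows:
    int wrapped = recordStart + recordLen + bufbytes;
    System.out.println(wrapped);     // negative: silent integer overflow
    long correct = (long) recordStart + recordLen + bufbytes;
    System.out.println(correct);     // the intended value
  }
}
{code}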

I don't think we should include them, but I am sure they will come in handy for verifying that the
posted patch actually fixes the issue. The tests cover three of the four changes; the fourth is
embedded too deep to write a unit test for.

If we want to include the tests, we would have to stub out the array-index calculations into
separate methods, which doesn't look very useful either.
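By stubbing out, I mean something like the sketch below: pull the index arithmetic into a small package-private helper that a test can exercise with near-2^31 values without allocating the buffer. The names here are hypothetical, not taken from the posted patch:

{code:java}
// Hypothetical refactoring sketch -- class and method names are
// illustrative, not taken from the attached patches.
class SpillIndexMath {
  /** Distance from start to end in a circular buffer of size bufvoid. */
  static long distanceTo(long start, long end, long bufvoid) {
    return start <= end ? end - start : end - start + bufvoid;
  }
}

// A test could then probe large offsets without a 1+ GB allocation:
// assertEquals(2147483598L, SpillIndexMath.distanceTo(100L, 50L, 1L << 31));
{code}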
> Maps fail when io.sort.mb is set to high value
> ----------------------------------------------
>                 Key: MAPREDUCE-5028
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>             Fix For: 1.2.0, 2.0.5-beta
>         Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-branch1.patch,
mr-5028-trunk.patch, mr-5028-trunk.patch, mr-5028-trunk.patch, repro-mr-5028.patch
> Verified the problem exists on branch-1 with the following configuration:
> Pseudo-dist mode: 2 maps / 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280
> Run teragen to generate 4 GB data
> Maps fail when you run wordcount on this configuration with the following error: 
> {noformat}
> java.io.IOException: Spill failed
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
> 	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
> 	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> 	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
> 	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.io.EOFException
> 	at java.io.DataInputStream.readInt(DataInputStream.java:375)
> 	at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
> 	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> 	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
> 	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
> 	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
> 	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
> 	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
> {noformat}

