hadoop-mapreduce-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2369) Using TableMapper Iterable IntWritables not passed to the reducer in order put by mapper
Date Tue, 08 Mar 2011 17:52:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004090#comment-13004090 ]

Tom White commented on MAPREDUCE-2369:
--------------------------------------

MapReduce makes no guarantee about the order of the values in the reducer's iterator. In
general, records for the same key can come from different mappers at different times, which
is what you observed. Instead, have a look at secondary sort
(http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Secondary+Sort, also
SecondarySort.java in the examples) to see whether it helps with your use case.
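
To illustrate the idea behind secondary sort, here is a minimal plain-Java sketch. It
uses no Hadoop classes and only simulates the shuffle: the mapper attaches a sequence
number to each key, records are sorted by (natural key, sequence), and they are then
grouped by natural key alone. The names CompositeKey, Emitted, and reduceInputs are
hypothetical and exist only for this illustration. In an actual job these roles would
presumably be played by a custom partitioner, sort comparator, and grouping comparator
(set with Job.setPartitionerClass, Job.setSortComparatorClass, and
Job.setGroupingComparatorClass).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the secondary-sort idea; not Hadoop code.
class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus a sequence number
    // that the mapper assigns to record the order it wants to preserve.
    static final class CompositeKey {
        final String naturalKey;
        final int sequence;

        CompositeKey(String naturalKey, int sequence) {
            this.naturalKey = naturalKey;
            this.sequence = sequence;
        }
    }

    // One (key, value) pair as written by a mapper.
    static final class Emitted {
        final CompositeKey key;
        final int value;

        Emitted(CompositeKey key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Plays the role of the sort comparator: natural key first, then sequence.
    static final Comparator<Emitted> SORT_ORDER = Comparator
            .comparing((Emitted e) -> e.key.naturalKey)
            .thenComparingInt(e -> e.key.sequence);

    // Sorts the records with SORT_ORDER, then groups them by natural key only,
    // which is the job a grouping comparator would do. Each list holds the
    // values in the order one reduce call would see them.
    static Map<String, List<Integer>> reduceInputs(List<Emitted> shuffled) {
        List<Emitted> sorted = new ArrayList<>(shuffled);
        sorted.sort(SORT_ORDER);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (Emitted e : sorted) {
            groups.computeIfAbsent(e.key.naturalKey, k -> new ArrayList<>())
                  .add(e.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Records arrive in arbitrary order, as if from different mappers.
        List<Emitted> shuffled = Arrays.asList(
                new Emitted(new CompositeKey("row1", 2), 30),
                new Emitted(new CompositeKey("row2", 0), 7),
                new Emitted(new CompositeKey("row1", 0), 10),
                new Emitted(new CompositeKey("row1", 1), 20));
        System.out.println(reduceInputs(shuffled)); // {row1=[10, 20, 30], row2=[7]}
    }
}
```

The point of the design is that the ordering information travels inside the key, so
the framework's sort does the work rather than the reducer relying on arrival order.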

> Using TableMapper Iterable IntWritables not passed to the reducer in order put by mapper
> ----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2369
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2369
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.20.2
>         Environment: Cloudera VM 3.5
>            Reporter: Bob Cummins
>            Priority: Minor
>
> For mapper class:
>       class Mapper1 extends TableMapper<ImmutableBytesWritable,IntWritable>
> With reducer class:
>      class Reducer1 extends TableReducer<ImmutableBytesWritable,IntWritable, ImmutableBytesWritable>
> Iterable<IntWritable> values are usually received by the reducer in the
> order in which the mapper wrote them to the context. However, in about 5%
> of cases in my testing, that order is not maintained, and the reducer loses
> the ability to categorize a value by its position.
> Guaranteed chronological order would give the reducer a means of identifying values.
>  
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
