hadoop-common-user mailing list archives

From Owen O'Malley <omal...@apache.org>
Subject Re: I am attempting to use setOutputValueGroupingComparator as a secondary sort on the values
Date Tue, 28 Oct 2008 15:38:36 GMT

On Oct 28, 2008, at 7:53 AM, David M. Coe wrote:

> My mapper is Mapper<LongWritable, Text, IntWritable, IntWritable> and my
> reducer is the identity.  I configure the program using:
> conf.setOutputKeyClass(IntWritable.class);
> conf.setOutputValueClass(IntWritable.class);
> conf.setMapperClass(MapClass.class);
> conf.setReducerClass(IdentityReducer.class);
> conf.setOutputKeyComparatorClass(IntWritable.Comparator.class);
> conf.setOutputValueGroupingComparator(IntWritable.Comparator.class);

The problem is that your map output key needs to carry both values, so you need a key class that looks like:

class IntPair implements Writable {
   private int left;
   private int right;
   public void set(int left, int right) { ... }
   public int getLeft() {...}
   public int getRight() {...}
   // write(DataOutput) and readFields(DataInput) omitted
}
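
As a minimal sketch of what the elided method bodies might contain, here is a plain-Java stand-in for the key class. It deliberately avoids the Hadoop imports so it compiles on its own; the write and readFields methods mirror the signatures that Writable expects, and roundTrip is a hypothetical helper added only to exercise them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Plain-Java stand-in for the IntPair key. In a real job this class would
// also declare "implements Writable" (org.apache.hadoop.io.Writable).
class IntPair {
    private int left;
    private int right;

    public void set(int left, int right) {
        this.left = left;
        this.right = right;
    }

    public int getLeft() { return left; }
    public int getRight() { return right; }

    // Same signature as Writable.write: serialize both fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(left);
        out.writeInt(right);
    }

    // Same signature as Writable.readFields: read the fields back in that order.
    public void readFields(DataInput in) throws IOException {
        left = in.readInt();
        right = in.readInt();
    }

    // Hypothetical helper (not part of any Hadoop API): serialize the pair,
    // then deserialize it into a fresh instance.
    static IntPair roundTrip(IntPair original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            IntPair copy = new IntPair();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling IntPair.roundTrip on a pair set to (7, -3) should give back a pair whose getLeft() is 7 and whose getRight() is -3.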

Your Mapper should be Mapper<LongWritable, Text, IntPair, IntWritable> and should emit:

IntPair key = new IntPair();
IntWritable value = new IntWritable();
key.set(keyValue, valueValue);
value.set(valueValue);
output.collect(key, value);

Your sort comparator should compare both left and right in the pair.
The grouping comparator should look only at left in the pair.
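
To make the difference between the two comparators concrete, here is a rough sketch in plain Java. It uses java.util.Comparator rather than Hadoop's comparator classes, and the names PairComparators, SORT and GROUP are made up for this illustration. In a real job the same logic would live in the classes passed to setOutputKeyComparatorClass and setOutputValueGroupingComparator.

```java
import java.util.Comparator;

// Simplified stand-in for the key; only the two ints matter here.
class IntPair {
    final int left;
    final int right;

    IntPair(int left, int right) {
        this.left = left;
        this.right = right;
    }
}

class PairComparators {
    // Sort comparator: order by left, then break ties on right.
    static final Comparator<IntPair> SORT = (a, b) -> {
        int byLeft = Integer.compare(a.left, b.left);
        return byLeft != 0 ? byLeft : Integer.compare(a.right, b.right);
    };

    // Grouping comparator: only left matters, so two pairs that share
    // a left value compare as equal and fall into the same group.
    static final Comparator<IntPair> GROUP =
        (a, b) -> Integer.compare(a.left, b.left);
}
```

With these, (1, 5) sorts before (1, 9) under SORT, while GROUP treats the two as equal, which is what lets a single reduce call see both values in sorted order.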

Your Reducer should be Reducer<IntPair, IntWritable, IntWritable, IntWritable> and should emit, for each value:

output.collect(new IntWritable(key.getLeft()), value);
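
Putting the pieces together, the following is an in-memory simulation of the sort, group and reduce steps, not Hadoop code. It assumes each map output is a two-element int array holding {left, right}, and the class name SecondarySortSketch is invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SecondarySortSketch {
    // Input: {left, right} pairs as the mapper would place them in the key.
    // Output: {left, value} records as the reducer would collect them.
    static List<int[]> run(List<int[]> mapOutput) {
        List<int[]> sorted = new ArrayList<>(mapOutput);
        // Sort phase: compare left first, then right.
        sorted.sort(Comparator.<int[]>comparingInt(p -> p[0])
                              .thenComparingInt(p -> p[1]));

        List<int[]> reduceOutput = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            int groupLeft = sorted.get(i)[0];
            // Grouping phase: one simulated reduce call covers every
            // consecutive pair that shares the same left.
            while (i < sorted.size() && sorted.get(i)[0] == groupLeft) {
                // Reduce body: emit (left, value); here value equals right.
                reduceOutput.add(new int[] {groupLeft, sorted.get(i)[1]});
                i++;
            }
        }
        return reduceOutput;
    }
}
```

For the input (2,7), (1,9), (2,3), (1,4) this yields (1,4), (1,9), (2,3), (2,7): the records are grouped by left, and the values inside each group arrive in ascending order.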

Is that clearer?

-- Owen
