hadoop-common-user mailing list archives

From Amogh Vasekar <am...@yahoo-inc.com>
Subject Re: Question on GroupingComparatorClass
Date Wed, 27 Jan 2010 19:13:39 GMT
I think the combiner gets only the key sort comparator, not the grouping comparator. So I believe
the default grouping (by the sort comparator) is used in the combiner, and the custom grouping only in the reducer.
Here's a relevant snippet of code:
      super(inputCounter, conf, reporter);
      combinerClass = cls;
      keyClass = (Class<K>) job.getMapOutputKeyClass();
      valueClass = (Class<V>) job.getMapOutputValueClass();
      comparator = (RawComparator<K>) job.getOutputKeyComparator();
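
To illustrate what that difference would mean in practice, here is a small standalone sketch. It does not use Hadoop at all: the CompositeKey class, the two comparators, and the group() helper are hypothetical names made up for this example. It only simulates how the same sorted keys would be split into groups when the sort comparator is used (as I believe happens in the combiner) versus a grouping comparator that looks only at the natural key (as set for the reducer).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Standalone simulation; none of these names come from Hadoop.
final class GroupingDemo {
    // Hypothetical composite key: a natural key plus a secondary field
    // that exists only to order values (the secondary-sort trick).
    static final class CompositeKey {
        final String natural;
        final int secondary;
        CompositeKey(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Plays the role of the key sort comparator: natural key, then secondary.
    static final Comparator<CompositeKey> SORT =
        Comparator.comparing((CompositeKey k) -> k.natural)
                  .thenComparingInt(k -> k.secondary);

    // Plays the role of a custom grouping comparator: natural key only.
    static final Comparator<CompositeKey> GROUP =
        Comparator.comparing((CompositeKey k) -> k.natural);

    // Splits an already sorted list into runs of adjacent keys that
    // compare equal under cmp; each run stands for one reduce/combine call.
    static List<List<CompositeKey>> group(List<CompositeKey> sorted,
                                          Comparator<CompositeKey> cmp) {
        List<List<CompositeKey>> groups = new ArrayList<>();
        for (CompositeKey k : sorted) {
            boolean startNew = groups.isEmpty()
                || cmp.compare(groups.get(groups.size() - 1).get(0), k) != 0;
            if (startNew) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
        }
        return groups;
    }

    // Sample map output keys, sorted with the sort comparator.
    static List<CompositeKey> sampleKeys() {
        List<CompositeKey> keys = new ArrayList<>(Arrays.asList(
            new CompositeKey("a", 2), new CompositeKey("b", 1),
            new CompositeKey("a", 1), new CompositeKey("a", 2)));
        keys.sort(SORT);
        return keys;
    }

    public static void main(String[] args) {
        List<CompositeKey> keys = sampleKeys();
        System.out.println("groups by sort comparator:     " + group(keys, SORT).size());
        System.out.println("groups by grouping comparator: " + group(keys, GROUP).size());
    }
}
```

With these sample keys the sort comparator yields three groups ((a,1), (a,2) twice, (b,1)) and the grouping comparator yields two (all "a" keys, then "b"). So if my reading of the code is right, the combiner would only merge records whose full composite keys are equal.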


On 1/26/10 12:57 PM, "Jim Twensky" <jim.twensky@gmail.com> wrote:


I'm using a custom grouping comparator class to simulate a secondary
sort on values, and I set it via Job.setGroupingComparatorClass (using
Hadoop 0.20.x) inside my driver. I'm wondering if this class is also
used when grouping the records in the combiner.

Using a combiner greatly improves performance in my case, but for
the combiners I want to use the default comparator, not the custom
one that I use before the actual reduce.

Is there a way to just set the custom grouping comparator for the
reduce and bypass it during the combine stage?

