hadoop-common-dev mailing list archives

From arkady borkovsky <ark...@yahoo-inc.com>
Subject Re: Pre-sort value list in reduce
Date Tue, 15 Apr 2008 07:32:30 GMT
Take a look at KeyFieldBasedPartitioner, which partitions on selected fields of the key:
  -partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner

--ab

On Apr 14, 2008, at 4:25 PM, pi song wrote:

> Dear people in Hadoop mailing list,
>
> Is there any way to make the value list in reduce (key, list of values)
> arrive sorted? Or at least sorted in clusters, i.e. containing runs of
> sorted values, e.g. 1,1,1,2,2,2,2,3,3,3, 1,1,1,1,1,1,2,2,2,2,3,
> 1,1,2,2,2,3,3,3,3,3,3,3?
> I had a look at JobConf.setOutputValueGroupingComparator in the javadoc,
> and I think it might be the answer, because I suspect that grouping in
> Hadoop is usually done by sorting. Am I right?
>
> Can anyone help me? What would the performance impact of your solution be?
>
> Thanks in advance,
> Pi
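
For readers of the thread, the idea behind both suggestions (a partitioner that looks at only part of the key, plus a grouping comparator) is commonly called a secondary sort: move the value into a composite key, sort on the whole composite key, but partition and group on the natural key only. The sketch below is a plain-Java simulation of that idea, not Hadoop code. The names SecondarySortSketch, CompositeKey, partition and reduceInput are hypothetical and exist only for this illustration; in a real job the same three roles would be played by the partitioner, the key sort order, and the value grouping comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    /** Hypothetical composite key: the natural key plus the value to sort by. */
    static final class CompositeKey {
        final String naturalKey;
        final int value;

        CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    /** Partition on the natural key only, so every record for a key reaches the same reducer. */
    static int partition(CompositeKey key, int numReducers) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Sort order applied during the simulated shuffle: natural key first, then value. */
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.value);

    /** Simulates what one reducer would see: one sorted value list per natural key. */
    static Map<String, List<Integer>> reduceInput(
            List<CompositeKey> mapOutput, int reducer, int numReducers) {
        // Keep only the records that the partitioner routes to this reducer.
        List<CompositeKey> mine = new ArrayList<>();
        for (CompositeKey k : mapOutput) {
            if (partition(k, numReducers) == reducer) {
                mine.add(k);
            }
        }
        // Sort on the full composite key.
        mine.sort(SORT_ORDER);
        // Group by natural key only; values stay in sorted order within each group.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (CompositeKey k : mine) {
            grouped.computeIfAbsent(k.naturalKey, unused -> new ArrayList<>()).add(k.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = new ArrayList<>();
        mapOutput.add(new CompositeKey("b", 2));
        mapOutput.add(new CompositeKey("a", 3));
        mapOutput.add(new CompositeKey("a", 1));
        mapOutput.add(new CompositeKey("b", 1));
        mapOutput.add(new CompositeKey("a", 2));
        // With a single reducer, prints {a=[1, 2, 3], b=[1, 2]}
        System.out.println(reduceInput(mapOutput, 0, 1));
    }
}
```

Because the ordering falls out of the sort that the shuffle performs anyway, this approach should not need a separate sorting pass in the reducer, although the composite keys are larger than the natural keys alone.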

