hadoop-common-user mailing list archives

From Josh Patterson <j...@cloudera.com>
Subject Re: how to sort the output by value in reduce instead of by key?
Date Mon, 11 Apr 2011 14:09:22 GMT
Leibnitz,
I think you are looking for a "secondary sort", where the values for
each key arrive at the reducer in a defined order rather than simply
grouped by key. Is that the case?
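
As a rough illustration of the idea, here is a minimal sketch in plain Java that simulates the shuffle step without any Hadoop classes. The names SecondarySortSketch, CompositeKey, SORT_ORDER and shuffle are made up for this example and are not taken from the blog posts or the github project. The trick it shows is to move the value into a composite key, sort on the whole composite key, and then group on the natural key only:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the value to sort by.
    public static final class CompositeKey {
        final String naturalKey;
        final int value;

        public CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Plays the role of the sort comparator: natural key first, then value.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.value);

    // Simulates the shuffle: sort on the full composite key, then group on
    // the natural key only, so each group's values come out already sorted.
    public static Map<String, List<Integer>> shuffle(List<CompositeKey> mapOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT_ORDER);

        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey k : sorted) {
            groups.computeIfAbsent(k.naturalKey, unused -> new ArrayList<>())
                  .add(k.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = new ArrayList<>();
        mapOutput.add(new CompositeKey("b", 3));
        mapOutput.add(new CompositeKey("a", 9));
        mapOutput.add(new CompositeKey("a", 2));
        mapOutput.add(new CompositeKey("b", 1));
        mapOutput.add(new CompositeKey("a", 5));

        // Prints {a=[2, 5, 9], b=[1, 3]}
        System.out.println(shuffle(mapOutput));
    }
}
```

In a real job the same three roles are usually filled by a custom partitioner, a sort comparator and a grouping comparator configured on the job. The sketch above only mimics that behaviour in memory on a single machine.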

For a look at secondary sort I've got a few blog articles:

http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/

and part 3 includes source code on github.com:

https://github.com/jpatanooga/Caduceus

Hope that helps,

Josh



On Mon, Apr 11, 2011 at 5:26 AM, leibnitz <se3g2011@gmail.com> wrote:
> can anyone give me a tip?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/how-to-sort-the-output-by-value-in-reduce-instead-of-by-key-tp2805541p2805922.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>



-- 
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
blog: http://jpatterson.floe.tv
