hadoop-mapreduce-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject how to find top N values using map-reduce ?
Date Sat, 02 Feb 2013 05:05:44 GMT
I am looking for a better solution for this.

One way to do this would be to find the top N values in each mapper and
then find the top N among those in a single reducer. I am afraid that
this won't work effectively if my N is larger than the number of values
in my input split (or mapper input).
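
A minimal sketch of this first approach, written in plain Python rather than against the Hadoop API and assuming the values are plain numbers. The function names map_top_n and reduce_top_n and the sample splits are hypothetical, chosen only for illustration:

```python
import heapq

def map_top_n(split, n):
    """Mapper side: keep only the n largest values seen in one input split."""
    return heapq.nlargest(n, split)

def reduce_top_n(partial_lists, n):
    """Single reducer: merge every mapper's candidates and keep the n largest."""
    candidates = [v for partial in partial_lists for v in partial]
    return heapq.nlargest(n, candidates)

# Hypothetical input splits; the second one holds fewer than n values.
splits = [[5, 1, 9, 3], [7, 2], [8, 6, 4, 10]]
n = 3

partials = [map_top_n(s, n) for s in splits]
top_n = reduce_top_n(partials, n)
print(top_n)  # [10, 9, 8]
```

In this sketch a split with fewer than N values simply forwards all of them, so the result stays correct, but that mapper prunes nothing and the reducer receives more data.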

The other way is to just sort all of them in one reducer and then cat the top N from the output.

I am wondering whether there is a better approach to do this.

Regards
Praveenesh
