hadoop-common-user mailing list archives

From Anthony Urso <antho...@cs.ucla.edu>
Subject Re: Restricting number of records from map output
Date Wed, 12 Jan 2011 19:13:32 GMT
Either use an instance variable in the Mapper to count the records it has
emitted and stop once it reaches n, or use a Combiner.  The Combiner is the
right choice if you want the top n per key from each mapper.
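
A minimal sketch of the instance-variable option, written without Hadoop on the classpath so it runs on its own. The class name LimitedMapperSketch, the limit field, and the List standing in for the output context are all hypothetical; in a real job the counter would presumably be a field of your Mapper subclass, checked at the top of map().

```java
import java.util.ArrayList;
import java.util.List;

/** Hadoop-free stand-in (hypothetical) for a Mapper that stops emitting after `limit` records. */
class LimitedMapperSketch {
    private final int limit;  // assumed cap n, analogous to LIMIT n
    private int emitted = 0;  // instance variable: survives across map() calls on this instance

    LimitedMapperSketch(int limit) {
        this.limit = limit;
    }

    /** Called once per input record; `out` stands in for the Hadoop output context. */
    void map(String record, List<String> out) {
        if (emitted >= limit) {
            return; // cap reached: drop the record instead of emitting it
        }
        out.add(record);
        emitted++;
    }

    public static void main(String[] args) {
        LimitedMapperSketch mapper = new LimitedMapperSketch(2);
        List<String> out = new ArrayList<>();
        for (String record : new String[] {"a", "b", "c", "d"}) {
            mapper.map(record, out);
        }
        System.out.println(out); // prints [a, b]
    }
}
```

Note that the counter is per Mapper instance, so a job that runs several map tasks would emit up to n records from each task, not n in total. It also keeps the first n records each task sees, which only matches "top n" if the input to that task is already in the desired order.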

On Wed, Jan 12, 2011 at 10:03 AM, Rakesh Davanum <rakeshdav@gmail.com> wrote:
> Hi,
> I have a sort job consisting of only the Mapper (no Reducer) task. I want my
> results to contain only the top n records. Is there any way of restricting
> the number of records that are emitted by the Mappers?
> Basically I am looking to see if there is an equivalent of achieving
> the behavior similar to LIMIT in SQL queries.
> Thanks & Regards,
> Rakesh
