hadoop-common-user mailing list archives

From Vadim Zaliva <kroko...@gmail.com>
Subject Re: single output file
Date Wed, 16 Jan 2008 01:11:14 GMT

On Jan 15, 2008, at 17:02, Rui Shi wrote:

> As far as I understand, letting each mapper produce its top N records
> does not work, as each mapper has only partial knowledge of the data,
> which will not lead to a global optimum... I think your mapper needs to
> output all records (combined) and let the reducer pick the top N values.

The question remains: how do I return, say, the last 10 records from the
Reducer? To do that, I need to know when the last record has been processed.
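
One possible approach, sketched here in plain Java without any Hadoop
dependency: keep a bounded buffer of the most recent N records while they
stream through, and emit the buffer only from an end-of-input hook. The
names LastNSketch, LastN, add and drain are hypothetical. The sketch assumes
that all records pass through a single reducer, and that the framework calls
some hook after the final record (in the old org.apache.hadoop.mapred API
that would presumably be close(), which does not receive the output
collector, so a reference to it would have to be saved during reduce()).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class LastNSketch {
    /** Hypothetical helper: keeps only the most recent n records seen so far. */
    static final class LastN<T> {
        private final int n;
        private final Deque<T> buffer = new ArrayDeque<>();

        LastN(int n) {
            if (n <= 0) throw new IllegalArgumentException("n must be positive");
            this.n = n;
        }

        // Called once per record, the way reduce() would be (assumption).
        void add(T record) {
            if (buffer.size() == n) {
                buffer.removeFirst(); // drop the oldest record to stay within n
            }
            buffer.addLast(record);
        }

        // Called once after the final record, the way a close()-style hook
        // would be (assumption). Returns the buffered records, oldest first.
        List<T> drain() {
            List<T> out = new ArrayList<>(buffer);
            buffer.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        LastN<Integer> last = new LastN<>(10);
        for (int i = 1; i <= 25; i++) {
            last.add(i); // simulate 25 records arriving in order
        }
        // prints [16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
        System.out.println(last.drain());
    }
}
```

With this shape the reducer never needs to know in advance which record is
the last one: memory stays bounded at N records, and the output is written
only once the input is exhausted.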

Vadim
