hadoop-common-user mailing list archives

From "charles du" <taiping...@gmail.com>
Subject questions on sorting big files and sorting order
Date Tue, 26 Aug 2008 07:39:33 GMT
Hi all:

I would like to sort a large number of records in a big file based on a
given field (key).

If I run just one reducer, it works fine because the reducer sorts all
records by key. To improve sorting performance, I would like to run
multiple reducers. How can I guarantee a global order across the records
that get partitioned to different reducers?
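
To make the requirement concrete, here is a minimal sketch in plain Java (no
Hadoop classes) of the property I am after: every key sent to reducer i is
smaller than every key sent to reducer i+1, so the reducer outputs can simply
be concatenated in order. The class name, method name, and split points are
hypothetical and only illustrate the idea; this is not an existing Hadoop
API.

```java
import java.util.Arrays;

public class RangePartitionSketch {
    // Returns the reducer index for a key, given distinct split points in
    // ascending order. Keys below splitPoints[0] go to reducer 0, keys in
    // [splitPoints[0], splitPoints[1]) go to reducer 1, and so on.
    public static int partitionFor(int key, int[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        int[] splitPoints = {10, 20}; // hypothetical boundaries for 3 reducers
        for (int key : new int[] {3, 10, 15, 42}) {
            System.out.println(key + " -> reducer " + partitionFor(key, splitPoints));
        }
    }
}
```

With a partitioning like this, each reducer would sort only its own range,
but the split points would have to be chosen so the ranges are roughly
balanced.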

Also, the default sort order is ascending. How can I program my reducer to
output records in descending order? My key could be IntWritable or Text.
Thanks.
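
For clarity, this is the ordering I mean, sketched in plain Java with Integer
and String standing in for IntWritable and Text. The class and method names
are hypothetical; the sketch only shows a comparator that reverses the
natural order of the key, not how it would be plugged into a job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DescendingOrderSketch {
    // Returns a sorted copy of the keys, using the reverse of their
    // natural ordering (largest key first).
    public static <K extends Comparable<K>> List<K> sortDescending(List<K> keys) {
        List<K> copy = new ArrayList<>(keys);
        Comparator<K> descending = Comparator.reverseOrder();
        copy.sort(descending);
        return copy;
    }

    public static void main(String[] args) {
        // Integer stands in for IntWritable, String for Text.
        System.out.println(sortDescending(Arrays.asList(3, 42, 10)));
        System.out.println(sortDescending(Arrays.asList("a", "c", "b")));
    }
}
```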

