hadoop-common-user mailing list archives

From "Tarandeep Singh" <tarand...@gmail.com>
Subject Re: questions on sorting big files and sorting order
Date Wed, 27 Aug 2008 17:51:46 GMT
On Tue, Aug 26, 2008 at 7:50 AM, Owen O'Malley <omalley@apache.org> wrote:

> On Tue, Aug 26, 2008 at 12:39 AM, charles du <taiping.du@gmail.com> wrote:
>
> > I would like to sort a large number of records in a big file based on a
> > given field (key).
>
>
> The property you are looking for is a "total order" and you need to define
> your own partitioner class to do it. Look at the terasort example and how I
> did it in that program. Roughly, before the job the input is sampled and the
> proper split points are chosen. When each partitioner picks where each key
> should go, it looks at the split points and sends it to the right reduce.
>
> http://tinyurl.com/5ltb2a
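
The steps quoted above (sample the input, choose split points, route each key
by comparing it against the split points) can be sketched in plain Java for a
key made of an integer field followed by a decimal field. This is only an
illustration under stated assumptions, not the TeraSort code and not a Hadoop
API: the names CompositeKeyPartitioning, Key, chooseSplitPoints and
partitionFor are hypothetical, records are assumed to be whitespace-separated,
and the sample is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Plain-Java sketch; none of these names come from Hadoop or TeraSort. */
final class CompositeKeyPartitioning {

    /** Hypothetical key built from field 2 (integer) and field 3 (decimal). */
    static final class Key {
        final int field2;
        final double field3;

        Key(int field2, double field3) {
            this.field2 = field2;
            this.field3 = field3;
        }

        /** Assumes whitespace-separated records such as "abc 10 30.5 lmn". */
        static Key parse(String record) {
            String[] f = record.trim().split("\\s+");
            return new Key(Integer.parseInt(f[1]), Double.parseDouble(f[2]));
        }
    }

    /** Orders by field 2 first, then by field 3. */
    static final Comparator<Key> ORDER =
        Comparator.<Key>comparingInt(k -> k.field2).thenComparingDouble(k -> k.field3);

    /** Picks numPartitions - 1 evenly spaced split points from a sorted copy of the sample. */
    static Key[] chooseSplitPoints(List<Key> sample, int numPartitions) {
        List<Key> sorted = new ArrayList<>(sample);
        sorted.sort(ORDER);
        Key[] splits = new Key[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sorted.get((int) ((long) i * sorted.size() / numPartitions));
        }
        return splits;
    }

    /** Returns the partition index for a key; keys in partition i sort before keys in partition i + 1. */
    static int partitionFor(Key key, Key[] splits) {
        int pos = Arrays.binarySearch(splits, key, ORDER);
        // Found at pos: the key equals a split point and goes to the partition after it.
        // Not found: binarySearch returns -(insertionPoint) - 1, so recover the insertion point.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        List<Key> sample = Arrays.asList(
            new Key(10, 30.5), new Key(5, 1.0), new Key(10, 2.0),
            new Key(20, 7.5), new Key(1, 9.9), new Key(15, 0.0));
        Key[] splits = chooseSplitPoints(sample, 3);
        System.out.println(partitionFor(Key.parse("abc 10 30.5 lmn"), splits));
    }
}
```

Because every key sent to partition i sorts before every key sent to partition
i + 1, sorting each partition with the same comparator and concatenating the
outputs in partition order gives a total order. In a real job the same
comparison would have to be used by both the partitioner and the key's sort
order.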


How do I sort if the key is not text? Say my records look like this:
abc 10 30.5 lmn
....
and I want to sort on fields 2 and 3.

Could you please give some pointers on how to modify your original partitioner
class to handle this case?

thanks,
Taran

> -- Owen
