hadoop-common-user mailing list archives

From David Rio <driodei...@gmail.com>
Subject Re: sort example
Date Sun, 17 May 2009 15:52:24 GMT
I thought about that, but there has to be a better way.
And it seems to work just fine in the streaming docs, particularly the
IPs example.

-drd
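
For reference, the zero-padding workaround quoted below can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the thread or from Hadoop: the helper names pad_key and unpad_key are made up, the keys are assumed to be non-negative integers passed as text, and the maximum key width is assumed to be known up front.

```python
def pad_key(key: str, width: int) -> str:
    """Left-pad a non-negative integer key with zeros to a fixed width."""
    return key.zfill(width)


def unpad_key(padded: str) -> str:
    """Strip the leading zeros again, keeping a single '0' for zero itself."""
    return padded.lstrip("0") or "0"


keys = ["1324", "212", "123123"]

# A plain text sort compares character by character, so it is not numeric.
text_sorted = sorted(keys)

# Assumption: the maximum key width is known before the job runs.
width = max(len(k) for k in keys)

# Once every key has the same width, text order and numeric order agree.
padded = sorted(pad_key(k, width) for k in keys)

# Trim the zeros after sorting, as suggested in the quoted message.
restored = [unpad_key(k) for k in padded]
```

In a streaming job the padding would presumably happen in the mapper and the trimming in the reducer, at the cost of touching every key twice.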

On Sun, May 17, 2009 at 10:39 AM, Ricky Ho <rho@adobe.com> wrote:
> Is this a workaround ?
>
> If you know the max size of your key, can you make all keys the same size by prepending
> them with zeros ...
>
> So ...
> 1324 becomes 001324
> 212 becomes 000212
> 123123 stays 123123
>
> After you do the sorting, trim out the preceding zeros ...
>
> Rgds,
> Ricky
> -----Original Message-----
> From: David Rio [mailto:driodeiros@gmail.com]
> Sent: Sunday, May 17, 2009 8:34 AM
> To: core-user@hadoop.apache.org
> Subject: Re: sort example
>
> On Sun, May 17, 2009 at 10:18 AM, Ricky Ho <rho@adobe.com> wrote:
>>
>> I think using a single reducer causes the sorting to be done sequentially and hence
>> defeats the purpose of using Hadoop in the first place.
>
> I agree, but this is just for testing.
> Actually I used two reducers in my example.
>
>> Perhaps you can use a different "partitioner" which partitions the key range
>> into different subranges, with a different reducer working on each subrange.
>
> Yes, but prior to that, I want to make the basic numerical sorting
> work. It seems my args do not get passed to the partitioner class for
> some reason.
>
> -drd
>
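
The range-partitioning idea quoted above (split the key range into subranges, one reducer per subrange) can be simulated in plain Python. This is a rough sketch under stated assumptions, not Hadoop's partitioner API: range_partition is a hypothetical helper, keys are assumed to be integers in [0, max_key], and the subranges are assumed to be equal-width, which only balances the load if keys are spread fairly evenly.

```python
from collections import defaultdict


def range_partition(key: int, max_key: int, num_reducers: int) -> int:
    """Map a key in [0, max_key] to a reducer index.

    Reducer 0 gets the lowest subrange, reducer 1 the next, and so on,
    so every key on reducer i is smaller than every key on reducer i + 1.
    """
    span = -(-(max_key + 1) // num_reducers)  # ceiling division
    return min(key // span, num_reducers - 1)


keys = [1324, 750000, 212, 999999, 123123]
max_key = 999999      # assumption: upper bound of the key range is known
num_reducers = 2      # two reducers, as in the test run mentioned above

# Simulate the shuffle: group the keys by the reducer they are routed to.
buckets = defaultdict(list)
for k in keys:
    buckets[range_partition(k, max_key, num_reducers)].append(k)

# Each "reducer" sorts its own subrange numerically; concatenating the
# reducer outputs in index order then gives a globally sorted result.
merged = [k for r in sorted(buckets) for k in sorted(buckets[r])]
```

The point of the design is that no final merge step is needed: the reducer outputs are already in order relative to each other, so the sort work is spread across reducers instead of running through a single one.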
