hadoop-common-user mailing list archives

From Amogh Vasekar <am...@yahoo-inc.com>
Subject RE: Best Idea to deal with following situation
Date Tue, 29 Sep 2009 10:54:44 GMT
Along with the partitioner, try plugging in a combiner. It can provide significant performance
gains by pre-aggregating values on the map side before the shuffle. I'm not sure which algorithm
you use, but you might have to tweak it a little to facilitate a combiner, since the operation
has to be safe to apply partially (associative and commutative).
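
As a rough illustration, here is a minimal combiner sketch. It assumes the values are numeric
counts that can be partially summed; the class name SumCombiner and the LongWritable value type
are assumptions for the example, not details from the original job:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums partial counts on the map side so fewer records cross the network.
public class SumCombiner extends Reducer<Text, LongWritable, Text, LongWritable> {
  private final LongWritable result = new LongWritable();

  @Override
  protected void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    long sum = 0;
    for (LongWritable v : values) {
      sum += v.get();  // combine the partial values seen so far for this key
    }
    result.set(sum);
    context.write(key, result);
  }
}

You would hook it up with job.setCombinerClass(SumCombiner.class). Note that Hadoop may run the
combiner zero, one, or several times per key, which is why the operation must tolerate partial
application.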

Thanks,
Amogh

-----Original Message-----
From: Chandraprakash Bhagtani [mailto:cpbhagtani@gmail.com] 
Sent: Tuesday, September 29, 2009 12:25 PM
To: common-user@hadoop.apache.org
Cc: core-user@hadoop.apache.org
Subject: Re: Best Idea to deal with following situation

You can write your own custom partitioner instead of using the default hash partitioner, so that
each of your keys is routed to a different reducer.
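
For example, here is a minimal sketch of such a partitioner, assuming the 5 keys are known in
advance so each can be pinned to its own reducer; the key names ("keyA" etc.) and the class name
FixedKeyPartitioner are made up for illustration:

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class FixedKeyPartitioner extends Partitioner<Text, LongWritable> {
  // Hypothetical key names; replace with the actual 5 keys from the job.
  private static final Map<String, Integer> PARTITION_OF = new HashMap<String, Integer>();
  static {
    PARTITION_OF.put("keyA", 0);
    PARTITION_OF.put("keyB", 1);
    PARTITION_OF.put("keyC", 2);
    PARTITION_OF.put("keyD", 3);
    PARTITION_OF.put("keyE", 4);
  }

  @Override
  public int getPartition(Text key, LongWritable value, int numPartitions) {
    Integer p = PARTITION_OF.get(key.toString());
    // Fall back to hashing for any unexpected key.
    int partition = (p != null) ? p : (key.hashCode() & Integer.MAX_VALUE);
    return partition % numPartitions;
  }
}

Register it with job.setPartitionerClass(FixedKeyPartitioner.class) and set
job.setNumReduceTasks(5) so that no two keys end up sharing a reducer.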

On Sat, Sep 26, 2009 at 6:18 AM, Pankil Doshi <forpankil@gmail.com> wrote:

> Hello everyone,
>
> I have a job whose output has only 5 keys, but each key has a long list of
> values, on the order of 100,000s.
> What would be the best way to deal with this? I feel a few of my reducers get
> overloaded when two or more keys go to the same reducer, and hence they have
> a lot of work to do.
>
> So what would be the best way out of this situation?
>
> Pankil
>



-- 
Thanks & Regards,
Chandra Prakash Bhagtani,
