hadoop-hdfs-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From samir das mohapatra <samir.help...@gmail.com>
Subject Re: Need help optimizing reducer
Date Tue, 05 Mar 2013 06:23:57 GMT
  I think  you have to use partitioner to spawn more then one reducer for
small data set.
  Default Partitioner will allow you only one reducer, you have to
overwrite and implement you own logic to spawn more then one reducer.

On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath <austincv@gmail.com> wrote:

> Hi all,
> I have 1 reducer and I have around 600 thousand unique keys coming to it.
> The total data is only around 30 mb.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (till 66% it's fast then
> it slows down/ I don't really think it has started doing anything till 66%
> but then why does it show like that?).
> Are there any job execution parameters that can help improve reducer
> performace?
> Any suggestions to improve things when we have to live with just one
> reducer?
> thanks,
> Austin

View raw message