hadoop-mapreduce-user mailing list archives

From Marek Miglinski <mmiglin...@seven.com>
Subject RE: Reducer IO
Date Tue, 07 Feb 2012 07:52:41 GMT
Thanks for the reply,

As it turns out, that didn't help: IO is used even more, since every reducer is copying and sorting.
What are the options? Is there a way to limit the reduce -> copy and reduce -> sort phases?

Marek M.
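
One way to approach the question above is through per-job shuffle settings. The sketch below only assembles a set of overrides and prints them as generic options; it does not call Hadoop itself. The property names are the Hadoop 1.x ones, but every value is an illustrative assumption rather than a recommendation, and the class and method names are hypothetical. Whether any of these actually lowers disk load depends on the workload.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ShuffleTuning {
    // Hadoop 1.x per-job properties. Values are illustrative assumptions only.
    public static Map<String, String> shuffleOverrides() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Fewer simultaneous map-output fetches per reducer (default is 5).
        conf.put("mapred.reduce.parallel.copies", "2");
        // More streams merged at once, so the sort needs fewer merge passes.
        conf.put("io.sort.factor", "50");
        // Compress map output before it is shuffled to the reducers.
        conf.put("mapred.compress.map.output", "true");
        // Start reducers only after 80% of the maps have finished.
        conf.put("mapred.reduce.slowstart.completed.maps", "0.80");
        return conf;
    }

    // Render the overrides as "-D key=value" generic options.
    public static String toGenericOptions(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append("-D ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toGenericOptions(shuffleOverrides()));
    }
}
```

The printed options are only picked up when the job driver parses generic options (for example by running through ToolRunner); otherwise the same keys can be set on the job's Configuration.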

From: Mostafa Gaber [moustafa.gaber@gmail.com]
Sent: Monday, February 06, 2012 6:50 PM
To: mapreduce-user@hadoop.apache.org
Subject: Re: Reducer IO

Hello Marek,

I think you can increase the number of reducers for your MR job to reduce the number of
intermediate key-value pairs assigned to each reducer. Note also that the number of reducers
depends on your job and on how the output should be produced.
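
The arithmetic behind this suggestion can be sketched as follows, assuming the intermediate data is spread evenly across reducers (skewed keys would break that assumption). The class name and the 100 GB figure are hypothetical. Note that the per-reducer share shrinks while the total volume copied and sorted across the cluster stays the same.

```java
public class ReducerShare {
    // Bytes one reducer must copy and sort, assuming an even key distribution.
    // Rounds up so that the shares always cover the whole input.
    public static long bytesPerReducer(long totalIntermediateBytes, int numReducers) {
        if (numReducers <= 0) {
            throw new IllegalArgumentException("numReducers must be positive");
        }
        return (totalIntermediateBytes + numReducers - 1) / numReducers;
    }

    public static void main(String[] args) {
        long total = 100L * 1024 * 1024 * 1024; // hypothetical 100 GB of map output
        for (int reducers : new int[] {10, 20, 40}) {
            long share = bytesPerReducer(total, reducers);
            System.out.println(reducers + " reducers: " + share + " bytes each, "
                    + total + " bytes in total");
        }
    }
}
```

In a real job the reducer count is set with Job.setNumReduceTasks(int) or the mapred.reduce.tasks property.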

On Mon, Feb 6, 2012 at 11:37 AM, Marek Miglinski <mmiglinski@seven.com> wrote:

I have a MapReduce job (a transactions loader) whose main problem is the "reduce->copy"
and "reduce->sort" phases, which take all the IO and use all disk resources. What are the possible
ways to reduce this load? My cloud settings are:


I can lower those settings. What else can I tweak?

Marek M.

Best Regards,
Mostafa Ead
