hadoop-hdfs-user mailing list archives

From java8964 <java8...@hotmail.com>
Subject RE: How to configure multiple reduce jobs in hadoop 2.2.0
Date Fri, 17 Jan 2014 15:39:24 GMT
I read this blog, and have the following questions:
What is the relationship between "mapreduce.map.memory.mb" and "mapreduce.map.java.opts"?
The blog gives the following settings as an example:
For our example cluster, we have the minimum RAM for a Container (yarn.scheduler.minimum-allocation-mb)
= 2 GB. We’ll thus assign 4 GB for Map task Containers, and 8 GB for Reduce tasks Containers. In [...]
Container will run JVMs for the Map and Reduce tasks. The JVM heap size should be set to lower
than the Map and Reduce memory defined above, so that they are within the bounds of the Container
memory allocated by YARN.
In mapred-site.xml:
<name>mapreduce.map.java.opts</name>
<value>-Xmx3072m</value>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx6144m</value>
The above settings configure the upper limit of the physical RAM that Map and Reduce tasks will [...]
I am not sure why "mapreduce.map.java.opts" should be lower than "mapreduce.map.memory.mb",
as suggested above, or how that makes sense.
If the JVM of the mapper task is given a maximum heap size of 3G, and the maximum memory of the
Container for the map task is set to 4G, then what is the additional 1G of memory used for?
Basically my questions are:
1) Why do we have these 2 configuration settings? I would have thought one should be enough.
2) For the above settings, my understanding is that the max memory my application can use
for a mapper task is 3G, no matter what I asked for, right? Does the additional 1G mean
memory I can ask for outside of the JVM heap?
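
To make the arithmetic behind the two settings concrete, here is a minimal Python sketch. It assumes that YARN checks the container size ("mapreduce.map.memory.mb") against the memory of the whole task process, while -Xmx caps only the Java heap, so the difference is headroom for non-heap JVM memory such as thread stacks and native buffers. The 4096 MB and 8192 MB container sizes are taken from the blog's "4 GB" and "8 GB" figures; the helper names are hypothetical.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx value in MB from a java.opts string, or None if absent.

    Only values with a k/m/g suffix are handled in this sketch.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG])", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    if unit == "k":
        return value // 1024
    if unit == "g":
        return value * 1024
    return value


def non_heap_headroom_mb(container_mb, java_opts):
    """Memory left in the container once the heap cap is subtracted."""
    heap_mb = parse_xmx_mb(java_opts)
    if heap_mb is None:
        raise ValueError("no -Xmx setting found in: " + java_opts)
    if heap_mb >= container_mb:
        # Treated as a misconfiguration here: the heap alone could fill
        # the container and leave nothing for the rest of the JVM.
        raise ValueError("heap cap leaves no headroom in the container")
    return container_mb - heap_mb


map_headroom = non_heap_headroom_mb(4096, "-Xmx3072m")
reduce_headroom = non_heap_headroom_mb(8192, "-Xmx6144m")
print(map_headroom, reduce_headroom)
```

With the quoted values, the heap cap is 75% of the container in both cases, and the remaining quarter is what the JVM has left for everything that is not heap.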
Date: Fri, 17 Jan 2014 15:16:28 +0530
Subject: Re: How to configure multiple reduce jobs in hadoop 2.2.0
From: sudhakara.st@gmail.com
To: user@hadoop.apache.org

Also check this

On Fri, Jan 17, 2014 at 2:56 PM, Silvina Caíno Lores <silvi.caino@gmail.com> wrote:

Also, you should be limited by your container configuration at yarn-site.xml and mapred-site.xml,
check THIS to understand how resource management works.

Basically you can set the number of reducers you want but you are limited to the number the
system can actually hold by the configuration you have set.
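
As a rough sketch of that limit, the number of reduce containers that can run at the same time is bounded by how many containers of the requested size fit into each node's YARN memory. The 24576 MB per node used below is an illustrative assumption, not a value from this thread; the 3 nodes and the 8192 MB reduce container come from the messages quoted here. The sketch ignores CPU limits, the ApplicationMaster's own container, and rounding to the scheduler's allocation increment.

```python
def max_concurrent_containers(node_memory_mb, container_mb, num_nodes):
    """Upper bound on same-size containers running at once, by memory only."""
    if container_mb <= 0:
        raise ValueError("container size must be positive")
    per_node = node_memory_mb // container_mb
    return per_node * num_nodes


def reducers_running_at_once(requested_reducers, capacity):
    """Reducers beyond the capacity wait until a container frees up."""
    return min(requested_reducers, capacity)


capacity = max_concurrent_containers(24576, 8192, 3)
running = reducers_running_at_once(12, capacity)
print(capacity, running)
```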

Hope it helps.


On 16 January 2014 08:54, sudhakara st <sudhakara.st@gmail.com> wrote:

Hello Ashish,

Running the job with “-D mapreduce.job.reduces=<number>” sets a fixed number of reducers, and
that many reduce tasks will be spawned for the job.
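
For illustration, the Python sketch below assembles such a command line. The jar name, driver class and paths are hypothetical placeholders. It assumes the job's driver parses generic options (for example by running through ToolRunner), since otherwise -D is not picked up, and it places the generic option before the job's own arguments.

```python
import shlex


def hadoop_jar_command(jar, main_class, num_reduces, job_args):
    """Build the argument list for 'hadoop jar' with a fixed reducer count."""
    if num_reduces < 0:
        raise ValueError("number of reducers cannot be negative")
    return [
        "hadoop", "jar", jar, main_class,
        "-D", "mapreduce.job.reduces={}".format(num_reduces),
        *job_args,
    ]


# Hypothetical jar, class and HDFS paths, used only to show the shape.
cmd = hadoop_jar_command("wordcount.jar", "WordCount", 4, ["/input", "/output"])
print(shlex.join(cmd))
```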

On Thu, Jan 16, 2014 at 12:45 PM, Ashish Jain <ashjain2@gmail.com> wrote:

Dear All,

I have a 3-node cluster with a MapReduce job running on it. I have 8 data blocks spread
across all 3 nodes. While running the MapReduce job I can see 8 map tasks running, but
only 1 reduce task. Is there a way to configure multiple reduce tasks?





