hadoop-user mailing list archives

From Nirmal Kumar <nirmal.ku...@impetus.co.in>
Subject RE: Doubts: Deployment and Configuration of YARN cluster
Date Thu, 16 Jan 2014 07:03:01 GMT
Hi German,

I went through the links on memory configuration settings and best practices.
They assume the cluster is homogeneous, i.e. the same RAM size on all the nodes.

Also, in the YARN whitepaper (Section 3.2, page 6) I see:
This resource model serves current applications well
in homogeneous environments, but we expect it to
evolve over time as the ecosystem matures and new requirements
emerge.

Does that mean that in YARN, in order to configure processing capacity (container size, number
of containers, number of mappers/reducers), the cluster has to be homogeneous?
What if I have a *heterogeneous cluster* with varying RAM, disks, and cores?

Thanks,
-Nirmal

From: Nirmal Kumar
Sent: Wednesday, January 15, 2014 8:22 PM
To: user@hadoop.apache.org
Subject: RE: Doubts: Deployment and Configuration of YARN cluster

Thanks a lot, German.

I will go through the links and see whether they answer my questions.

-Nirmal

From: German Florez-Larrahondo [mailto:german.fl@samsung.com]
Sent: Wednesday, January 15, 2014 7:20 PM
To: user@hadoop.apache.org<mailto:user@hadoop.apache.org>
Subject: RE: Doubts: Deployment and Configuration of YARN cluster

Nirmal

-A good summary of memory configuration settings and best practices can be found at the link below.
Note that in YARN, the way you configure resource limits dictates the number of containers on
each node and in the cluster:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html

-A good intro to YARN configuration is this:
http://www.thecloudavenue.com/2012/01/getting-started-with-nextgen-mapreduce_11.html
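
To make the point about resource limits dictating the number of containers concrete, here is a
minimal Python sketch. It is only a back-of-the-envelope estimate, not how the ResourceManager
actually schedules: it assumes memory is the only limiting resource, that the scheduler rounds
each request up to a multiple of yarn.scheduler.minimum-allocation-mb, and that node_memory_mb
is the value of yarn.nodemanager.resource.memory-mb on that node. The function name and the
example sizes are hypothetical.

```python
def containers_per_node(node_memory_mb: int, container_mb: int,
                        min_allocation_mb: int = 1024) -> int:
    """Rough estimate of how many containers of one size fit on a node.

    Assumption: requests are rounded up to a multiple of the scheduler's
    minimum allocation, and only memory is considered (no vcores, no overhead).
    """
    if min(node_memory_mb, container_mb, min_allocation_mb) <= 0:
        raise ValueError("all sizes must be positive")
    # Round the request up to the next multiple of the minimum allocation.
    granted_mb = -(-container_mb // min_allocation_mb) * min_allocation_mb
    return node_memory_mb // granted_mb


# Hypothetical heterogeneous cluster: one 40 GB node and one 20 GB node.
nodes_mb = {"nm1": 40 * 1024, "nm2": 20 * 1024}
per_node = {host: containers_per_node(mb, 4096) for host, mb in nodes_mb.items()}
cluster_total = sum(per_node.values())
```

With these assumed sizes, the larger node simply hosts more 4 GB containers than the smaller
one, and the cluster total is the sum over the nodes.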

Regards
.g



From: Nirmal Kumar [mailto:nirmal.kumar@impetus.co.in]
Sent: Wednesday, January 15, 2014 7:22 AM
To: user@hadoop.apache.org<mailto:user@hadoop.apache.org>
Subject: Doubts: Deployment and Configuration of YARN cluster

All,

I am new to YARN and have some questions about deploying and configuring YARN on a cluster.

As I understand it, to deploy Hadoop 2.x with YARN on a cluster we need to distribute
the files below to all the slave nodes in the cluster:

* conf/core-site.xml
* conf/hdfs-site.xml
* conf/yarn-site.xml
* conf/mapred-site.xml

Also, we ONLY need to change the following file on each slave node:

* conf/hdfs-site.xml
  (to set the {dfs.datanode.data.dir} value)

Do we need to change any other config file on the slave nodes?
Can I change {yarn.nodemanager.resource.memory-mb} for each NM running on the slave nodes?
I ask because I might have a *heterogeneous environment*, i.e. different nodes with different
memory and cores. For example, NM1 might have 40 GB of memory and another node, say, 20 GB.
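
As far as I know, each NodeManager reads {yarn.nodemanager.resource.memory-mb} from its own
yarn-site.xml and reports that capacity to the ResourceManager, so the value can differ per
node. The Python sketch below shows one way to generate a per-node fragment. The helper name,
the host names and the vcore counts are hypothetical, and the output is only the two
resource properties, not a complete yarn-site.xml.

```python
import xml.etree.ElementTree as ET


def yarn_site_fragment(memory_mb: int, vcores: int) -> str:
    """Build a <configuration> fragment with one node's resource limits."""
    config = ET.Element("configuration")
    settings = (
        ("yarn.nodemanager.resource.memory-mb", memory_mb),
        ("yarn.nodemanager.resource.cpu-vcores", vcores),
    )
    for name, value in settings:
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(config, encoding="unicode")


# Hypothetical nodes: (memory in MB, vcores) given to YARN on each host.
node_resources = {"nm1": (40 * 1024, 16), "nm2": (20 * 1024, 8)}
fragments = {host: yarn_site_fragment(mb, cores)
             for host, (mb, cores) in node_resources.items()}
```

Each fragment would then be merged into the yarn-site.xml of the corresponding host, while the
rest of the file stays identical across the cluster.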

Also,
{mapreduce.map.memory.mb} specifies the *max. virtual memory* allowed for a Hadoop task subprocess.
{mapreduce.map.java.opts} specifies the *max. heap space* of the allocated JVM. If you
exceed the max heap size, the JVM throws an OOM.
{mapreduce.reduce.memory.mb}
{mapreduce.reduce.java.opts}
Are the above properties applicable to all map/reduce tasks (from different MapReduce
applications) in general, running on different slave nodes?
Or can I change them for a particular slave node? E.g. on SlaveNode1 I run the
map task with 4 GB and on SlaveNode2 I run the map task with 8 GB, and likewise for the reduce
task.
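
For what it is worth, these four properties are job-level settings: they travel with the job
configuration and apply to that job's tasks on whichever node they are scheduled, so they are
not tuned per slave node. A minimal Python sketch of keeping the JVM heap below the container
size follows. The 0.8 heap fraction is an assumed rule of thumb, not a Hadoop default, and the
function name is hypothetical.

```python
def task_memory_settings(map_mb: int, reduce_mb: int,
                         heap_fraction: float = 0.8) -> dict:
    """Derive per-job memory properties, keeping the heap below the container size.

    Assumption: heap_fraction leaves headroom for non-heap JVM memory.
    """
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")

    def xmx(container_mb: int) -> str:
        # Heap size in MB, truncated to a whole number.
        return f"-Xmx{int(container_mb * heap_fraction)}m"

    return {
        "mapreduce.map.memory.mb": str(map_mb),
        "mapreduce.map.java.opts": xmx(map_mb),
        "mapreduce.reduce.memory.mb": str(reduce_mb),
        "mapreduce.reduce.java.opts": xmx(reduce_mb),
    }


# Example: 4 GB map containers and 8 GB reduce containers for one job.
settings = task_memory_settings(4096, 8192)
```

The resulting dictionary could be passed as -D options when submitting a job, or used as
cluster-wide defaults in mapred-site.xml.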

I would like to understand how to *configure processing capacity* in the cluster: container
size, number of containers, and number of mappers/reducers.

Thanks,
-Nirmal

________________________________

NOTE: This message may contain information that is confidential, proprietary, privileged or
otherwise protected by law. The message is intended solely for the named addressee. If received
in error, please destroy and notify the sender. Any use of this email is prohibited when received
in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this
communication has been maintained nor that the communication is free of errors, virus, interception
or interference.
