hadoop-mapreduce-user mailing list archives

From sudhakara st <sudhakara...@gmail.com>
Subject Re: Doubts: Deployment and Configuration of YARN cluster
Date Wed, 15 Jan 2014 13:48:20 GMT
Hello Nirmal,

    No slave-specific config file changes are required. Even
{dfs.datanode.data.dir} does not need to change as long as every slave has
the same kind of mount points; if the mounts differ, then you have to edit
this variable on those specific slave nodes. Running heterogeneous
hardware among the slave nodes is not recommended; it certainly has a big
impact when you run MR jobs on Hadoop 1. I am not very clear on how the
ResourceManager handles it in Hadoop 2.
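
To make the per-node part concrete, a slave whose disks are mounted
differently carries its own hdfs-site.xml entry, and a smaller node can
advertise less memory to YARN in its local yarn-site.xml. For instance
(the mount paths and sizes below are only placeholders, not your actual
layout):

    <!-- hdfs-site.xml on a slave whose disks are mounted differently -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
    </property>

    <!-- yarn-site.xml on a 20GB node; a 40GB node would advertise 40960 -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>20480</value>
    </property>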
      Different values for {mapreduce.map.memory.mb} and
{mapreduce.reduce.memory.mb} across jobs or nodes will create long-tail
problems, inefficient use of resources, and starvation in the cluster.
Changes to {mapreduce.map.java.opts} and {mapreduce.reduce.java.opts} have
less impact, but the chance of task failure grows when your jobs are I/O
intensive and you allocate too little heap; over-allocating is also
wasteful, since memory gets reserved but never used and is then
unavailable to tasks that need it.
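
As a rough sketch (the sizes are only illustrative), keep the container
sizes uniform across the cluster and give the JVM heap some headroom
below the container limit in mapred-site.xml:

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <!-- roughly 80% of the container, leaving room for non-heap memory -->
      <value>-Xmx1638m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx3276m</value>
    </property>

The container count then falls out of the container size: a NodeManager
advertising 40960 MB can run at most 40960/2048 = 20 such map containers
(or 10 reduce containers) at a time.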


On Wed, Jan 15, 2014 at 6:51 PM, Nirmal Kumar <nirmal.kumar@impetus.co.in> wrote:

>  All,
>
>
>
> I am new to YARN and have certain doubts regarding the deployment and
> configuration of YARN on a cluster.
>
>
>
> As per my understanding, to deploy Hadoop 2.x using YARN on a cluster we
> need to distribute the files below to all the slave nodes in the cluster:
>
> ·         conf/core-site.xml
>
> ·         conf/hdfs-site.xml
>
> ·         conf/yarn-site.xml
>
> ·         conf/mapred-site.xml
>
>
>
> Also we need to ONLY change the following file on each slave node:
>
> ·         conf/hdfs-site.xml
>
> We need to set the {dfs.datanode.data.dir} value there.
>
>
>
> Do we need to change any other config file on the slave nodes?
>
> Can I change {yarn.nodemanager.resource.memory-mb} for each NM running on
> the slave nodes?
>
> This is because I might have a **heterogeneous environment**, i.e. different
> nodes with different memory and cores. NM1 might have 40GB of memory and
> another, say, 20GB.
>
>
>
> Also,
>
> {mapreduce.map.memory.mb}   specifies the **max. memory** allowed for a
> Hadoop map task container.
>
> {mapreduce.map.java.opts}         specifies the **max. heap space** of the
> allocated JVM. If a task exceeds the max heap size, the JVM throws an OOM.
>
> {mapreduce.reduce.memory.mb}
>
> {mapreduce.reduce.java.opts}
>
> Are the above properties applicable in general to all the map/reduce tasks
> (from different MapReduce applications) running on different slave
> nodes?
>
> Or can I change these for a particular slave node? E.g. say for SlaveNode1
> I run the map tasks with 4GB and for SlaveNode2 I run them with 8GB. Same
> with the reduce tasks.
>
>
>
> I need some understanding of how to **configure processing capacity** in
> the cluster: *Container Size, No. of Containers, No. of Mappers/Reducers*.
>
>
>
>
> Thanks,
>
> -Nirmal
>



-- 

Regards,
...Sudhakara.st
