hadoop-mapreduce-user mailing list archives

From Sunil Govind <sunil.gov...@gmail.com>
Subject Re: YARN cluster underutilization
Date Wed, 25 May 2016 15:54:30 GMT
Hi Jeff,

It looks like you are allocating more memory than needed for the AM
container. You most likely do not need 6 GB (per the log, it is using about
1.6 GB). Could you please provide some more information?

1. What type of MapReduce application (wordcount etc.) are you running? Some
AMs may be CPU intensive and some may not be, so based on the type of
application, memory/CPU can be tuned for better utilization.
2. How many mappers (reducers) are you trying to run here?
3. You have mentioned that each node has 8 cores and 15 GB, but how much is
actually configured for the NM?
4. Which scheduler are you using?
5. It's better to attach the RM log if possible.
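
For reference, here is a rough sketch of the container arithmetic behind
question 3, using only the figures quoted in your mail (20 core nodes, 11.25G
per node for YARN, 1440 MB map containers, a 6400 MB AM container). It assumes
the scheduler sizes containers by memory alone and does not round requests up;
both are assumptions that depend on the scheduler and its settings. The
function and constant names are only illustrative.

```python
# Rough capacity estimate using only the figures quoted in this thread.
# Assumption: containers are sized by memory alone and requests are not
# rounded up by the scheduler; verify both against your configuration.

NODES = 20                # core nodes in the cluster
NM_MEMORY_MB = 11520      # 11.25G per node available to YARN
MAP_CONTAINER_MB = 1440   # mapreduce.map.memory.mb
AM_CONTAINER_MB = 6400    # yarn.app.mapreduce.am.resource.mb


def map_slots(node_memory_mb: int, reserved_mb: int = 0) -> int:
    """Map containers that fit in a node's memory after a reservation."""
    return (node_memory_mb - reserved_mb) // MAP_CONTAINER_MB


plain_node = map_slots(NM_MEMORY_MB)                # node without the AM
am_node = map_slots(NM_MEMORY_MB, AM_CONTAINER_MB)  # node hosting the AM
cluster_total = plain_node * (NODES - 1) + am_node

print(f"map containers on a plain node: {plain_node}")
print(f"map containers on the AM node:  {am_node}")
print(f"cluster-wide upper bound:       {cluster_total}")
```

With these numbers a plain node fits 8 map containers, the node hosting the
6400 MB AM fits only 3, and the cluster-wide upper bound is 155. Comparing
that bound with the number of containers actually running should show how far
off the job is.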

Thanks
Sunil

On Wed, May 25, 2016 at 8:58 PM Guttadauro, Jeff <jeff.guttadauro@here.com>
wrote:

> Hi, all.
>
>
>
> I have an M/R (map-only) job that I’m running on a Hadoop 2.7.1 YARN
> cluster that is being quite underutilized (utilization of around 25-30%).
> The EMR cluster is 1 master + 20 core m3.xlarge nodes, which have 8 cores
> each and 15G total memory (with 11.25G of that available to YARN).  I’ve
> configured mapper memory with the following properties, which should allow
> for 8 containers running map tasks per node:
>
>
>
> <property><name>mapreduce.map.memory.mb</name><value>1440</value></property>
> <!-- Container size -->
>
> <property><name>mapreduce.map.java.opts</name><value>-Xmx1024m</value></property>
> <!-- JVM arguments for a Map task -->
>
>
>
> It was suggested that perhaps my AppMaster was having trouble keeping up
> with creating all the mapper containers and that I bulk up its resource
> allocation.  So I did, as shown below, providing it 6G container memory (5G
> task memory), 3 cores, and 60 task listener threads.
>
>
>
> <property><name>yarn.app.mapreduce.am.job.task.listener.thread-count</name><value>60</value></property>
> <!-- App Master task listener threads -->
>
> <property><name>yarn.app.mapreduce.am.resource.cpu-vcores</name><value>3</value></property>
> <!-- App Master container vcores -->
>
> <property><name>yarn.app.mapreduce.am.resource.mb</name><value>6400</value></property>
> <!-- App Master container size -->
>
> <property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx5120m</value></property>
> <!-- JVM arguments for each Application Master -->
>
>
>
> Taking a look at the node on which the AppMaster is running, I'm seeing
> plenty of CPU idle time and free memory, yet there are still nodes with no
> utilization (0 running containers).  The log indicates that the AppMaster
> has way more memory (physical/virtual) than it appears to need with
> repeated log messages like this:
>
>
>
> 2016-05-25 13:59:04,615 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
> (Container Monitor): Memory usage of ProcessTree 11265 for container-id
> container_1464122327865_0002_01_000001: 1.6 GB of 6.3 GB physical memory
> used; 6.1 GB of 31.3 GB virtual memory used
>
>
>
> Can you please help me figure out where to go from here to troubleshoot,
> or any other things to try?
>
>
>
> Thanks!
>
> -Jeff
>
>
>
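
To track how much of its allocation the AM container really uses over time,
the ContainersMonitorImpl lines can be pulled out of the NM log. Below is a
minimal sketch that parses the line quoted above; it assumes sizes are always
reported in GB as in that line, and the helper name is only illustrative.

```python
import re

# Log line copied from the quoted message (joined onto one line).
LOG_LINE = (
    "2016-05-25 13:59:04,615 INFO "
    "org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor."
    "ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree "
    "11265 for container-id container_1464122327865_0002_01_000001: "
    "1.6 GB of 6.3 GB physical memory used; 6.1 GB of 31.3 GB virtual memory used"
)

# Assumption: sizes are always reported in GB, as in the line above.
USAGE_RE = re.compile(
    r"container-id (?P<cid>\S+): "
    r"(?P<p_used>[\d.]+) GB of (?P<p_limit>[\d.]+) GB physical memory used; "
    r"(?P<v_used>[\d.]+) GB of (?P<v_limit>[\d.]+) GB virtual memory used"
)


def physical_usage(line: str):
    """Return (container id, used GB, limit GB, used fraction) or None."""
    m = USAGE_RE.search(line)
    if m is None:
        return None
    used, limit = float(m["p_used"]), float(m["p_limit"])
    return m["cid"], used, limit, used / limit


print(physical_usage(LOG_LINE))
```

For the quoted line the used fraction is 1.6 / 6.3, roughly a quarter of the
AM container's physical memory limit.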
