hadoop-common-user mailing list archives

From "Guttadauro, Jeff" <jeff.guttada...@here.com>
Subject RE: YARN cluster underutilization
Date Wed, 25 May 2016 16:29:29 GMT
Thanks for your thoughts thus far, Sunil.  Most grateful for any additional help you or others
can offer.  To answer your questions,

1. This is a custom M/R job, which uses mappers only (no reduce phase) to process GPS
probe data and filter based on inclusion within a provided polygon.  There is actually a lot
of upfront work done in the driver to make that task as simple as can be (it identifies a list
of tiles that are completely inside the polygon and those that fall across an edge, for which
more processing would be needed), but the job would still be more compute-intensive than wordcount,
for example.

2. I’m running almost 84k mappers for this job.  This is actually down from ~600k
mappers, since one other thing I’ve done is increase mapreduce.input.fileinputformat.split.minsize
to 536870912 (512M) for the job.  Data is in S3, so loss of locality isn’t really a concern.

3. For NodeManager configuration, I’m using EMR’s default configuration for the
m3.xlarge instance type, which is yarn.scheduler.minimum-allocation-mb=32, yarn.scheduler.maximum-allocation-mb=11520,
and yarn.nodemanager.resource.memory-mb=11520.  The YARN dashboard shows min/max allocations of
<memory:32, vCores:1>/<memory:11520, vCores:8>.

4. Capacity Scheduler [MEMORY]

5. I’ve attached 2500 lines from the RM log.  Happy to grab more, but the logs are pretty
big, and I thought that might be sufficient.
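
The figures in items 2 and 3 can be sanity-checked with a little arithmetic. The sketch below is
illustrative only: the property values are copied from the message, the variable names are made up
for this example, and 600,000 and 84,000 are rounded from the "~600k" and "almost 84k" quoted above.

```python
# Values quoted in the message above; variable names are illustrative only.
MIB = 1024 * 1024

split_minsize_bytes = 536870912   # mapreduce.input.fileinputformat.split.minsize
nm_memory_mb = 11520              # yarn.nodemanager.resource.memory-mb

# 536870912 bytes is exactly 512 MiB.
split_minsize_mib = split_minsize_bytes // MIB

# 11520 MB corresponds to the "11.25G available to YARN" in the original post.
nm_memory_gib = nm_memory_mb / 1024

# Assumption: rounded mapper counts before and after raising the minimum split size.
mappers_before = 600_000
mappers_after = 84_000
reduction_factor = mappers_before / mappers_after

print(split_minsize_mib)           # 512
print(nm_memory_gib)               # 11.25
print(round(reduction_factor, 1))  # 7.1
```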

Any guidance is much appreciated!

From: Sunil Govind [mailto:sunil.govind@gmail.com]
Sent: Wednesday, May 25, 2016 10:55 AM
To: Guttadauro, Jeff <jeff.guttadauro@here.com>; user@hadoop.apache.org
Subject: Re: YARN cluster underutilization

Hi Jeff,

It looks like you are allocating more memory than necessary for the AM container. You most likely
do not need 6 GB (as per the log). Could you please provide some more information?

1. What type of mapreduce application (wordcount etc) are you running? Some AMs may be CPU
intensive and some may not be. So based on the type of application, memory/cpu can be tuned for
better utilization.
2. How many mappers (reducers) are you trying to run here?
3. You have mentioned that each node has 8 cores and 15GB, but how much is actually configured
for NM?
4. Which scheduler are you using?
5. It's better to attach the RM log if possible.


On Wed, May 25, 2016 at 8:58 PM Guttadauro, Jeff <jeff.guttadauro@here.com> wrote:
Hi, all.

I have an M/R (map-only) job that I’m running on a Hadoop 2.7.1 YARN cluster that is being
quite underutilized (utilization of around 25-30%).  The EMR cluster is 1 master + 20 core
m3.xlarge nodes, which have 8 cores each and 15G total memory (with 11.25G of that available
to YARN).  I’ve configured mapper memory with the following properties, which should allow
for 8 containers running map tasks per node:

  <!-- Container size -->
 <!-- JVM arguments for a Map task -->
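
The mapper property values did not survive in this archived copy, so the actual container size is
not known here. As a rough sketch, assuming containers are placed by memory only and ignoring the
ApplicationMaster's own container, the following shows the largest mapper container that still fits
eight per node and how many "waves" of containers the job would then need. `MAPPER_COUNT` is rounded
from "almost 84k", and the function names are hypothetical.

```python
import math

NODE_MEMORY_MB = 11520   # yarn.nodemanager.resource.memory-mb per node
CORE_NODES = 20          # core nodes in the EMR cluster
TARGET_PER_NODE = 8      # desired concurrent map containers per node
MAPPER_COUNT = 84_000    # assumption: rounded from "almost 84k"


def max_container_mb(node_memory_mb: int, containers: int) -> int:
    """Largest container size (MB) that lets `containers` fit on one node."""
    return node_memory_mb // containers


def containers_that_fit(node_memory_mb: int, container_mb: int) -> int:
    """How many containers of `container_mb` fit in a node's YARN memory."""
    return node_memory_mb // container_mb


container_mb = max_container_mb(NODE_MEMORY_MB, TARGET_PER_NODE)
cluster_slots = CORE_NODES * containers_that_fit(NODE_MEMORY_MB, container_mb)
waves = math.ceil(MAPPER_COUNT / cluster_slots)

print(container_mb)   # 1440
print(cluster_slots)  # 160
print(waves)          # 525

# A slightly larger container already drops a node from 8 slots to 7.
print(containers_that_fit(NODE_MEMORY_MB, 1536))  # 7
```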

It was suggested that perhaps my AppMaster was having trouble keeping up with creating all
the mapper containers and that I bulk up its resource allocation.  So I did, as shown below,
providing it 6G container memory (5G task memory), 3 cores, and 60 task listener threads.

 <!-- App Master task listener threads -->
 <!-- App Master container vcores -->
 <!-- App Master container size -->
 <!-- JVM arguments for each Application Master -->
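
To estimate what the enlarged ApplicationMaster costs the node that hosts it, here is a rough
sketch. It assumes "6G" means 6144 MB, that only memory is considered when placing containers, and
a hypothetical 1440 MB mapper container (the real mapper size is missing from this archived copy).

```python
NODE_MEMORY_MB = 11520        # yarn.nodemanager.resource.memory-mb
AM_CONTAINER_MB = 6 * 1024    # assumption: "6G" taken as 6144 MB
MAPPER_CONTAINER_MB = 1440    # hypothetical mapper container size


def mappers_beside_am(node_mb: int, am_mb: int, mapper_mb: int) -> int:
    """Map containers that still fit on the node hosting the AM."""
    return (node_mb - am_mb) // mapper_mb


remaining_mb = NODE_MEMORY_MB - AM_CONTAINER_MB
print(remaining_mb)  # 5376
print(mappers_beside_am(NODE_MEMORY_MB, AM_CONTAINER_MB, MAPPER_CONTAINER_MB))  # 3
```

Under these assumptions the AM's node holds three mappers instead of eight, which affects only one
of the 20 core nodes.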

Taking a look at the node on which the AppMaster is running, I'm seeing plenty of CPU idle
time and free memory, yet there are still nodes with no utilization (0 running containers).
 The log indicates that the AppMaster has way more memory (physical/virtual) than it appears
to need with repeated log messages like this:

2016-05-25 13:59:04,615 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
(Container Monitor): Memory usage of ProcessTree 11265 for container-id container_1464122327865_0002_01_000001:
1.6 GB of 6.3 GB physical memory used; 6.1 GB of 31.3 GB virtual memory used
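
For reading many such lines out of a NodeManager log, a small parser can help. This is a minimal
sketch written against the single log line quoted above: the regular expression and the function
name are this example's own, and it only handles values reported in GB.

```python
import re

# The log message quoted above, joined back into a single line.
LOG_LINE = (
    "2016-05-25 13:59:04,615 INFO org.apache.hadoop.yarn.server.nodemanager."
    "containermanager.monitor.ContainersMonitorImpl (Container Monitor): "
    "Memory usage of ProcessTree 11265 for container-id "
    "container_1464122327865_0002_01_000001: "
    "1.6 GB of 6.3 GB physical memory used; 6.1 GB of 31.3 GB virtual memory used"
)

PATTERN = re.compile(
    r"for container-id (?P<container>\S+): "
    r"(?P<pmem_used>[\d.]+) GB of (?P<pmem_limit>[\d.]+) GB physical memory used; "
    r"(?P<vmem_used>[\d.]+) GB of (?P<vmem_limit>[\d.]+) GB virtual memory used"
)


def parse_memory_usage(line):
    """Return container id and memory figures (GB) from a monitor line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    container = fields.pop("container")
    usage = {name: float(value) for name, value in fields.items()}
    usage["container"] = container
    return usage


usage = parse_memory_usage(LOG_LINE)
pmem_fraction = usage["pmem_used"] / usage["pmem_limit"]
vmem_to_pmem_limit = usage["vmem_limit"] / usage["pmem_limit"]

print(usage["container"])
print(f"physical memory in use: {pmem_fraction:.0%}")
print(f"virtual/physical limit ratio: {vmem_to_pmem_limit:.2f}")
```

For the quoted line, the AM is using roughly a quarter of its physical memory limit.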

Can you please help me figure out where to go from here to troubleshoot, or any other things
to try?

