hadoop-mapreduce-user mailing list archives

From "Naganarasimha G R (Naga)" <garlanaganarasi...@huawei.com>
Subject RE: Questions with regards to Yarn/Hadoop
Date Tue, 25 Aug 2015 06:09:08 GMT
Hi Omid,
It seems the machine that was running slow might also be hosting the AM container, with
possibly 2 GB assigned to it.
Can you share the following details:
* The memory configuration of the AM container.
* Which containers are running on the idling machine, and whether it is always the same machine
or a different one each time. If it is always the same machine, are any other processes running on it?
* The job counters for both runs, which will also provide useful information.
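
To make the arithmetic behind this guess concrete, here is a rough Python sketch of how many
containers fit on one node. It is not taken from the cluster in question: the 3072 MB node size
is a made-up figure standing in for whatever yarn.nodemanager.resource.memory-mb is set to, the
1536 MB AM request is what I believe the default of yarn.app.mapreduce.am.resource.mb to be, and
the rounding to a multiple of a 1024 MB minimum allocation is an assumption about the scheduler
settings. The helper names are hypothetical.

```python
import math


def round_up(request_mb, min_alloc_mb=1024):
    """Round a container request up to a multiple of the minimum allocation (assumed 1024 MB)."""
    return math.ceil(request_mb / min_alloc_mb) * min_alloc_mb


def containers_that_fit(node_mb, container_mb, reserved_mb=0, min_alloc_mb=1024):
    """Count containers of container_mb that fit on a node once reserved_mb is already taken."""
    size = round_up(container_mb, min_alloc_mb)
    return max(0, (node_mb - reserved_mb) // size)


# Hypothetical node that offers 3072 MB to YARN (not a value from the original post).
node_mb = 3072
am_mb = round_up(1536)  # a 1536 MB AM request becomes a 2048 MB container

print("mappers without AM :", containers_that_fit(node_mb, 1024))
print("reducers without AM:", containers_that_fit(node_mb, 2048))
print("mappers with AM    :", containers_that_fit(node_mb, 1024, reserved_mb=am_mb))
print("reducers with AM   :", containers_that_fit(node_mb, 2048, reserved_mb=am_mb))
```

Under these assumed numbers, a node that hosts the AM has room for one 1 GB mapper and no 2 GB
reducer, which is why the AM memory setting and the per-node memory limit are worth checking.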

+ Naga

From: Omid Alipourfard [alipourf@usc.edu]
Sent: Tuesday, August 25, 2015 07:23
To: user@hadoop.apache.org
Subject: Questions with regards to Yarn/Hadoop


I am running the Terasort benchmark (10 GB, 25 reducers, 50 mappers) that comes with Hadoop
2.7.1.  I am seeing unexpected behavior with Yarn, which I am hoping someone can shed some
light on:

I have a cluster of three machines with 2 cores and 3.75 GB of RAM (per machine). When I run
the Terasort job, one of the machines is idling, i.e., it is not using any substantial disk
or CPU.  All three machines are capable of executing jobs, and one of the machines is both
a name node and a data node.

On the other hand, running the same job on a cluster of three machines with 2 cores and 8
GB of RAM (per machine) utilizes all the machines.

Both setups use the same Hadoop configuration files; in both of them, mapper tasks have
1 GB and reducer tasks have 2 GB of memory.

I am guessing Yarn is not utilizing the machines correctly -- maybe because of the available
amount of RAM, but I am not sure how to verify this.
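
One possible way to check this, sketched in Python under some assumptions: the ResourceManager
exposes a REST endpoint, /ws/v1/cluster/nodes, that reports per-node container counts and memory.
The field names used below (nodeHostName, numContainers, usedMemoryMB, availMemoryMB) are as I
recall them from the ResourceManager REST API documentation and should be checked against the
2.7.1 docs. The helper names, the host names, and the sample payload are made up for illustration
and are not output from either cluster.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_base_url):
    """Fetch the node report from the ResourceManager, e.g. rm_base_url='http://<rm-host>:8088'."""
    with urlopen(rm_base_url.rstrip("/") + "/ws/v1/cluster/nodes") as resp:
        return json.load(resp)


def summarize_nodes(payload):
    """Return (host, containers, used MB, available MB) for each node in the report."""
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [
        (n["nodeHostName"], n["numContainers"], n["usedMemoryMB"], n["availMemoryMB"])
        for n in nodes
    ]


def idle_nodes(payload):
    """Hosts that currently run no containers at all."""
    return [host for host, containers, _, _ in summarize_nodes(payload) if containers == 0]


# Made-up payload in the assumed response shape, only to show how the helpers are used.
sample = {
    "nodes": {
        "node": [
            {"nodeHostName": "node-a", "numContainers": 3, "usedMemoryMB": 3072, "availMemoryMB": 0},
            {"nodeHostName": "node-b", "numContainers": 2, "usedMemoryMB": 3072, "availMemoryMB": 0},
            {"nodeHostName": "node-c", "numContainers": 0, "usedMemoryMB": 0, "availMemoryMB": 3072},
        ]
    }
}

for row in summarize_nodes(sample):
    print(row)
print("idle:", idle_nodes(sample))
```

Polling fetch_nodes while the job runs and comparing the two clusters would show whether the
idle machine is never offered containers or simply has no memory left for a 1 GB or 2 GB request.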

Any thoughts on what the problem might be, or how to verify it, are appreciated.

P.S. I can also post any of the logs or configuration files.
