hadoop-hdfs-user mailing list archives

From Keith Wiley <kwi...@keithwiley.com>
Subject Re: Force one mapper per machine (not core)?
Date Fri, 31 Jan 2014 18:51:42 GMT
Hmmm, okay.  I know it's running CDH4 4.4.0, but as for whether it was specifically configured
with MR1 or MR2 (is there a distinction between MR2 and YARN?), I'm not absolutely certain.
I know that the cluster "behaves" like the MR1 clusters I've worked with for years (I interact
with the job tracker in the classic way, for example).  Can I tell whether it's MR1 or MR2
from the job tracker or namenode web UIs?


On Jan 29, 2014, at 00:52 , Harsh J wrote:

> Is your cluster running MR1 or MR2? On MR1, the CapacityScheduler
> would allow you to do this if you used appropriate memory based
> requests (see http://search-hadoop.com/m/gnFs91yIg1e), and on MR2
> (depending on the YARN scheduler resource request limits config) you
> can request your job be run with the maximum-most requests that would
> soak up all provided resources (of CPU and Memory) of a node such that
> only one container runs on a host at any given time.
> On Wed, Jan 29, 2014 at 3:30 AM, Keith Wiley <kwiley@keithwiley.com> wrote:
>> I'm running a program which in the streaming layer automatically multithreads and
>> does so by automatically detecting the number of cores on the machine.  I realize this model
>> is somewhat in conflict with Hadoop, but nonetheless, that's what I'm doing.  Thus, for even
>> resource utilization, it would be nice to assign not one mapper per core, but only one
>> mapper per machine.  I realize that if I saturate the cluster none of this really matters,
>> but consider the following example for clarity: 4-core nodes, 10-node cluster, thus 40 slots,
>> fully configured across mappers and reducers (40 slots of each).  Say I run this program with
>> just two mappers.  It would run much more efficiently (in essentially half the time) if I
>> could force the two mappers to go to slots on two separate machines instead of running the
>> risk that Hadoop may assign them both to the same machine.
>> Can this be done?
>> Thanks.
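
For concreteness, here is a rough Python sketch of the arithmetic behind the two points quoted above: the "half the time" example, and the idea of sizing each mapper's request so that only one fits on a node. It is only an illustration under stated assumptions. The 8192 MB node size, the 1024 MB request, the 400 core-seconds of work, and both function names are hypothetical; the sketch assumes the scheduler packs containers by memory alone and that each mapper is CPU-bound with one thread per core. On MR2 the per-mapper request would typically be set through mapreduce.map.memory.mb, subject to the cluster's scheduler limits.

```python
def containers_per_node(node_mb: int, request_mb: int) -> int:
    """Containers of request_mb that fit in a node offering node_mb (memory only)."""
    return node_mb // request_mb


def job_time(work_core_seconds: float, cores: int, mappers_on_node: int) -> float:
    """Wall-clock time for one mapper when mappers_on_node mappers share the node.

    Assumes each mapper starts one thread per core and is CPU-bound, so
    co-located mappers split the node's cores evenly.
    """
    effective_cores = cores / mappers_on_node
    return work_core_seconds / effective_cores


NODE_MB = 8192   # hypothetical memory each node offers to containers
CORES = 4        # 4-core nodes, as in the example above
WORK = 400.0     # hypothetical work per mapper, in core-seconds

# Small requests: several mappers could be placed on the same node.
print(containers_per_node(NODE_MB, 1024))

# Requests sized to the whole node: at most one mapper per node.
print(containers_per_node(NODE_MB, NODE_MB))

# Two mappers on separate nodes vs. both on one node.
separate = job_time(WORK, CORES, mappers_on_node=1)
shared = job_time(WORK, CORES, mappers_on_node=2)
print(separate, shared)
```

Under these assumptions the shared placement takes twice as long as the separate one, which matches the "essentially half the time" estimate in the original question.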

Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
