hadoop-mapreduce-user mailing list archives

From Keith Wiley <kwi...@keithwiley.com>
Subject Re: Force one mapper per machine (not core)?
Date Tue, 28 Jan 2014 23:41:22 GMT
Yeah, it isn't, not even remotely, but thanks.

On Jan 28, 2014, at 14:06 , Bryan Beaudreault wrote:

> If this cluster is being used exclusively for this goal, you could just set the mapred.tasktracker.map.tasks.maximum to 1.
> On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
> I'm running a program which in the streaming layer automatically multithreads, and does
> so by detecting the number of cores on the machine. I realize this model is somewhat in
> conflict with Hadoop, but nonetheless, that's what I'm doing. Thus, for even resource
> utilization, it would be nice to assign not just one mapper per core, but only one mapper
> per machine. I realize that if I saturate the cluster none of this really matters, but
> consider the following example for clarity: 4-core nodes, 10-node cluster, thus 40 slots,
> fully configured across mappers and reducers (40 slots of each). Say I run this program
> with just two mappers. It would run much more efficiently (in essentially half the time)
> if I could force the two mappers to go to slots on two separate machines instead of
> running the risk that Hadoop may assign them both to the same machine.
> Can this be done?
> Thanks.
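
For reference, the property Bryan mentions is a per-TaskTracker setting in classic (MR1) Hadoop. A minimal sketch of how it might look in mapred-site.xml on each worker node, assuming a dedicated cluster where limiting every node to a single map slot is acceptable. The file placement and the TaskTracker restart are assumptions based on usual MR1 practice, not something stated in this thread:

```xml
<!-- Sketch: goes inside <configuration> in mapred-site.xml on each TaskTracker node. -->
<!-- Assumption: the TaskTracker is restarted afterwards so the new slot count takes effect. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```

Because this caps the map slots for every job that runs on the node, it only suits a cluster dedicated to this workload, which is the condition Bryan attaches to the suggestion.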

Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
