hadoop-hdfs-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: Force one mapper per machine (not core)?
Date Tue, 28 Jan 2014 22:06:47 GMT
If this cluster is being used exclusively for this job, you could just set
mapred.tasktracker.map.tasks.maximum to 1.
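In MRv1 that property is a per-TaskTracker setting, so it would go in mapred-site.xml on each worker node (not in the job configuration), something like:

```xml
<!-- mapred-site.xml on each TaskTracker node (MRv1) -->
<!-- Caps this node at one concurrent map slot, so at most one
     map task runs per machine regardless of core count. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```

Note this requires restarting the TaskTrackers to take effect, and it caps the whole node for every job, which is why it only makes sense if the cluster is dedicated to this workload.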

On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley <kwiley@keithwiley.com> wrote:

> I'm running a program which in the streaming layer automatically
> multithreads and does so by automatically detecting the number of cores on
> the machine.  I realize this model is somewhat in conflict with Hadoop, but
> nonetheless, that's what I'm doing.  Thus, for even resource utilization,
> it would be nice to not only assign one mapper per core, but only one
> mapper per machine.  I realize that if I saturate the cluster none of this
> really matters, but consider the following example for clarity: 4-core
> nodes, 10-node cluster, thus 40 slots, fully configured across mappers and
> reducers (40 slots of each).  Say I run this program with just two mappers.
>  It would run much more efficiently (in essentially half the time) if I
> could force the two mappers to go to slots on two separate machines instead
> of running the risk that Hadoop may assign them both to the same machine.
> Can this be done?
> Thanks.
> ________________________________________________________________________________
> Keith Wiley     kwiley@keithwiley.com     keithwiley.com
> music.keithwiley.com
> "Yet mark his perfect self-contentment, and hence learn his lesson, that
> to be
> self-contented is to be vile and ignorant, and that to aspire is better
> than to
> be blindly and impotently happy."
>                                            --  Edwin A. Abbott, Flatland
> ________________________________________________________________________________
