hadoop-common-user mailing list archives

From Keith Wiley <kwi...@keithwiley.com>
Subject Force one mapper per machine (not core)?
Date Tue, 28 Jan 2014 22:00:22 GMT
I'm running a program which, in the streaming layer, automatically multithreads by detecting
the number of cores on the machine.  I realize this model is somewhat in conflict with Hadoop,
but nonetheless, that's what I'm doing.  Thus, for even resource utilization, it would be nice
to assign not just one mapper per core, but only one mapper per machine.  I realize that if I
saturate the cluster none of this really matters, but consider the following example for
clarity: 4-core nodes and a 10-node cluster, thus 40 slots, fully configured across mappers
and reducers (40 slots of each).  Say I run this program with just two mappers.  It would run
much more efficiently (in essentially half the time) if I could force the two mappers onto
slots on two separate machines instead of running the risk that Hadoop assigns them both to
the same machine.

Can this be done?
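(For what it's worth, the only approach I'm aware of on classic MRv1 is a cluster-wide one,
not a per-job one: cap each TaskTracker at a single map slot in mapred-site.xml.  A sketch,
assuming Hadoop 1.x slot configuration; it requires restarting the TaskTrackers and would
affect every job on the cluster, which may not be acceptable:

```xml
<!-- mapred-site.xml on each TaskTracker node -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- one map slot per machine instead of the per-core default -->
  <value>1</value>
</property>
```

What I'd really like is a per-job way to get the same placement without reconfiguring the
whole cluster.)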


Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"Yet mark his perfect self-contentment, and hence learn his lesson, that to be
self-contented is to be vile and ignorant, and that to aspire is better than to
be blindly and impotently happy."
                                           --  Edwin A. Abbott, Flatland
