hadoop-common-user mailing list archives

From "Hider, Sandy" <Sandy.Hi...@jhuapl.edu>
Subject Identification of mapper slots
Date Mon, 14 Oct 2013 21:49:30 GMT

In Hadoop, mapred-site.xml lets me set the maximum number of mappers that run concurrently
on a node. For the sake of this email I will call each of those concurrent positions a mapper slot.

Is it possible to figure out from within the mapper which mapper slot it is running in?

This matters on this project because each mapper has to fork off a compiled Matlab runtime
executable. At runtime the executable is given a cache directory to work in. Setting up the
cache in a new directory takes a long time, but later calls are fast if they are given the
same cache location. As it turns out, when multiple mappers try to use the same cache at the
same time, they crash the executable. So ideally, if I could identify which mapper slot a
mapper is running in, I could set up one cache per slot, avoid the cache creation time on
later runs, and still guarantee that no two mappers write to the same cache.
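
A minimal sketch of one workaround that does not need a slot number at all: each mapper
tries to take an exclusive file lock on one of N lock files and uses the cache directory
that belongs to the first lock it wins. Everything named here (SlotCache, claim, the
cache-N directories, baseDir, maxSlots) is hypothetical and not part of Hadoop. The sketch
assumes baseDir is on the node's local disk, that each mapper runs in its own JVM and
claims only once, and that maxSlots matches the configured maximum number of concurrent
mappers.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: hands out one of N per-node cache directories at a time. */
public class SlotCache {

    /** A claimed cache directory; close() gives it back. */
    public static final class Claim implements AutoCloseable {
        public final int slot;
        public final Path cacheDir;
        private final FileChannel channel;
        private final FileLock lock;

        Claim(int slot, Path cacheDir, FileChannel channel, FileLock lock) {
            this.slot = slot;
            this.cacheDir = cacheDir;
            this.channel = channel;
            this.lock = lock;
        }

        @Override
        public void close() throws IOException {
            lock.release();
            channel.close();
        }
    }

    /**
     * Tries cache-0 .. cache-(maxSlots-1) in order and returns the first one whose
     * lock file is free, or null if every cache is currently in use.
     */
    public static Claim claim(Path baseDir, int maxSlots) throws IOException {
        for (int slot = 0; slot < maxSlots; slot++) {
            Path cacheDir = baseDir.resolve("cache-" + slot);
            Files.createDirectories(cacheDir); // no-op if it already exists
            Path lockFile = baseDir.resolve("cache-" + slot + ".lock");
            FileChannel channel = FileChannel.open(
                    lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            FileLock lock = null;
            try {
                // Non-blocking: returns null if another process holds the lock.
                lock = channel.tryLock();
            } catch (OverlappingFileLockException e) {
                // The lock is already held inside this same JVM; treat it as busy.
            }
            if (lock != null) {
                return new Claim(slot, cacheDir, channel, lock);
            }
            channel.close();
        }
        return null;
    }
}
```

A mapper could call SlotCache.claim(...) during setup, pass claim.cacheDir to the
executable, and call close() during cleanup. The operating system drops the lock if the
mapper process dies, so a crashed mapper should not leave a cache permanently reserved.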

Thanks for taking the time to read this,

Sandy


