hadoop-user mailing list archives

From "David Parks" <davidpark...@yahoo.com>
Subject What does mapred.map.tasksperslot do?
Date Thu, 27 Dec 2012 08:21:18 GMT
I didn't come up with much in a Google search.


In particular, what are the side effects of changing this setting? Memory?
Sort process?


I'm guessing it means that it'll feed 2 input splits to each map task, where
a map task is in turn a self-contained JVM which consumes one map slot.


Thus 4 map slots and 2 tasksperslot means 4 map task JVMs, each of which
processes 2 input splits at a time.
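
To make that guess concrete, here is a quick sketch of the arithmetic I have
in mind. The function name is made up for illustration, and it models only my
presumption (one JVM per slot, tasksperslot splits per JVM), not confirmed
Hadoop behavior.

```python
def presumed_layout(map_slots, tasks_per_slot):
    """Model the guess above: one map task JVM per slot,
    each JVM handling tasks_per_slot input splits at a time.
    This is an assumption, not verified Hadoop behavior."""
    jvms = map_slots                          # one JVM consumes one map slot
    splits_in_flight = jvms * tasks_per_slot  # splits processed concurrently
    return jvms, splits_in_flight


jvms, splits = presumed_layout(map_slots=4, tasks_per_slot=2)
print(jvms, splits)  # prints "4 8"
```

If the guess is wrong, the multiplication is the part that would change.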


By increasing tasksperslot I presume we reduce the overhead needed to start
a new task (even though we're re-using the JVM in a typical configuration,
ours included), but we have more map output to sort and shuffle (I presume
the results of both map splits go into the same output).


Can someone verify those presumptions?
