hadoop-general mailing list archives

From Michael Moores <mmoo...@real.com>
Subject Limiting concurrent maps
Date Wed, 20 Oct 2010 22:41:31 GMT
I have been experimenting with mapreduce.tasktracker.map.tasks.maximum to reduce the load
on my Cassandra cluster (using the Cassandra ColumnFamilyInputFormat).  I'd like to find ways
of throttling the map operations in case they are affecting OLTP activity on the cluster.

What parameters can I use to limit the number of map tasks running concurrently across the
whole cluster?  mapreduce.tasktracker.map.tasks.maximum limits the number of concurrent maps
per task tracker, but can I do this at the job level?
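
To illustrate why the per-tracker setting is only an indirect cluster-wide control: the ceiling on concurrent maps is the number of task trackers multiplied by mapreduce.tasktracker.map.tasks.maximum, and that ceiling is shared by every job on the cluster. The sketch below is plain arithmetic, not a Hadoop API; the function names and any node counts passed to them are hypothetical and only there for illustration.

```python
def cluster_map_ceiling(num_tasktrackers: int, maps_per_tracker: int) -> int:
    """Upper bound on concurrent maps across the cluster, shared by all jobs."""
    if num_tasktrackers < 0 or maps_per_tracker < 0:
        raise ValueError("counts must be non-negative")
    return num_tasktrackers * maps_per_tracker


def per_tracker_limit(target_cluster_maps: int, num_tasktrackers: int) -> int:
    """Per-tracker value that keeps the ceiling within the target.

    Never returns less than 1 slot, so the resulting ceiling can exceed
    the target when the target is smaller than the tracker count.
    """
    if num_tasktrackers <= 0:
        raise ValueError("need at least one task tracker")
    return max(1, target_cluster_maps // num_tasktrackers)
```

For example, with an assumed 10 task trackers, a target of 25 concurrent maps works out to a per-tracker value of 2, giving a ceiling of 20. That value still applies to all jobs at once, which is why it does not answer the job-level question.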

Should I look at the "fair" scheduler?
