hadoop-mapreduce-user mailing list archives

From Matt Steele <rmattste...@gmail.com>
Subject quotas for size of intermediate map/reduce output?
Date Wed, 21 Sep 2011 22:45:10 GMT
Hi All,

Is it possible to enforce a maximum on the disk space consumed by a
map/reduce job's intermediate output?  It looks like you can impose limits
on HDFS consumption, or, via the capacity scheduler, limits on the RAM that
a map/reduce slot uses, or the number of slots used.
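(For reference, the HDFS quotas mentioned above are set with `dfsadmin`; a quick sketch, with a hypothetical path, using the Hadoop 1.x command syntax. Note these quotas only cover data stored in HDFS, not the intermediate map output spilled to each node's local disks during the shuffle.)

```shell
# Cap total raw space consumed under /user/alice at 10 terabytes
# (the path is hypothetical; quotas charge replicated block space)
hadoop dfsadmin -setSpaceQuota 10t /user/alice

# Show the quota and current usage for the directory
hadoop fs -count -q /user/alice

# Remove the quota again
hadoop dfsadmin -clrSpaceQuota /user/alice
```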

But if I'm worried that a job might exhaust the cluster's disk capacity
during the shuffle, my sense is that I'd have to quarantine the job on a
separate cluster.  Am I wrong?  Do you have any suggestions for me?

