hadoop-mapreduce-user mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: distributed cache exceeding local.cache.size
Date Thu, 31 Mar 2011 22:25:40 GMT

On Mar 31, 2011, at 11:45 AM, Travis Crawford wrote:

> Is anyone familiar with how the distributed cache deals when datasets
> larger than the total cache size are referenced? I've disabled the job
> that caused this situation but am wondering if I can configure things
> more defensively.

I've started building dedicated file systems on the drives to hold the
MapReduce spill space. It seems to be the only reliable way to keep MR
from going nuts. Sure, some jobs may fail, but that seems like a better
strategy than the alternative.
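
A dedicated file system puts a hard cap on how much space the spill data can take. As a rough illustration of the same defensive idea, the sketch below checks whether a given number of bytes would fit on the volume holding a directory before a job is submitted. The function names, the reserve parameter, and the path in the usage note are hypothetical and are not part of Hadoop or of the setup described above.

```python
import shutil


def has_headroom(free_bytes, required_bytes, reserve_bytes=0):
    """Return True if required_bytes fits while leaving reserve_bytes free."""
    return free_bytes - reserve_bytes >= required_bytes


def volume_has_headroom(path, required_bytes, reserve_bytes=0):
    """Check free space on the file system that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return has_headroom(usage.free, required_bytes, reserve_bytes)
```

A wrapper script might call `volume_has_headroom("/path/to/spill", dataset_bytes)` (the path is a placeholder) and refuse to submit the job when it returns False.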
