hadoop-common-user mailing list archives

From <ahluwal...@yahoo.com>
Subject RE: Does Hadoop Honor Reserved Space?
Date Thu, 06 Mar 2008 18:14:23 GMT
I've run into a similar issue in the past. From what I understand, this
parameter only limits HDFS space usage. The intermediate data from a MapReduce
job, however, is stored on the local file system (not HDFS) and is not subject
to this configuration.

In the past I have used mapred.local.dir.minspacekill and
mapred.local.dir.minspacestart to limit the amount of space this temporary
data is allowed to use.
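For example, both properties take a value in bytes; the thresholds below are just illustrative, not recommendations:

```xml
<!-- Illustrative values only: stop accepting new tasks on a node when free
     space in mapred.local.dir drops below 10 GiB, and start killing running
     tasks when it drops below 5 GiB. -->
<property>
  <name>mapred.local.dir.minspacestart</name>
  <value>10737418240</value>
</property>
<property>
  <name>mapred.local.dir.minspacekill</name>
  <value>5368709120</value>
</property>
```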

I'm not sure that's the best approach, though, so I'd love to hear what other
people have done. In your case, you have a map-reduce job that consumes too
much space (without setting a limit, you didn't have enough disk capacity for
the job), so looking at mapred.output.compress and mapred.compress.map.output
might be useful to decrease the job's disk requirements.
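For example, enabling both properties would look like this (mapred.compress.map.output shrinks the intermediate data spilled to the local disks, which is what filled up here; mapred.output.compress shrinks the final output written to HDFS):

```xml
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
  <!-- compress intermediate map output written to mapred.local.dir -->
</property>
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
  <!-- compress the final job output written to HDFS -->
</property>
```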

--Ash

-----Original Message-----
From: Jimmy Wan [mailto:jimmy@indeed.com] 
Sent: Thursday, March 06, 2008 9:56 AM
To: core-user@hadoop.apache.org
Subject: Does Hadoop Honor Reserved Space?

I've got 2 datanodes setup with the following configuration parameter:
	<property>
	  <name>dfs.datanode.du.reserved</name>
	  <value>429496729600</value>
	  <description>Reserved space in bytes per volume. Always leave this
	  much space free for non dfs use.
	  </description>
	</property>

Both are housed on 800GB volumes, so I thought this would keep about half  
the volume free for non-HDFS usage.
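(For reference, that value is exactly 400 GiB, which is indeed a bit over half of an 800 GB volume; a quick check:)

```python
# Sanity-check the reserved-space arithmetic for dfs.datanode.du.reserved.
reserved = 429496729600        # configured value, in bytes
volume = 800 * 10**9           # an 800 GB volume (decimal gigabytes)

print(reserved / 2**30)        # 400.0 -> exactly 400 GiB reserved
print(reserved / volume)       # 0.536870912 -> a bit over half the volume
```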

After some long running jobs last night, both disk volumes were completely  
filled. The bulk of the data was in:
${my.hadoop.tmp.dir}/hadoop-hadoop/dfs/data

This is running as the user hadoop.

Am I interpreting these parameters incorrectly?

I noticed this issue, but it is marked as closed:  
http://issues.apache.org/jira/browse/HADOOP-2549

-- 
Jimmy

