hadoop-common-user mailing list archives

From "Hairong Kuang" <hair...@yahoo-inc.com>
Subject RE: Limit the space used by hadoop on a slave node
Date Tue, 08 Jan 2008 22:35:48 GMT
Joydeep,

Thanks for pointing out the problem. The cause of the block size being 0
is that the block size is not passed as a parameter in the block transfer
protocol. So when a Block object is initialized, its block size is set to
zero, which leads to a parameter of zero when getNextVolume is called. I
will put a comment at HADOOP-2549 and see if we can mark it as a blocker
for 0.16.
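
To illustrate the effect, the allocation path looks roughly like this (a
minimal sketch assuming a simple round-robin scan over the configured
volumes; a paraphrase, not the actual FSDataset code):

    // Hypothetical sketch: pick the next volume with room for the block.
    // With blockSize == 0 the space test passes for every volume, so the
    // reserved-space settings are effectively ignored.
    synchronized FSVolume getNextVolume(long blockSize)
        throws DiskOutOfSpaceException {
      int startVolume = curVolume;
      do {
        FSVolume volume = volumes[curVolume];
        curVolume = (curVolume + 1) % volumes.length;
        if (volume.getAvailable() >= blockSize) {
          return volume;
        }
      } while (curVolume != startVolume);
      throw new DiskOutOfSpaceException(
          "Insufficient space for an additional block");
    }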

Hairong 

-----Original Message-----
From: Joydeep Sen Sarma [mailto:jssarma@facebook.com] 
Sent: Tuesday, January 08, 2008 2:21 PM
To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
Subject: RE: Limit the space used by hadoop on a slave node

Can you please check the problem description in
https://issues.apache.org/jira/browse/HADOOP-2549 ?

I am not sure whether the bug you referred to fixes the problem. The
issue is that the getNextVolume() API in the dfs code is getting called
with an argument of 0 (for blocksize). As a result, *every* volume
becomes eligible for block allocation. The logic is correct; the
parameter is wrong.

While I have no idea why the blocksize is being passed in as 0, I did
apply a patch to default the blocksize to 65M in case it comes in as
zero, and this patch is holding up. The space reservations are now
being honored.
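
For reference, the guard is essentially a one-line default; paraphrasing
from memory, not the exact diff:

    // Hypothetical guard (paraphrased, not the exact patch): fall back
    // to a conservative default when the caller passes a zero blocksize,
    // so the reserved-space check has something real to compare against.
    static final long DEFAULT_BLOCK_SIZE = 65L * 1024 * 1024; // ~65M

    long effectiveBlockSize = (blockSize == 0) ? DEFAULT_BLOCK_SIZE
                                               : blockSize;
    FSVolume volume = volumes.getNextVolume(effectiveBlockSize);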


-----Original Message-----
From: Hairong Kuang [mailto:hairong@yahoo-inc.com]
Sent: Tue 1/8/2008 2:16 PM
To: hadoop-user@lucene.apache.org
Subject: RE: Limit the space used by hadoop on a slave node
 
I agree that block distribution does not deal with heterogeneous clusters
well. Basically, block replication does not favor less utilized datanodes.
After 0.16 is released, you may periodically run the balancer to
redistribute blocks with the command bin/start-balancer.sh.

I checked the datanode code. A datanode does check the amount of
available space before block allocation. I need to investigate the cause
of the disk-full problem. I would appreciate it if you could provide more
information, such as the capacity of the disk, the amount of dfs used
space, reserved space, and non-dfs used space when the out-of-disk
problem occurs.
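
(For reference, these quantities should relate roughly as

    non-dfs used = capacity - dfs used - remaining

so, for example, on a 100 GB partition with 10 GB reserved, dfs should
stop allocating new blocks once dfs used plus non-dfs used reaches about
90 GB.)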

Hairong

-----Original Message-----
From: Ted Dunning [mailto:tdunning@veoh.com]
Sent: Tuesday, January 08, 2008 1:37 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node


And I have both settings, but I have still had disk-full problems.  I
can't be sure right now whether this occurred under 14.4 or 15.1, but I
think it was 15.1.

In any case, new file creation from a non-datanode host is definitely
not well balanced and will lead to disk-full conditions if you have
dramatically different-sized partitions available on the different
datanodes.  Also, if you have a small and a large partition available on
a single node, the small partition will fill up and cause corruption.  I
had to go to single partitions on all nodes to avoid this.

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 10 GB -->
  <value>10000000000</value>
  <description>Reserved space in bytes. Always leave this much space
  free for non-dfs use.</description>
</property>

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.9f</value>
  <description>When calculating remaining space, only use this
  percentage of the real available space.</description>
</property>
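
My understanding of how the reserved setting is meant to be applied (a
sketch of the accounting, not the actual datanode code):

    // Hypothetical accounting: subtract the configured reservation from
    // the partition's free space before offering it for block storage.
    long getAvailable(long diskFree, long reserved) {
      long remaining = diskFree - reserved; // dfs.datanode.du.reserved
      return remaining > 0 ? remaining : 0;
    }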



On 1/8/08 1:30 PM, "Koji Noguchi" <knoguchi@yahoo-inc.com> wrote:

> We use,
> 
> dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15.
> 
> The change was made in the Jira Hairong mentioned.
> https://issues.apache.org/jira/browse/HADOOP-1463
> 
> Koji
> 
>> -----Original Message-----
>> From: Ted Dunning [mailto:tdunning@veoh.com]
>> Sent: Tuesday, January 08, 2008 1:13 PM
>> To: hadoop-user@lucene.apache.org
>> Subject: Re: Limit the space used by hadoop on a slave node
>> 
>> 
>> I think I have seen related bad behavior on 15.1.
>> 
>> On 1/8/08 11:49 AM, "Hairong Kuang" <hairong@yahoo-inc.com> wrote:
>> 
>>> Has anybody tried 15.0? Please check 
>>> https://issues.apache.org/jira/browse/HADOOP-1463.
>>> 
>>> Hairong
>>> -----Original Message-----
>>> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
>>> Sent: Tuesday, January 08, 2008 11:33 AM
>>> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
>>> Subject: RE: Limit the space used by hadoop on a slave node
>>> 
>>> At least up until 14.4, these options are broken. See
>>> https://issues.apache.org/jira/browse/HADOOP-2549
>>> 
>>> (There's a trivial patch, but I am still testing.)
>>> 
>>> 
> 


