hadoop-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: hadoop cluster with non-uniform disk spec
Date Thu, 12 Feb 2015 15:33:07 GMT
*@Leo Leung*
Yes, dfs.datanode.data.dir is set correctly.

@Brahma Reddy Battula

Initially all the nodes we had were 5-disk nodes. Then we added a few racks
of 11-disk nodes. We are using the CDH distribution, and we applied these
settings when we upgraded from CDH4 to CDH5.

To be clearer: at the moment, all nodes (whether they have 5 disks or 11)
hold roughly the same number of blocks, and thus the same amount of data.
It seems blocks are distributed evenly across nodes regardless of how many
disks a node has. Is this expected behavior?
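
For context, dfs.datanode.fsdataset.volume.choosing.policy is a DataNode-side
setting: it picks which volume on a given DataNode receives a new block, and
the two settings quoted below tune that choice. The following is only a rough
sketch of how the threshold and preference fraction interact, based on their
descriptions in hdfs-default.xml. It is not the Hadoop implementation: the
function name choose_volume is hypothetical, random picks stand in for the
round-robin selection, and the check that a volume has room for the block is
left out.

```python
import random


def choose_volume(available_bytes, threshold, preference_fraction, rng=random):
    """Return the index of the volume (on ONE DataNode) to receive a new block.

    available_bytes: free bytes per volume.
    threshold: plays the role of balanced-space-threshold.
    preference_fraction: plays the role of balanced-space-preference-fraction.
    """
    least = min(available_bytes)
    # Volumes whose free space is within `threshold` of the fullest volume.
    low = [i for i, a in enumerate(available_bytes) if a <= least + threshold]
    # Volumes with clearly more free space than that.
    high = [i for i, a in enumerate(available_bytes) if a > least + threshold]

    if not high:
        # All volumes are within the threshold of each other: treat them as
        # balanced (simplification: random pick instead of round robin).
        return rng.choice(low)
    # Otherwise send roughly `preference_fraction` of blocks to the emptier set.
    if rng.random() < preference_fraction:
        return rng.choice(high)
    return rng.choice(low)


if __name__ == "__main__":
    GIB = 1024 ** 3
    free = [50 * GIB, 60 * GIB, 400 * GIB, 420 * GIB]  # made-up example values
    rng = random.Random(0)
    picks = [choose_volume(free, 21474836480, 0.85, rng) for _ in range(10_000)]
    share_high = sum(p in (2, 3) for p in picks) / len(picks)
    print(f"share of picks on the emptier volumes: {share_high:.2f}")
```

Note that nothing in this choice looks at other nodes, so these settings on
their own would not be expected to even out usage between 5-disk and 11-disk
nodes.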

The concern is that as more data comes in, the 5-disk nodes are reaching
their configured capacity while the 11-disk nodes are way below theirs,
because the latter have more total space per node.
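
To put rough numbers on that concern, here is a back-of-the-envelope sketch
using the disk specs from my original message (5 x 900GB versus 11 x 1.1T per
node). It assumes "1.1T" means 1100 GB and ignores reserved space and non-DFS
usage, so the figures are approximate.

```python
# Raw capacity per node, in GB (assumption: 1.1T is read as 1100 GB).
small_node_gb = 5 * 900      # 5-disk nodes
large_node_gb = 11 * 1100    # 11-disk nodes


def utilization(stored_gb, capacity_gb):
    """Fraction of a node's raw capacity that is in use."""
    return stored_gb / capacity_gb


# If every node stores the same amount of data, the 5-disk nodes fill first.
# At that point each 11-disk node holds the same 4500 GB.
stored_gb = small_node_gb
large_util_when_small_full = utilization(stored_gb, large_node_gb)

print(f"5-disk node capacity:  {small_node_gb} GB")
print(f"11-disk node capacity: {large_node_gb} GB")
print(f"11-disk utilization when 5-disk nodes are full: "
      f"{large_util_when_small_full:.0%}")
```

In other words, with equal data per node, the 11-disk nodes would still be
less than 40% used by the time the 5-disk nodes are full.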

I don't know whether this is expected, or whether my concern is valid.

Chen


On Thu, Feb 12, 2015 at 6:49 AM, Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

>  Hello daemeon reiydelle
>
>
> Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
> >>Yes, you need to set this policy which will balance among the disks
>
> *@Chen Song*
>
> The following settings control what percentage of new block allocations will
> be sent to volumes with more available disk space than others:
>
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
>  dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f
>
>
> Did you set these when starting up the cluster?
>
>
>  Thanks & Regards
>
>  Brahma Reddy Battula
>
>
>
>
>   ------------------------------
> *From:* daemeon reiydelle [daemeonr@gmail.com]
> *Sent:* Thursday, February 12, 2015 12:02 PM
> *To:* user@hadoop.apache.org
> *Cc:* Ravi Prakash
> *Subject:* Re: hadoop cluster with non-uniform disk spec
>
>    What have you set dfs.datanode.fsdataset.volume.choosing.policy to
> (assuming you are on a current version of Hadoop)? Is the policy set to
> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
>
>
>
>
> * ....... *
>
>
>
>
>
>
> *“Life should not be a journey to the grave with the intention of arriving
> safely in a pretty and well preserved body, but rather to skid in broadside
> in a cloud of smoke, thoroughly used up, totally worn out, and loudly
> proclaiming “Wow! What a Ride!” - Hunter Thompson*
> Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872
>
> On Wed, Feb 11, 2015 at 2:23 PM, Chen Song <chen.song.82@gmail.com> wrote:
>
>> Hey Ravi
>>
>>  Here are my settings:
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G)
>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f
>>
>>  Chen
>>
>>
>> On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash <ravihoo@ymail.com> wrote:
>>
>>>  Hi Chen!
>>>
>>>  Are you running the balancer? What are you setting
>>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold
>>> and
>>> dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction
>>> to?
>>>
>>>
>>>
>>>
>>>   On Wednesday, February 11, 2015 7:44 AM, Chen Song <
>>> chen.song.82@gmail.com> wrote:
>>>
>>>
>>>  We have a hadoop cluster consisting of 500 nodes, but the nodes are
>>> not uniform in terms of disk space. Half of the racks are newer, with 11
>>> volumes of 1.1T on each node, while the other half have 5 volumes of 900GB
>>> on each node.
>>>
>>>  dfs.datanode.fsdataset.volume.choosing.policy is set to
>>> org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
>>>
>>>  We end up with half of the nodes full while the other half are
>>> underutilized. I am wondering if there is a known solution for this
>>> problem.
>>>
>>>  Thank you for any suggestions.
>>>
>>>  --
>>> Chen Song
>>>
>>>
>>>
>>>
>>
>>
>>   --
>> Chen Song
>>
>>
>


-- 
Chen Song
