hadoop-common-user mailing list archives

From 李洪忠 <lhz...@hotmail.com>
Subject Re: disk used percentage is not symmetric on datanodes (balancer)
Date Tue, 19 Mar 2013 01:21:10 GMT
Maybe you need to modify the rack-awareness script so that the racks are
balanced, i.e., all the racks are the same size: for example, one rack of
6 small nodes and one rack of 1 large node.
P.S.
You need to restart the cluster for the rack-awareness script change to take effect.
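
For illustration, a rack-awareness (topology) script could look roughly like the sketch below. It assumes the Hadoop 0.20/1.x setup in which the `topology.script.file.name` property points at an executable that receives datanode addresses as arguments and prints one rack path per address. The IP addresses, the rack names, and the grouping (six 12 TB nodes in one rack, one 72 TB node in another) are hypothetical and only mirror the suggestion above; this is not a tested configuration.

```python
#!/usr/bin/env python
"""Sketch of a rack-awareness (topology) script; all names are hypothetical."""
import sys

# Assumed layout: six small (12 TB) nodes form one rack and a single
# large (72 TB) node forms another, so both racks hold about 72 TB.
RACK_BY_HOST = {
    "10.0.1.1": "/rack-a",
    "10.0.1.2": "/rack-a",
    "10.0.1.3": "/rack-a",
    "10.0.1.4": "/rack-a",
    "10.0.1.5": "/rack-a",
    "10.0.1.6": "/rack-a",
    "10.0.2.1": "/rack-b",
}

# Hosts missing from the table fall back to Hadoop's default rack name.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts, rack_by_host=RACK_BY_HOST, default=DEFAULT_RACK):
    """Return one rack path per host, in the order the hosts were given."""
    return [rack_by_host.get(host, default) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more addresses as arguments and reads the
    # rack paths back from standard output.
    print(" ".join(resolve_racks(sys.argv[1:])))
```

Running it by hand with a few addresses as arguments is a quick way to check the mapping before pointing the namenode at it.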

On 2013/3/19 7:17, Bertrand Dechoux wrote:
> And by active, do you mean that it actually stops by itself? Otherwise,
> the throttling/limit might be an issue with regard to the data volume
> or velocity.
>
> What threshold is used?
>
> About the small and big datanodes, how are they distributed with 
> regard to racks?
> About files, what replication factor(s) and block size(s) are used?
>
> Surely trivial questions again.
>
> Bertrand
>
> On Mon, Mar 18, 2013 at 10:46 PM, Tapas Sarangi 
> <tapas.sarangi@gmail.com> wrote:
>
>     Hi,
>
>     Sorry about that, had it written, but thought it was obvious.
>     Yes, balancer is active and running on the namenode.
>
>     -Tapas
>
>     On Mar 18, 2013, at 4:43 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
>
>>     Hi,
>>
>>     It is not explicitly stated, but did you use the balancer?
>>     http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer
>>
>>     Regards
>>
>>     Bertrand
>>
>>     On Mon, Mar 18, 2013 at 10:01 PM, Tapas Sarangi
>>     <tapas.sarangi@gmail.com> wrote:
>>
>>         Hello,
>>
>>         I am using one of the old legacy versions (0.20) of hadoop for
>>         our cluster. We have scheduled an upgrade to the newer
>>         version within a couple of months, but I would like to
>>         understand a couple of things before moving towards the
>>         upgrade plan.
>>
>>         We have about 200 datanodes and some of them have larger
>>         storage than others. The storage for the datanodes varies
>>         between 12 TB and 72 TB.
>>
>>         We found that the disk-used percentage is not symmetric
>>         through all the datanodes. For larger storage nodes the
>>         percentage of disk-space used is much lower than that of
>>         other nodes with smaller storage space. In larger storage
>>         nodes the percentage of used disk space varies, but on
>>         average about 30-50%. For the smaller storage nodes this
>>         number is as high as 99.9%. Is this expected? If so, then we
>>         are not using a lot of the disk space effectively. Is this
>>         solved in a future release?
>>
>>         If not, I would like to know whether there are any checks or
>>         debugging steps that could improve things with the current
>>         version, or whether upgrading hadoop should solve this problem.
>>
>>         I am happy to provide additional information if needed.
>>
>>         Thanks for any help.
>>
>>         -Tapas
>>
>
>
>
>
> -- 
> Bertrand Dechoux 
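
To make the threshold question in this thread concrete, here is a minimal sketch of the check it refers to. It assumes the rule described in the Hadoop balancer documentation: a cluster counts as balanced when every datanode's utilization differs from the overall cluster utilization by no more than the threshold (in percentage points, 10 by default). The node names and the used/capacity figures are made up, loosely following the 12 TB and 72 TB nodes described in the original question, and the sketch does not model how the real balancer selects or moves blocks.

```python
def utilization(used_tb, capacity_tb):
    """Return used space as a percentage of capacity."""
    return 100.0 * used_tb / capacity_tb


def classify(nodes, threshold=10.0):
    """Label each node relative to the cluster-wide utilization.

    nodes maps a node name to a (used_tb, capacity_tb) pair.
    Returns (cluster_utilization, {name: label}).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster = utilization(total_used, total_capacity)

    labels = {}
    for name, (used, capacity) in nodes.items():
        diff = utilization(used, capacity) - cluster
        if diff > threshold:
            labels[name] = "over-utilized"
        elif diff < -threshold:
            labels[name] = "under-utilized"
        else:
            labels[name] = "within threshold"
    return cluster, labels


if __name__ == "__main__":
    # Hypothetical figures: a nearly full 12 TB node and a 72 TB node at 40%.
    sample = {"small-node": (11.9, 12.0), "large-node": (28.8, 72.0)}
    cluster, labels = classify(sample)
    print("cluster utilization: %.1f%%" % cluster)
    for name in sorted(labels):
        print(name, labels[name])
```

Because the cluster-wide figure is weighted by capacity, a few large, lightly used nodes pull the average down, and small full nodes then sit far above it.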

