hadoop-common-user mailing list archives

From Алексей Бабутин <zorlaxpokemon...@gmail.com>
Subject Re: disk used percentage is not symmetric on datanodes (balancer)
Date Tue, 19 Mar 2013 10:00:16 GMT
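Node A = 12 TB
Node B = 72 TB
Of the 200 datanodes, how many are type A and how many are type B?
If you have more B nodes than A nodes, you can take the A nodes out of service (decommission them), wipe them, and add them back to the cluster.
I suppose the cluster is about 3-5 Tb. Run the balancer with a threshold of 0.2 or 0.1.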
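
To illustrate what a lower threshold changes, here is a minimal Python sketch. It assumes the balancer's documented rule as I understand it: a datanode counts as balanced when its used percentage is within the threshold (in percentage points) of the cluster-wide used percentage. The function name `classify_nodes` and the two-node figures are hypothetical, loosely based on the numbers in this thread, and this is not the balancer's actual code.

```python
def classify_nodes(nodes, threshold_pct):
    """Classify datanodes relative to cluster-wide utilization.

    nodes: dict mapping node name -> (used_tb, capacity_tb)
    threshold_pct: allowed deviation, in percentage points
    Returns (cluster_utilization_pct, {name: "over" | "under" | "balanced"}).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    status = {}
    for name, (used, cap) in nodes.items():
        node_pct = 100.0 * used / cap
        if node_pct > cluster_pct + threshold_pct:
            status[name] = "over"      # candidate source of block moves
        elif node_pct < cluster_pct - threshold_pct:
            status[name] = "under"     # candidate target of block moves
        else:
            status[name] = "balanced"  # left alone at this threshold
    return cluster_pct, status


# Hypothetical example: one 12 TB node at 99.9% and one 72 TB node at 40%.
example = {"A": (11.988, 12.0), "B": (28.8, 72.0)}
for threshold in (10.0, 0.2):
    cluster_pct, status = classify_nodes(example, threshold)
    print(f"threshold={threshold}: cluster={cluster_pct:.1f}% {status}")
```

With a wide threshold the large node can fall inside the tolerance band and be left alone, while a tight threshold such as 0.2 marks it as under-utilized, so it becomes a target for blocks moved off the full small node.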

Putting different servers in one rack is a bad idea. You should rebuild the cluster with
multiple racks.

2013/3/19 Tapas Sarangi <tapas.sarangi@gmail.com>

> Hello,
> I am using one of the old legacy versions (0.20) of Hadoop for our cluster.
> We have scheduled an upgrade to a newer version within a couple of
> months, but I would like to understand a couple of things before moving
> towards the upgrade plan.
> We have about 200 datanodes and some of them have larger storage than
> others. The storage for the datanodes varies between 12 TB and 72 TB.
> We found that the disk-used percentage is not symmetric across all the
> datanodes. For larger storage nodes the percentage of disk-space used is
> much lower than that of other nodes with smaller storage space. In larger
> storage nodes the percentage of used disk space varies, but averages
> about 30-50%. For the smaller storage nodes this number is as high as
> 99.9%. Is this expected? If so, then we are not using a lot of the disk
> space effectively. Is this solved in a future release?
> If not, I would like to know if there are any checks or debugging steps
> that one can run to improve this with the current version, or whether
> upgrading Hadoop would solve this problem.
> I am happy to provide additional information if needed.
> Thanks for any help.
> -Tapas
