hadoop-common-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject hadoop balanceing data
Date Tue, 20 Jan 2009 06:28:27 GMT
Why do we not use Remaining % in place of Used % when we are
selecting a datanode for new data and when running the balancer?
From what I can tell, we are using Used % and we do not factor in
Non DFS Used at all.
I see a datanode with only a 60GB hard drive fill up completely (100%) before
the other servers, which have 130+GB hard drives, get half full.
Seems like trying to keep the same % free on the drives in the cluster would
be more optimal in production.
I know this still may not be perfect, but it would be nice if we tried.
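
To make the difference between the two metrics concrete, here is a rough sketch in Python. It is not Hadoop code. The node names, the class, and all GB figures are made up for illustration, and it assumes that Used % is DFS Used divided by capacity while Remaining % is (capacity minus DFS Used minus Non DFS Used) divided by capacity.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical model of one datanode's disk accounting (not a Hadoop class)."""
    name: str
    capacity_gb: float
    dfs_used_gb: float
    non_dfs_used_gb: float

    @property
    def remaining_gb(self) -> float:
        # Space left once both DFS and non-DFS data are subtracted.
        return self.capacity_gb - self.dfs_used_gb - self.non_dfs_used_gb

    @property
    def used_pct(self) -> float:
        # Assumed definition of Used %: ignores Non DFS Used entirely.
        return 100.0 * self.dfs_used_gb / self.capacity_gb

    @property
    def remaining_pct(self) -> float:
        # Assumed definition of Remaining %: reflects Non DFS Used as well.
        return 100.0 * self.remaining_gb / self.capacity_gb


# Made-up figures: a small disk with heavy non-DFS usage and a larger, mostly free disk.
small = DataNode("small-60GB", capacity_gb=60, dfs_used_gb=24, non_dfs_used_gb=30)
large = DataNode("large-130GB", capacity_gb=130, dfs_used_gb=52, non_dfs_used_gb=10)

for node in (small, large):
    print(f"{node.name}: used {node.used_pct:.1f}%, "
          f"remaining {node.remaining_pct:.1f}% ({node.remaining_gb:.0f} GB)")
```

With these numbers both nodes show the same Used %, so a policy based on Used % alone treats them as equally full, even though the small node has far less room left. Comparing Remaining % would rank the small node as the fuller one.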

