hadoop-hdfs-issues mailing list archives

From "Steve Hoffman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode
Date Tue, 25 Sep 2012 16:52:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462960#comment-13462960 ]

Steve Hoffman commented on HDFS-1312:

The other thing is that as you grow a grid, you care less and less about the balance on individual
nodes. This issue is of primary importance to smaller installations who likely are under-provisioned
hardware-wise anyway.
Our installation is about 1PB so I think we can say we are past "small".  We typically run
at 70-80% full as we are not made of money.  And at 90% the disk alarms start waking people
out of bed.
I would say we very much care about the balance of a single node.  When that node fills, it'll
take out the region server, the M/R jobs running on it and generally anger people whose jobs
have to be restarted.

I wouldn't be so quick to discount this.  And when you have enough machines, you are replacing
disks more and more frequently.  So ANY manual process is $ wasted in people time.  Time to
re-run jobs, time to take down a datanode and move blocks.  Time = $.  To turn Hadoop into
a more mature product, shouldn't we be striving for "it just works"?

> Re-balance disks within a Datanode
> ----------------------------------
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>            Reporter: Travis Crawford
> Filing this issue in response to "full disk woes" on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations where certain disks are full while others are significantly less used. Users at many different sites have experienced this issue, and HDFS administrators are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles, and filling disks at the sameish rate. Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode. In write-heavy environments this will still make use of all spindles, equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is not needed.
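
For illustration only, the first option in the description above (weighting less-used disks more heavily when placing new blocks) could look roughly like the sketch below. This is not Hadoop code. The Volume class, the choose_volume function and the /data/N paths are hypothetical names made up for this example, and the policy shown (random choice weighted by free bytes) is just one possible reading of "weighting less-used disks heavier".

```python
import random
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for one datanode storage directory."""
    path: str
    capacity_bytes: int
    used_bytes: int

    @property
    def free_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes


def choose_volume(volumes, block_size, rng=random):
    """Pick a volume for a new block, weighted by free space.

    Every volume with room for the block stays eligible, so all
    spindles keep receiving writes, but emptier volumes are chosen
    more often and disk use evens out over time.
    """
    # Skip volumes that cannot hold the block (or have no space at all).
    candidates = [v for v in volumes if v.free_bytes >= max(block_size, 1)]
    if not candidates:
        raise RuntimeError("no volume has room for the block")
    # Assumption: weight is simply the number of free bytes.
    weights = [v.free_bytes for v in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With two equally sized volumes, one 95% full and one 10% full, nearly all new blocks would land on the emptier volume, while the fuller one still takes an occasional write as long as the block fits. The second option, rebalancing existing blocks locally, is a separate mechanism and is not covered by this sketch.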

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
