hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8538) Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy
Date Fri, 05 Jun 2015 17:54:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574914#comment-14574914 ]

Andrew Wang commented on HDFS-8538:
-----------------------------------

Thanks guys for the discussion, some replies:

bq. You will get more complaints for performance degradation after the change. BTW, they should
set the policy themselves or run balancer.

I filed this because we've had tens of customers run into issues and not know about the AvailableSpace
policy. These issues include:

* Small disks filling up first and becoming unavailable for write, leading to poor performance
* Newly inserted disks having less data, leading to access skew
* Monitoring warnings from the disks being very full (90%+)
* Upgrade issues due to lack of free space on the very full disks

The normal balancer IIUC fixes inter-node balance but does not address intra-node balance.
There's been an intra-node balancer shell script floating around the internet for a while,
but I don't know if it's been updated for the new block-id based layout. It's also a hacky
approach we don't want to support in mainline, since it requires shutting down the DN and
manually moving blocks around.

My experience has been that users of heterogeneous sized disks almost always use this policy.
No users thus far have reported performance problems with the AvailableSpace policy. Harsh
actually recommended making it the default policy in the original JIRA, but we deferred to
let the code bake first.

Note also that heterogeneous sized disks are the rare case; most DNs are homogeneous. Since
AvailableSpace falls back to RR if the disks are mostly balanced, homogeneous DNs should be
unaffected.
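
The fallback described above can be sketched roughly as follows. This is a simplified illustration, not the Hadoop source: the class name, method name, and the `long[]` of free bytes are hypothetical, and the real policy also accounts for whether the block fits on the fuller volumes. The two constants are assumed to mirror the defaults of `dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold` (10 GB) and `dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction` (0.75).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified illustration only; names and structure are hypothetical.
class AvailableSpaceSketch {
    // Assumed values, mirroring the defaults named in the lead-in.
    static final long BALANCED_THRESHOLD_BYTES = 10L * 1024 * 1024 * 1024;
    static final double HIGH_SPACE_PREFERENCE = 0.75;

    private final Random random;
    private int rrAll, rrHigh, rrLow; // independent round-robin cursors

    AvailableSpaceSketch(Random random) {
        this.random = random;
    }

    /** Returns the index of the chosen volume; assumes at least one volume. */
    int choose(long[] freeBytes) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }

        // Volumes are "mostly balanced": behave exactly like plain round-robin.
        if (max - min <= BALANCED_THRESHOLD_BYTES) {
            return Math.floorMod(rrAll++, freeBytes.length);
        }

        // Otherwise split volumes into a high and a low free-space bucket.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] > min + BALANCED_THRESHOLD_BYTES) {
                high.add(i);
            } else {
                low.add(i);
            }
        }

        // A biased coin picks the bucket; round-robin picks within it.
        if (random.nextDouble() < HIGH_SPACE_PREFERENCE) {
            return high.get(Math.floorMod(rrHigh++, high.size()));
        }
        return low.get(Math.floorMod(rrLow++, low.size()));
    }
}
```

On a DN whose disks differ by less than the threshold, the first branch is always taken, which is why homogeneous DNs should see no change in behavior.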

Related, there's also been user demand for an available space block placement policy, leading
to the recent implementation in HDFS-8131.

bq. If that's correct it's still possible that just one or a small number of volumes would
fall into the higher bucket and get overloaded.

This leads me to a potential enhancement: count the # of outstanding writes to a low-capacity
disk, and exclude it from skewed placement if it's got too many outstanding writes. This would
be even better if we used OS-level IO statistics, but that could be a follow-on.

Nicholas + Arpit, would the above satisfy your concerns about disk overload? It also might
be a good opportunity to do the relative free space enhancement recommended by Chris N.
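
The outstanding-write exclusion proposed above might look something like the sketch below. Everything here is hypothetical: the class, the method, the write limit, and the per-volume counters are illustrative, and the sketch assumes the disk to protect is the one that skewed placement would otherwise favor (the one with the most free space). It does not reflect any existing patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed enhancement; not part of HDFS.
class WriteAwareBuckets {
    // Assumed limit; a real implementation would presumably make this configurable.
    static final int MAX_OUTSTANDING_WRITES = 4;

    /**
     * Returns the indices of volumes eligible for skewed (preferred) placement:
     * those with clearly more free space than the fullest volume AND not
     * already handling too many in-flight writes.
     */
    static List<Integer> preferredVolumes(long[] freeBytes,
                                          int[] outstandingWrites,
                                          long thresholdBytes) {
        long min = Long.MAX_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
        }

        List<Integer> preferred = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            boolean hasMoreSpace = freeBytes[i] > min + thresholdBytes;
            boolean notOverloaded = outstandingWrites[i] < MAX_OUTSTANDING_WRITES;
            if (hasMoreSpace && notOverloaded) {
                preferred.add(i);
            }
        }
        // An empty result means the caller should fall back to plain round-robin.
        return preferred;
    }
}
```

With this shape, a single emptier disk that is already saturated drops out of the preferred bucket until its in-flight writes drain, rather than absorbing most new blocks.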

> Change the default volume choosing policy to AvailableSpaceVolumeChoosingPolicy
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-8538
>                 URL: https://issues.apache.org/jira/browse/HDFS-8538
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-8538.001.patch
>
>
> For datanodes with different sized disks, they almost always want the available space
> policy. Users with homogenous disks are unaffected.
> Since this code has baked for a while, let's change it to be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
