hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3570) Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
Date Thu, 06 Feb 2014 10:46:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893252#comment-13893252 ]

Hadoop QA commented on HDFS-3570:

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 4 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-common-project/hadoop-common

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6052//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6052//console

This message is automatically generated.

> Balancer shouldn't rely on "DFS Space Used %" as that ignores non-DFS used space
> --------------------------------------------------------------------------------
>                 Key: HDFS-3570
>                 URL: https://issues.apache.org/jira/browse/HDFS-3570
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Assignee: Akira AJISAKA
>            Priority: Minor
>         Attachments: HDFS-3570.2.patch, HDFS-3570.aash.1.patch
> Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ,
> post archived at http://pastebin.com/eVFkk0A0
> This user had a specific DN with large non-DFS usage among its dfs.data.dirs and very
> little DFS usage (which is computed against total possible capacity).
> Balancer apparently looks only at DFS usage and does not consider that non-DFS usage
> may also be high on a DN/cluster. Hence, if a DFS Usage report from a DN is only 8%,
> it thinks the DN has a lot of free space to write more blocks, which isn't true, as shown by
> the case of this user. It went on scheduling writes to the DN to balance it out, but the DN
> simply couldn't accept any more blocks because of the state of its disks.
> I think it would be better if we _computed_ the actual utilization based on
> {{(100-(actual remaining space))/(capacity)}}, as opposed to the current {{(dfs used)/(capacity)}}. Thoughts?
> This isn't very critical, however, because it is very rare to see DN space being used for
> non-DN data, but it does expose a valid bug.
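
To make the difference between the two measures concrete, here is a minimal sketch in Python. It is not the Balancer's code and not the attached patch. The formula in the description mixes a percentage constant with absolute sizes, so the sketch assumes the intent is (capacity - remaining) / capacity; that reading, the function names, and the sample DataNode sizes are all assumptions made for illustration. Only the 8% DFS usage figure comes from the report.

```python
def dfs_used_percent(dfs_used: int, capacity: int) -> float:
    """Utilization as the issue describes the current measure: DFS used / capacity."""
    if capacity <= 0:
        return 0.0
    return 100.0 * dfs_used / capacity


def actual_used_percent(remaining: int, capacity: int) -> float:
    """Assumed reading of the proposal: (capacity - remaining) / capacity.

    Because it is derived from remaining space, non-DFS usage counts too.
    """
    if capacity <= 0:
        return 0.0
    return 100.0 * (capacity - remaining) / capacity


# Hypothetical DataNode (sizes in GB): mostly filled by non-DFS data.
capacity = 1000
dfs_used = 80
non_dfs_used = 900
remaining = capacity - dfs_used - non_dfs_used

print(dfs_used_percent(dfs_used, capacity))      # 8.0: looks nearly empty
print(actual_used_percent(remaining, capacity))  # 98.0: actually nearly full
```

With these numbers the first measure makes the node look like a good target for more blocks, while the second shows it has almost no room left, which is the mismatch the report describes.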

This message was sent by Atlassian JIRA
