hadoop-hdfs-issues mailing list archives

From "Stephen O'Donnell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13728) Disk Balancer should not fail if volume usage is greater than capacity
Date Fri, 03 Aug 2018 14:37:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568264#comment-16568264 ]

Stephen O'Donnell commented on HDFS-13728:
------------------------------------------

I think it's ready for review now. Please have a look, and I can own this one if you want.

> Disk Balancer should not fail if volume usage is greater than capacity
> ----------------------------------------------------------------------
>
>                 Key: HDFS-13728
>                 URL: https://issues.apache.org/jira/browse/HDFS-13728
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: diskbalancer
>    Affects Versions: 3.0.3
>            Reporter: Stephen O'Donnell
>            Assignee: Gabor Bota
>            Priority: Minor
>         Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a datanode reports more space used on a disk than its capacity, which should not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
>     Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
>         "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
>         dfsUsedSpace, getCapacity());
>     this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on a volume than its capacity, there is evidently some issue that causes this to occur occasionally.
> In general, a full disk like this is exactly what prompts someone to run the Disk Balancer, only to find it fails with the above error.
> There appears to be nothing you can do to force the Disk Balancer to run at that point; in the scenarios I saw, the issue resolved only after some data was removed from the disk and usage dropped below the capacity.
> Can we consider relaxing the above check so that, if the usage is greater than the capacity, the usage is simply set to the capacity and the calculations all work OK?
> E.g. something like this:
> {code}
>    public void setUsed(long dfsUsedSpace) {
> -    Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -    this.used = dfsUsedSpace;
> +    if (dfsUsedSpace > this.getCapacity()) {
> +      this.used = this.getCapacity();
> +    } else {
> +      this.used = dfsUsedSpace;
> +    }
>    }
> {code}
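The relaxation proposed in the diff above can be sketched as a self-contained example. Note this is a minimal, hypothetical stand-in class for illustration only (the real DiskBalancerVolume has many more fields and uses Guava Preconditions), not a drop-in patch:

```java
// Minimal, hypothetical stand-in for DiskBalancerVolume, illustrating the
// proposed clamping behaviour; not the real HDFS class.
public class VolumeSketch {
    private final long capacity;
    private long used;

    public VolumeSketch(long capacity) {
        this.capacity = capacity;
    }

    public long getCapacity() {
        return capacity;
    }

    public long getUsed() {
        return used;
    }

    public void setUsed(long dfsUsedSpace) {
        // Relaxed check: instead of throwing when a datanode over-reports
        // usage, clamp it to capacity so the balancer's math still works.
        if (dfsUsedSpace > this.getCapacity()) {
            this.used = this.getCapacity();
        } else {
            this.used = dfsUsedSpace;
        }
    }

    public static void main(String[] args) {
        VolumeSketch vol = new VolumeSketch(100L);
        vol.setUsed(120L); // over-reported usage is clamped
        System.out.println(vol.getUsed()); // prints 100
        vol.setUsed(80L);  // normal usage is stored as-is
        System.out.println(vol.getUsed()); // prints 80
    }
}
```

With this behaviour, a volume that over-reports usage is simply treated as 100% full, which lets the planner proceed instead of aborting the whole run.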



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

