hadoop-hdfs-issues mailing list archives

From "Jinglun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14562) The behaviour of getContentSummaryInt() in getQuotaUsage() should be configurable.
Date Fri, 14 Jun 2019 03:50:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863664#comment-16863664 ]

Jinglun commented on HDFS-14562:

Hi [~hexiaoqiao], good idea. I'll start a parent issue for all my quota issues.

> The behaviour of getContentSummaryInt() in getQuotaUsage() should be configurable.
> ----------------------------------------------------------------------------------
>                 Key: HDFS-14562
>                 URL: https://issues.apache.org/jira/browse/HDFS-14562
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-14562.001.patch
> Our XiaoMi HDFS is considering upgrading from 2.6 to 3.1. There is a problem with the
getQuotaUsage rpc. In FSDirStatAndListingOp.getQuotaUsage(), if there isn't any quota on the
dir, it will automatically count the dir to get the usage info. But counts on big dirs are
quite dangerous: they can slow the NameNode and even cause a failover. We've encountered a
case where 10 concurrent count rpcs on a big dir caused a NameNode failover.
> In our cluster we always need to check whether a dir has a quota or not, and the
automatic count makes that check dangerous. Making the behavior configurable seems a good
idea. The administrator can decide whether to fall back to count or to fill the consumed
values with -1 when there is no quota on the dir.
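
The configurable fallback described above can be sketched roughly as follows. This is only an illustration with hypothetical names (QuotaUsageFallbackSketch, countWhenNoQuota, expensiveCount); it is not the attached patch and uses no Hadoop classes. It also assumes that usage is already tracked for dirs that have a quota, so only the no-quota case needs a count.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for illustration; none of these names are Hadoop classes.
final class QuotaUsageFallbackSketch {
    /** -1 marks "undefined" for both quota and usage, as in the description. */
    static final long UNDEFINED = -1L;

    /** Minimal result record: namespace quota and consumed count. */
    static final class Usage {
        final long quota;
        final long consumed;

        Usage(long quota, long consumed) {
            this.quota = quota;
            this.consumed = consumed;
        }
    }

    /**
     * @param quota            quota on the dir, or UNDEFINED if none is set
     * @param trackedConsumed  usage already tracked for a dir with a quota (assumption)
     * @param countWhenNoQuota hypothetical administrator switch
     * @param expensiveCount   stands in for the full subtree count
     */
    static Usage getQuotaUsage(long quota, long trackedConsumed,
                               boolean countWhenNoQuota, LongSupplier expensiveCount) {
        if (quota != UNDEFINED) {
            // A quota exists, so no traversal is needed.
            return new Usage(quota, trackedConsumed);
        }
        if (countWhenNoQuota) {
            // Current behaviour: fall back to the expensive count.
            return new Usage(UNDEFINED, expensiveCount.getAsLong());
        }
        // Proposed alternative: skip the count and report "undefined".
        return new Usage(UNDEFINED, UNDEFINED);
    }

    public static void main(String[] args) {
        Usage cheap = getQuotaUsage(UNDEFINED, 0L, false, () -> 1_000_000L);
        Usage counted = getQuotaUsage(UNDEFINED, 0L, true, () -> 1_000_000L);
        System.out.println("fallback off: consumed=" + cheap.consumed);
        System.out.println("fallback on:  consumed=" + counted.consumed);
    }
}
```

With the switch off, the supplier is never invoked, which is the point of the proposal: a caller that only wants to know whether a quota exists never triggers a count.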
> When I tried to make it configurable, I found another problem. When we convert between QuotaUsageProto
and QuotaUsage in PBHelperClient.class, there are checks for qu.hasTypeQuotaInfos() and qu.isTypeQuotaSet()
|| qu.isTypeConsumedAvailable(). Suppose we want to return a QuotaUsage with {fileAndDirectoryCount=-1,
spaceConsumed=-1, typeConsumed={-1,-1,-1,-1,-1}} from the Namenode to the Client. Because of the checks,
the value received by the Client will be {fileAndDirectoryCount=-1, spaceConsumed=-1, typeConsumed={0,0,0,0,0}}.
This is inconsistent, and I can't see any good reason why spaceConsumed may return -1 while
typeConsumed must be 0. In fact we don't need the checks: going through all the assignment statements
shows that QuotaUsage.typeConsumed and typeQuota are never null. And it's not
right for the conversion layer to tamper with the returned value. Since -1 represents undefined in
quota and usage, we should remove the checks and let the Namenode return -1.
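
The inconsistency described above can be reproduced with a small model of the conversion. This is a sketch only: it does not use the real PBHelperClient or protobuf classes, all names are hypothetical, and it assumes the guard treats the type values as present only when at least one of them is positive, with the receiver defaulting to zeros when the field is absent.

```java
import java.util.Arrays;

// Hypothetical model of the convert layer; not the real PBHelperClient.
final class TypeConsumedConversionSketch {
    static final int NUM_TYPES = 5;

    /** Assumed guard: values count as "available" only if one is positive. */
    static boolean anyPositive(long[] values) {
        for (long v : values) {
            if (v > 0) {
                return true;
            }
        }
        return false;
    }

    /** Sender side: returns null to model an omitted field on the wire. */
    static long[] encode(long[] typeConsumed, boolean guarded) {
        if (guarded && !anyPositive(typeConsumed)) {
            return null;
        }
        return typeConsumed.clone();
    }

    /** Receiver side: an omitted field falls back to all zeros. */
    static long[] decode(long[] wire) {
        return wire == null ? new long[NUM_TYPES] : wire.clone();
    }

    static long[] roundTrip(long[] typeConsumed, boolean guarded) {
        return decode(encode(typeConsumed, guarded));
    }

    public static void main(String[] args) {
        long[] undefined = {-1, -1, -1, -1, -1};
        // prints "with check:    [0, 0, 0, 0, 0]"
        System.out.println("with check:    " + Arrays.toString(roundTrip(undefined, true)));
        // prints "without check: [-1, -1, -1, -1, -1]"
        System.out.println("without check: " + Arrays.toString(roundTrip(undefined, false)));
    }
}
```

Under these assumptions, the guarded path turns an all -1 array into zeros on the receiving side, while the unguarded path passes the -1 values through unchanged.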

This message was sent by Atlassian JIRA

