hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12934) RBF: Federation supports global quota
Date Sat, 06 Jan 2018 16:17:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314725#comment-16314725 ]

Yiqun Lin edited comment on HDFS-12934 at 1/6/18 4:16 PM:
----------------------------------------------------------

Thanks for the review, Íñigo. Most of the comments make sense to me, except for the
following:

{quote}
Refactor getQuotaUsage(String path) to do the lastIndexOf and substring sequence only once.
Basically compose the if that does the check in the cache and the new isQuotaSet()
{quote}
The approach you describe is not quite what I intended. For example, take three mount table entries:
{noformat}
/path  <-- quota set
/path/subpath <---- quota set
/path/subpath/subpath <----no quota set
{noformat}
Now suppose we create a file {{file}} under {{/path/subpath/subpath/}}, so the path
{{/path/subpath/subpath/file}} is passed in. With the approach above, a single substring operation
yields {{/path/subpath/subpath}}, which has no quota set, even though its ancestor does.
My intention is to return the quota usage of the nearest ancestor that has a quota set
(here, the quota for {{/path/subpath}} should be found).
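
The lookup described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the class {{NearestQuotaLookup}}, its map of mount points and the method names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not taken from the HDFS-12934 patch. */
final class NearestQuotaLookup {
  /** Mount point path -> whether a quota is set on it (assumed data model). */
  private final Map<String, Boolean> quotaSet = new HashMap<>();

  void addMountPoint(String path, boolean hasQuota) {
    quotaSet.put(path, hasQuota);
  }

  /**
   * Walks from the given path up towards the root and returns the first
   * path that is a mount point with a quota set, or null if there is none.
   */
  String nearestPathWithQuota(String path) {
    String current = path;
    while (current != null && !current.isEmpty()) {
      if (Boolean.TRUE.equals(quotaSet.get(current))) {
        return current;
      }
      if (current.equals("/")) {
        break; // reached the root without finding a quota
      }
      int idx = current.lastIndexOf('/');
      if (idx < 0) {
        return null; // not an absolute path, nothing to walk up
      }
      // Parent of "/a" is "/"; parent of "/a/b" is "/a".
      current = (idx == 0) ? "/" : current.substring(0, idx);
    }
    return null;
  }
}
```

With the three mount table entries above, passing {{/path/subpath/subpath/file}} skips {{/path/subpath/subpath}} (no quota set) and stops at {{/path/subpath}}.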

{quote}
When merging the quotas in getQuotaUsage(), a single subcluster with quota unset removes the
whole thing for everybody? Should we sum the individual quotas?
{quote}
I don't fully understand this comment. When merging the quotas, the quota usage info returned
by a subcluster with no quota set is not empty. Only its quota value is -1, so the
usage can still be summed.
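
A minimal sketch of that point, assuming a simplified stand-in for the per-subcluster usage record (the class {{QuotaMergeSketch}}, its nested {{Usage}} type and the field names are hypothetical, not the patch's types). It only shows that usage values remain summable when a subcluster reports a quota of -1. It does not decide how the quota values themselves should be merged.

```java
import java.util.List;

/** Hypothetical illustration; not the merging code from the patch. */
final class QuotaMergeSketch {
  /** Value reported as the quota when no quota is set (per the comment above). */
  static final long QUOTA_UNSET = -1L;

  /** Simplified stand-in for the usage info returned by one subcluster. */
  static final class Usage {
    final long nsQuota;  // namespace quota, or QUOTA_UNSET
    final long nsUsed;   // files and directories consumed
    final long ssUsed;   // storage space consumed, in bytes

    Usage(long nsQuota, long nsUsed, long ssUsed) {
      this.nsQuota = nsQuota;
      this.nsUsed = nsUsed;
      this.ssUsed = ssUsed;
    }
  }

  /** Sums namespace usage; an unset quota does not affect the usage value. */
  static long sumNamespaceUsed(List<Usage> perSubcluster) {
    long total = 0;
    for (Usage u : perSubcluster) {
      total += u.nsUsed;
    }
    return total;
  }

  /** Sums storage space usage in the same way. */
  static long sumSpaceUsed(List<Usage> perSubcluster) {
    long total = 0;
    for (Usage u : perSubcluster) {
      total += u.ssUsed;
    }
    return total;
  }
}
```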

{quote}
Not sure if we can fully reuse DirectoryWithQuotaFeature.
{quote}
I didn't find a direct way to reuse it. DirectoryWithQuotaFeature is obtained from an INode, and we
don't do any path-to-INode resolution. However, I have updated the check to follow what
DirectoryWithQuotaFeature#verifyNamespaceQuota() does.
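
For reference, a check in that spirit might look like the sketch below. It is an assumption about the shape of such a check, not a copy of the HDFS code or of the patch: the class {{QuotaCheckSketch}}, its method signatures and the use of {{IllegalStateException}} are hypothetical stand-ins.

```java
/** Hypothetical illustration of a namespace quota check; not the HDFS implementation. */
final class QuotaCheckSketch {
  /** Quota value meaning "no quota set" (assumed, matching the -1 mentioned above). */
  static final long QUOTA_UNSET = -1L;

  /**
   * Returns true if adding delta to the current usage would exceed the quota.
   * A negative quota is treated as unset and therefore never violated.
   */
  static boolean isViolated(long quota, long used, long delta) {
    return quota >= 0 && delta > 0 && used > quota - delta;
  }

  /** Throws if the operation would push namespace usage past the quota. */
  static void verifyNamespaceQuota(long quota, long used, long delta) {
    if (isViolated(quota, used, delta)) {
      throw new IllegalStateException(
          "Namespace quota exceeded: quota=" + quota + ", usage=" + (used + delta));
    }
  }
}
```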

All other comments are addressed. I have attached the updated patch, which should be more readable.





> RBF: Federation supports global quota
> -------------------------------------
>
>                 Key: HDFS-12934
>                 URL: https://issues.apache.org/jira/browse/HDFS-12934
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>              Labels: RBF
>         Attachments: HDFS-12934.001.patch, HDFS-12934.002.patch, HDFS-12934.003.patch,
HDFS-12934.004.patch, HDFS-12934.005.patch, RBF support  global quota.pdf
>
>
> Federation currently does not support setting a global quota for each folder. At present the quota is applied to each subcluster under the specified folder via RPC call.
> It would be very useful for federation to support setting a global quota and to expose a command for it.
> In a federated environment, a folder can be spread across multiple subclusters. For this reason, we plan to solve this in the following way:
> # Set the global quota across each subcluster. No subcluster is allowed to exceed the maximum quota value.
> # Construct a <Path, QuotaUsage> cache map that stores the summed quota usage of the subclusters under a federation folder. Every time we do a WRITE operation under the specified folder, we get its quota usage from the cache and verify its quota. If the quota is exceeded, throw an exception; otherwise update the quota usage in the cache when the operation finishes.
> The quota will be stored in the mount table as a new field. The set/unset commands will look like:
> {noformat}
>  hdfs dfsrouteradmin -setQuota -ns <nsQuota> -ss <ssQuota> <mount table>
>  hdfs dfsrouteradmin -clrQuota  <mount table>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

