hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8898) Create API and command-line argument to get quota without need to get file and directory counts
Date Wed, 16 Sep 2015 18:05:46 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-8898:
    Attachment: HDFS-8898.patch

Here is a patch to illustrate the idea. Applications have the option to get {{QuotaUsage}}
for any directory that has a quota set. It contains the quota and the usage, which allows
the NN to use the cached data directly.

For a regular user, the NN's recursive file permission check still takes time, but at least
getting the actual usage is fast, so the overall latency of getting quota usage is lower than
that of getContentSummary. For a super user, there is no traversal at all, so it takes just
a few milliseconds for any large directory.
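
To make the difference concrete, here is a minimal, self-contained sketch of the idea. It is
not the code in the attached patch: {{Dir}}, {{SketchQuotaUsage}}, {{refreshCache}} and the
other names are hypothetical stand-ins, and the cache is filled once rather than being
maintained incrementally as the NN would do. It models only the usage lookup; the permission
check that a regular user still goes through is not modelled for that path.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotaUsageSketch {

    /** Hypothetical in-memory directory node; not the NameNode's real inode class. */
    public static final class Dir {
        final String name;
        final boolean readable;   // may a regular user list this directory?
        final long fileCount;     // files directly inside this directory
        final List<Dir> children = new ArrayList<>();
        long nsQuota = -1;        // -1 means "no quota set" in this sketch
        long cachedUsage = 0;     // files + directories in the whole subtree

        Dir(String name, boolean readable, long fileCount) {
            this.name = name;
            this.readable = readable;
            this.fileCount = fileCount;
        }

        Dir add(Dir child) {
            children.add(child);
            return this;
        }
    }

    /** Hypothetical result type holding a quota and its cached usage. */
    public static final class SketchQuotaUsage {
        public final long quota;
        public final long usage;

        SketchQuotaUsage(long quota, long usage) {
            this.quota = quota;
            this.usage = usage;
        }
    }

    /** Stand-in for the bookkeeping done on every create/delete: fill the cache once. */
    static long refreshCache(Dir d) {
        long total = 1 + d.fileCount;              // the directory itself plus its files
        for (Dir c : d.children) {
            total += refreshCache(c);
        }
        d.cachedUsage = total;
        return total;
    }

    /** Content-summary style count: walks the subtree and checks access on the way. */
    public static long contentSummaryCount(Dir d, boolean superUser) {
        if (!superUser && !d.readable) {
            throw new SecurityException("Permission denied: " + d.name);
        }
        long total = 1 + d.fileCount;
        for (Dir c : d.children) {
            total += contentSummaryCount(c, superUser);
        }
        return total;
    }

    /** Quota-usage style lookup: reads the cached value, no traversal. */
    public static SketchQuotaUsage quotaUsage(Dir d) {
        if (d.nsQuota < 0) {
            throw new IllegalStateException("No quota set on " + d.name);
        }
        return new SketchQuotaUsage(d.nsQuota, d.cachedUsage);
    }

    /** A home directory that hosts a sub-directory containing an unreadable child. */
    public static Dir buildExample() {
        Dir secret = new Dir("secret", false, 1);
        Dir shared = new Dir("shared", true, 3).add(secret);
        Dir home = new Dir("joep", true, 2).add(shared);
        home.nsQuota = 100;
        refreshCache(home);
        return home;
    }

    public static void main(String[] args) {
        Dir home = buildExample();
        SketchQuotaUsage qu = quotaUsage(home);
        System.out.println("quota=" + qu.quota + " usage=" + qu.usage);
        try {
            contentSummaryCount(home, false);
        } catch (SecurityException e) {
            System.out.println("traversal as regular user failed: " + e.getMessage());
        }
    }
}
```

The lookup cost does not depend on the size of the subtree, whereas the traversal visits every
directory and stops at the first one the caller cannot read.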

> Create API and command-line argument to get quota without need to get file and directory
> -----------------------------------------------------------------------------------------------
>                 Key: HDFS-8898
>                 URL: https://issues.apache.org/jira/browse/HDFS-8898
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs
>            Reporter: Joep Rottinghuis
>         Attachments: HDFS-8898.patch
> On large directory structures it takes significant time to iterate through the file and
> directory counts recursively to get a complete ContentSummary.
> When you want to just check for the quota on a higher level directory it would be good
> to have an option to skip the file and directory counts.
> Moreover, currently one can only check the quota if you have access to all the directories
> underneath. For example, if I have a large home directory under /user/joep and I host some
> files for another user in a sub-directory, the moment they create an unreadable sub-directory
> under my home I can no longer check what my quota is. Understood that I cannot check the current
> file counts unless I can iterate through all the usage, but for administrative purposes it
> is nice to be able to get the current quota setting on a directory without the need to iterate
> through and run into permission issues on sub-directories.

