hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12130) Optimizing permission check for getContentSummary
Date Thu, 13 Jul 2017 22:23:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086520#comment-16086520 ]

Tsz Wo Nicholas Sze commented on HDFS-12130:

The patch looks good.  One problem is that inode.getPathComponents() is called for each inode
in the subtree and getPathComponents() is expensive since it builds the full path.  We should
avoid calling it for each inode.
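
One way to avoid the per-inode call, sketched here under stated assumptions: build the path components once for the subtree root, then push and pop a single name while descending. `Node`, `pathComponents()`, and `SubtreeWalk` are hypothetical stand-ins written for this illustration; they are not the HDFS `INode` API and not the code in the patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for an inode; not the real HDFS INode class. */
final class Node {
    final String name;
    final Node parent;
    final List<Node> children = new ArrayList<>();

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    /** Costly variant: walks up to the root on every call, O(depth). */
    List<String> pathComponents() {
        Deque<String> parts = new ArrayDeque<>();
        for (Node n = this; n != null; n = n.parent) {
            parts.addFirst(n.name);
        }
        return new ArrayList<>(parts);
    }
}

final class SubtreeWalk {
    /** Collects the path of every node in the subtree using one shared prefix. */
    static List<String> collectPaths(Node root) {
        List<String> out = new ArrayList<>();
        // The full path is built once, for the subtree root only.
        List<String> prefix = new ArrayList<>(root.pathComponents());
        prefix.remove(prefix.size() - 1); // walk() re-adds the root's own name
        walk(root, prefix, out);
        return out;
    }

    private static void walk(Node node, List<String> prefix, List<String> out) {
        prefix.add(node.name);                 // extend by one component
        out.add(String.join("/", prefix));
        for (Node child : node.children) {
            walk(child, prefix, out);
        }
        prefix.remove(prefix.size() - 1);      // restore the prefix for siblings
    }
}
```

With this shape, each inode costs one list append and one removal rather than a walk back to the root. Joining the prefix into a string is only done here to make the result visible; a permission check would use the component list directly.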

> Optimizing permission check for getContentSummary
> -------------------------------------------------
>                 Key: HDFS-12130
>                 URL: https://issues.apache.org/jira/browse/HDFS-12130
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-12130.001.patch, HDFS-12130.002.patch
> Currently, {{getContentSummary}} takes two phases to complete:
> - phase 1. Check the permission of the entire subtree. If any subdirectory does not have {{READ_EXECUTE}}, an access control exception is thrown and {{getContentSummary}} terminates here (unless the caller is the super user).
> - phase 2. If phase 1 passed, traverse the entire tree recursively to get the actual content summary.
> An issue is that both phases currently hold the fs lock.
> Phase 2 is already written to yield the fs lock periodically, so that it does not block other operations for too long. However, phase 1 does not yield, which means the permission check phase can still block other operations for a long time.
> One fix is to add a lock yield to phase 1. But a simpler fix is to merge phase 1 into phase 2. Namely, instead of doing a full traversal for the permission check first, we start with phase 2 directly, but for each directory, check its permission before obtaining its summary. This way we take advantage of the existing lock yield in the phase 2 code and are still able to check permissions and terminate on an access exception.
> Thanks [~szetszwo] for the offline discussions!
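
The merged approach in the description can be sketched roughly as follows. This is a minimal illustration, not the patch itself: `Dir`, `MergedSummary`, the `yieldEvery` threshold, and the use of `SecurityException` in place of an HDFS access control exception are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical directory model; not the real HDFS INodeDirectory. */
final class Dir {
    final String name;
    final boolean readExecute;   // stands in for the READ_EXECUTE permission check
    final long fileCount;
    final List<Dir> children = new ArrayList<>();

    Dir(String name, boolean readExecute, long fileCount) {
        this.name = name;
        this.readExecute = readExecute;
        this.fileCount = fileCount;
    }

    Dir add(Dir child) {
        children.add(child);
        return this;
    }
}

final class MergedSummary {
    private final ReentrantLock fsLock = new ReentrantLock(); // stand-in for the fs lock
    private final int yieldEvery;      // hypothetical yield threshold, in directories
    private int visitedSinceYield = 0;
    int yields = 0;                    // exposed only so the sketch can be observed

    MergedSummary(int yieldEvery) {
        this.yieldEvery = yieldEvery;
    }

    /** Single traversal: permission check and summary happen together. */
    long summarize(Dir root, boolean superUser) {
        fsLock.lock();
        try {
            return visit(root, superUser);
        } finally {
            fsLock.unlock();
        }
    }

    private long visit(Dir dir, boolean superUser) {
        // Former phase 1, now done per directory before its summary is taken.
        if (!superUser && !dir.readExecute) {
            throw new SecurityException("Permission denied: " + dir.name);
        }
        long total = dir.fileCount;
        // Existing phase 2 behavior: release the lock now and then.
        if (++visitedSinceYield >= yieldEvery) {
            visitedSinceYield = 0;
            yields++;
            fsLock.unlock();   // let waiting operations run
            fsLock.lock();     // a real implementation would revalidate state here
        }
        for (Dir child : dir.children) {
            total += visit(child, superUser);
        }
        return total;
    }
}
```

One consequence of merging the phases is that a denial deep in the tree is only discovered after part of the summary work has been done, whereas the two-phase version fails before any counting starts. The sketch also ignores what happens if the tree changes while the lock is released.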

