hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1751) Intrinsic limits for HDFS files, directories
Date Thu, 17 Mar 2011 21:38:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008157#comment-13008157 ]

Daryn Sharp commented on HDFS-1751:
-----------------------------------

(I'll avoid the design dispute and explain why the implementation is the way it is.)

The main quota method, updateCount(), is too low in the call stack and is designed to handle
disk and inode changes.  Having updateCount() perform the limit checks would cause problems
because too many other operations call it, and updateCount() can't discern why it has been invoked.
A few examples of the failures that would occur:
1) adding or removing a block from a file will fail if the directory item limit has been reached
2) changing the replication factor on a pre-existing file that exceeds either of the limits
will fail
3) updating the disk quota counts via updateSpaceConsumed() will fail if either of the limits
has been reached
...etc...
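
To make the first example concrete, here is a minimal toy sketch of the problem. It is not the
HDFS source: the Dir class, the MAX_DIR_ITEMS value, and every method signature below are
simplified stand-ins assumed for illustration. The only point is that a check placed in a
shared low-level method also fires for callers that never add a name.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes and signatures are hypothetical, not the HDFS source.
class SharedQuotaSketch {
    static final int MAX_DIR_ITEMS = 2; // assumed limit, chosen for the demo

    static class Dir {
        final List<String> children = new ArrayList<>();
        long diskUsed = 0;
    }

    // Stand-in for a low-level quota method shared by many operations.
    // It receives only a disk delta, so it cannot tell why it was called.
    static void updateCount(Dir dir, long diskDelta) {
        // Naive placement of the directory item limit check.
        if (dir.children.size() >= MAX_DIR_ITEMS) {
            throw new IllegalStateException("directory item limit reached");
        }
        dir.diskUsed += diskDelta;
    }

    // Adding a name: the one operation the limit is meant to guard.
    static void addChild(Dir dir, String name, long size) {
        updateCount(dir, size);
        dir.children.add(name);
    }

    // Adding a block to an existing file: no new name, but the same code path.
    static void addBlock(Dir dir, long blockSize) {
        updateCount(dir, blockSize);
    }

    public static void main(String[] args) {
        Dir dir = new Dir();
        addChild(dir, "a", 10);
        addChild(dir, "b", 10); // directory is now at the assumed limit
        try {
            addBlock(dir, 64);  // unrelated to the item count, yet rejected
        } catch (IllegalStateException e) {
            System.out.println("addBlock failed: " + e.getMessage());
        }
    }
}
```

In this toy model, once the directory is full, growing an existing file is rejected even though
no new directory entry is being created.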

To address these issues, at a minimum, a boolean would need to be passed to updateCount() to
indicate whether a filesystem directory update is occurring (i.e., only addChild() would pass true).
The new INode would also need to be passed to updateCount() to check the component length.
This would be a more complex change that places an undue burden on the callers of updateCount()
to pass the right args, just to avoid having addChild perform the fs limit checks.
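
The trade-off can be sketched side by side. Again this is a toy model under stated assumptions,
not the actual patch: the limits, the checkFsLimits() helper, the A/B method names, and the use
of a plain String in place of the new INode are all hypothetical. Option A threads the extra
arguments through the low-level method; option B leaves that method alone and checks in the
add-child path.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names, limits, and signatures here are hypothetical.
class LimitPlacementSketch {
    static final int MAX_COMPONENT_LENGTH = 8; // assumed limit
    static final int MAX_DIR_ITEMS = 2;        // assumed limit

    static class Dir {
        final List<String> children = new ArrayList<>();
        long diskUsed = 0;
    }

    // Shared check: component length first, then directory item count.
    static void checkFsLimits(Dir dir, String newName) {
        if (newName.length() > MAX_COMPONENT_LENGTH) {
            throw new IllegalArgumentException("component name too long: " + newName);
        }
        if (dir.children.size() >= MAX_DIR_ITEMS) {
            throw new IllegalStateException("directory item limit reached");
        }
    }

    // Option A: thread a flag and the new name through the low-level method.
    // Every caller must now supply both, even those that never add a name.
    static void updateCountA(Dir dir, long diskDelta, boolean isDirUpdate, String newName) {
        if (isDirUpdate) {
            checkFsLimits(dir, newName);
        }
        dir.diskUsed += diskDelta;
    }

    static void addChildA(Dir dir, String name, long size) {
        updateCountA(dir, size, true, name);
        dir.children.add(name);
    }

    static void addBlockA(Dir dir, long blockSize) {
        updateCountA(dir, blockSize, false, null); // arguments this caller has no use for
    }

    // Option B: updateCount stays unaware of the limits; addChild checks them itself.
    static void updateCountB(Dir dir, long diskDelta) {
        dir.diskUsed += diskDelta;
    }

    static void addChildB(Dir dir, String name, long size) {
        checkFsLimits(dir, name);
        updateCountB(dir, size);
        dir.children.add(name);
    }

    static void addBlockB(Dir dir, long blockSize) {
        updateCountB(dir, blockSize);
    }
}
```

Both options behave the same in this sketch; the difference is that option A changes the
signature seen by every caller, while option B touches only the add-child path.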

Please let me know if I'm overlooking anything in my analysis.

> Intrinsic limits for HDFS files, directories
> --------------------------------------------
>
>                 Key: HDFS-1751
>                 URL: https://issues.apache.org/jira/browse/HDFS-1751
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1751-2.patch, HDFS-1751-3.patch, HDFS-1751-4.patch, HDFS-1751.patch
>
>
> Enforce a configurable limit on:
>   the length of a path component
>   the number of names in a directory
> The intention is to prevent a too-long name or a too-full directory. This is not about
> RPC buffers, the length of command lines, etc. There may be good reasons for those kinds of
> limits, but that is not the intended scope of this feature. Consequently, a reasonable implementation
> might be to extend the existing quota checker so that it faults the creation of a name that
> violates the limits. This strategy of faulting new creation evades the problem of existing
> names or directories that violate the limits.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
