hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1751) Intrinsic limits for HDFS files, directories
Date Thu, 17 Mar 2011 20:52:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008122#comment-13008122 ]

Allen Wittenauer commented on HDFS-1751:
----------------------------------------

> This feature was suggested after a couple of incidents where user applications exhausted
> some resource by behaving in a way that was deeply wrong and (probably) unintended.
> Can HDFS fault bad jobs cheaply?

Artificial limits such as these are very simple to defeat, though.  Never underestimate a determined
user.  I can easily see the end result being that the user will just create X directories
and then create Y files under those directories using a hash structure. Whatever resource
was being exhausted will likely continue to be exhausted.
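
To make the workaround concrete, here is a minimal Java sketch of such a hash structure. It is an illustration only and not taken from any HDFS patch; the class name, the `hashedPath` helper, the base path, and the bucket and file counts are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: spreading file names across hashed subdirectories. */
class HashedLayoutSketch {

    /** Maps a file name to one of {@code buckets} subdirectories under {@code base}. */
    static String hashedPath(String base, String fileName, int buckets) {
        // floorMod keeps the bucket index non-negative even for negative hash codes.
        int bucket = Math.floorMod(fileName.hashCode(), buckets);
        return base + "/" + String.format("%04d", bucket) + "/" + fileName;
    }

    public static void main(String[] args) {
        int buckets = 100;    // "X directories" (assumed value)
        int files = 100_000;  // total names the job wants to create (assumed value)

        // Count how many names land in each hashed directory.
        Map<String, Integer> perDirectory = new HashMap<>();
        for (int i = 0; i < files; i++) {
            String path = hashedPath("/user/job/out", "part-" + i, buckets);
            String dir = path.substring(0, path.lastIndexOf('/'));
            perDirectory.merge(dir, 1, Integer::sum);
        }

        int largest = perDirectory.values().stream()
                .mapToInt(Integer::intValue).max().orElse(0);
        System.out.println("directories used:  " + perDirectory.size());
        System.out.println("largest directory: " + largest + " entries");
        System.out.println("total names:       " + files);
    }
}
```

Each directory stays well below the total, so a per-directory limit would not fire, yet the total number of names created is unchanged.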

(The only problem I can imagine is a job being given so many files as part of its
input path that it blows the heap.  But this is the wrong place to implement
that type of fix...)

> Intrinsic limits for HDFS files, directories
> --------------------------------------------
>
>                 Key: HDFS-1751
>                 URL: https://issues.apache.org/jira/browse/HDFS-1751
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1751-2.patch, HDFS-1751-3.patch, HDFS-1751-4.patch, HDFS-1751.patch
>
>
> Enforce a configurable limit on:
>   the length of a path component
>   the number of names in a directory
> The intention is to prevent a too-long name or a too-full directory. This is not about
> RPC buffers, the length of command lines, etc. There may be good reasons for those kinds of
> limits, but that is not the intended scope of this feature. Consequently, a reasonable implementation
> might be to extend the existing quota checker so that it faults the creation of a name that
> violates the limits. This strategy of faulting new creation evades the problem of existing
> names or directories that violate the limits.
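
The check described in the issue can be sketched in Java roughly as follows. This is a toy in-memory model under stated assumptions, not the code from the attached patches: `NamespaceLimitSketch`, `LimitExceededException`, `loadExisting`, and the limit values are hypothetical, a limit of 0 is assumed to mean "disabled", and name length is measured in characters for simplicity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: fault the creation of names that violate configured limits. */
class NamespaceLimitSketch {

    /** Unchecked exception used here to signal a violated limit (assumed name). */
    static class LimitExceededException extends RuntimeException {
        LimitExceededException(String message) { super(message); }
    }

    private final int maxComponentLength; // 0 means no limit (assumption)
    private final int maxDirectoryItems;  // 0 means no limit (assumption)
    private final Map<String, Set<String>> children = new HashMap<>();

    NamespaceLimitSketch(int maxComponentLength, int maxDirectoryItems) {
        this.maxComponentLength = maxComponentLength;
        this.maxDirectoryItems = maxDirectoryItems;
    }

    /** Loads a pre-existing entry without checking it, as when reading an old namespace. */
    void loadExisting(String parent, String name) {
        children.computeIfAbsent(parent, k -> new HashSet<>()).add(name);
    }

    /** Creates a new name under parent, faulting if either limit would be violated. */
    void create(String parent, String name) {
        if (maxComponentLength > 0 && name.length() > maxComponentLength) {
            throw new LimitExceededException("name too long: " + name.length()
                    + " > " + maxComponentLength);
        }
        Set<String> entries = children.computeIfAbsent(parent, k -> new HashSet<>());
        if (maxDirectoryItems > 0 && !entries.contains(name)
                && entries.size() >= maxDirectoryItems) {
            throw new LimitExceededException("directory full: " + parent
                    + " already has " + entries.size() + " entries");
        }
        entries.add(name);
    }

    /** Returns the number of entries recorded under parent. */
    int count(String parent) {
        return children.getOrDefault(parent, new HashSet<>()).size();
    }

    public static void main(String[] args) {
        NamespaceLimitSketch ns = new NamespaceLimitSketch(8, 2);
        ns.create("/data", "a");
        ns.create("/data", "b");
        try {
            ns.create("/data", "c"); // third entry exceeds the directory limit
        } catch (LimitExceededException e) {
            System.out.println("faulted: " + e.getMessage());
        }
    }
}
```

Because the check runs only in `create`, entries brought in through `loadExisting` are never examined, which mirrors the point that faulting new creation sidesteps names or directories that already violate the limits.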

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
