hadoop-hdfs-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3513) HttpFS should cache filesystems
Date Thu, 07 Jun 2012 16:54:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291123#comment-13291123 ]

Alejandro Abdelnur commented on HDFS-3513:

It is a bit more complicated than that. For some reason (if I recall correctly, this was
due to some JT requirements), the FileSystem cache is per UGI, but UGI uses object equality
instead of username equality. So if you have 2 UGIs for the same user, you end up with 2
FileSystem instances in the FileSystem cache. (I ran into this problem with Oozie when Hadoop
security was introduced, and I was told that is the way it has to be.)

But going back to this patch: you won't have leaks. The entries in the fsCache in HttpFS
remain in the cache, but their FileSystem instances are closed after the timeout. The
worst-case scenario I'm referring to is about the number of CachedFileSystem entries in the
fsCache in HttpFS: the map that serves as the cache is not being purged of entries, but the
FileSystem instances in the entries are certainly closed after the timeout, thus no sockets
in TIME_WAIT (especially because with this cache HttpFS is quite aggressive about closing
the FileSystem instances).
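
To make the distinction between cache entries and open instances concrete, here is a
minimal sketch in plain Java. It is not the code from the patch: FakeFileSystem, FsCache,
the method names, and the choice to key the map by user name are all assumptions made for
illustration, and time is passed in explicitly so the behavior is deterministic. It only
shows the shape of the behavior described above: the map keeps one entry per user
indefinitely, while the instance inside an entry is closed once it has been idle past the
timeout.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FsCacheSketch {

    /** Hypothetical stand-in for a backend FileSystem; it only tracks open/closed. */
    public static class FakeFileSystem {
        private boolean open = true;
        public boolean isOpen() { return open; }
        public void close() { open = false; }
    }

    /** One cache entry; holds at most one open instance at a time. */
    public static class CachedFileSystem {
        private final long timeoutMillis;
        private FakeFileSystem fs;   // null while no instance is open
        private int inUse;           // number of requests currently using fs
        private long lastUse;        // time of the most recent release

        public CachedFileSystem(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

        public synchronized FakeFileSystem acquire() {
            if (fs == null) {
                fs = new FakeFileSystem();   // reopen lazily after a purge
            }
            inUse++;
            return fs;
        }

        public synchronized void release(long now) {
            inUse--;
            lastUse = now;
        }

        /** Closes the instance if idle past the timeout; the entry itself survives. */
        public synchronized void purgeIfIdle(long now) {
            if (fs != null && inUse == 0 && now - lastUse >= timeoutMillis) {
                fs.close();
                fs = null;
            }
        }

        public synchronized boolean hasOpenInstance() { return fs != null; }
    }

    /** Map from user name to entry (assumed keying); entries are never removed. */
    public static class FsCache {
        private final Map<String, CachedFileSystem> entries = new ConcurrentHashMap<>();
        private final long timeoutMillis;

        public FsCache(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

        public FakeFileSystem acquire(String user) {
            return entries
                .computeIfAbsent(user, u -> new CachedFileSystem(timeoutMillis))
                .acquire();
        }

        public void release(String user, long now) { entries.get(user).release(now); }

        public void purge(long now) { entries.values().forEach(e -> e.purgeIfIdle(now)); }

        public int entryCount() { return entries.size(); }

        public int openCount() {
            return (int) entries.values().stream()
                .filter(CachedFileSystem::hasOpenInstance).count();
        }
    }

    public static void main(String[] args) {
        FsCache cache = new FsCache(1000);
        cache.acquire("alice");
        cache.release("alice", 0);
        cache.purge(2000);   // idle past the timeout: instance closed, entry kept
        System.out.println("entries=" + cache.entryCount() + " open=" + cache.openCount());
    }
}
```

In this sketch the map grows with the number of distinct users seen, which corresponds to
the worst case mentioned above, while the number of open instances drops back to zero once
the users go idle.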

Hope this clarifies.
> HttpFS should cache filesystems
> -------------------------------
>                 Key: HDFS-3513
>                 URL: https://issues.apache.org/jira/browse/HDFS-3513
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: HDFS-3513.patch, HDFS-3513.patch
> HttpFS opens and closes a FileSystem instance against the backend filesystem (typically
HDFS) on every request. The FileSystem caching is not used, as it does not have
expiration/timeout and filesystem instances in there live forever. For long-running services
like HttpFS this is not a good thing, as it would keep connections open to the NN.


