hadoop-common-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8490) Add Configuration to FileSystem cache key
Date Thu, 07 Jun 2012 15:20:23 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291068#comment-13291068 ]

Daryn Sharp commented on HADOOP-8490:

Not honoring the given conf causes the obvious problem of not being able to tweak values.

It's also causing a problem for the NM.  When an app is done, it should be able to call {{FileSystem.closeAllForUGI}}
just like the JT does.  Unfortunately that may pull the rug out from under another app for
that user also running on the NM.  It also means that multiple jobs for the same user are
erroneously using the first job's conf.  Both are probably latent issues in the JT but go
unnoticed or are masked by retries.
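
To make the two failure modes above concrete, here is a toy sketch.  It is not Hadoop code: {{java.util.Properties}} stands in for {{Configuration}}, and the class {{SharedCacheProblemSketch}}, its {{Fs}} handle, and {{closeAllForUser}} are hypothetical names invented for illustration.  The only assumption taken from the discussion is that the cache key does not include the conf.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Toy model only: Properties stands in for Hadoop's Configuration. */
final class SharedCacheProblemSketch {

    /** Hypothetical filesystem handle that remembers the conf it was created with. */
    static final class Fs {
        final Properties conf;
        boolean closed;

        Fs(Properties conf) {
            this.conf = conf;
        }
    }

    // Keyed by "user@uri" only; the conf is deliberately NOT part of the key.
    private final Map<String, Fs> cache = new HashMap<>();

    /** Returns the cached instance for (user, uri), ignoring the conf on a cache hit. */
    Fs get(String user, String uri, Properties conf) {
        return cache.computeIfAbsent(user + "@" + uri, k -> new Fs(conf));
    }

    /** Toy analogue of closing every cached instance that belongs to one user. */
    void closeAllForUser(String user) {
        cache.entrySet().removeIf(e -> {
            if (e.getKey().startsWith(user + "@")) {
                e.getValue().closed = true;
                return true;
            }
            return false;
        });
    }
}
```

If two jobs for the same user call {{get}} with different confs, the second job receives the instance built from the first job's conf, and a {{closeAllForUser}} issued when the first job finishes also closes the handle the second job is still using.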

Ideally the {{hashCode}} or {{identityHashCode}} would be added to the cache key.  A key/value
equivalence test should not be performed because seemingly identical confs (ex. cloned from
each other) would initially appear the same but may later change.  One potential issue is
cloned confs that really should be the same -- ex. yarn often creates a {{YarnConfiguration(conf)}}.
 This won't be a problem if the conversion is done once and stashed.  If it's done on the
fly multiple times, then it does present a problem.  Arguably that would be a bug but it would
be difficult to fix in a timely manner.
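
A minimal sketch of the identity-based key, to show why it differs from a key/value comparison.  This is not the actual {{FileSystem}} cache code: {{java.util.Properties}} stands in for {{Configuration}}, and {{IdentityConfKeySketch}} and its {{Key}} class are hypothetical.  As an assumption of this sketch, {{equals}} compares the conf by reference and {{identityHashCode}} is used only for hashing, since identity hash codes are not guaranteed to be unique.

```java
import java.util.Objects;
import java.util.Properties;

/** Sketch only: Properties stands in for Hadoop's Configuration. */
final class IdentityConfKeySketch {

    /** Hypothetical cache key: scheme + authority + identity of the conf object. */
    static final class Key {
        final String scheme;
        final String authority;
        final Properties conf; // compared by reference, never by contents

        Key(String scheme, String authority, Properties conf) {
            this.scheme = scheme;
            this.authority = authority;
            this.conf = conf;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return scheme.equals(other.scheme)
                && authority.equals(other.authority)
                && conf == other.conf; // identity, not key/value equivalence
        }

        @Override
        public int hashCode() {
            // identityHashCode ignores the conf's contents, so later
            // mutations of the conf do not move the key between buckets.
            return Objects.hash(scheme, authority, System.identityHashCode(conf));
        }
    }
}
```

With this key, a conf and its clone map to different cache entries even while their contents are identical, which is the intended behaviour for confs that may later diverge and the problematic one for wrappers that are re-created on the fly.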

So an alternative is to add a key to the conf (ex. {{fs.cache-id}}) that can be used in the
fs cache key.  This would allow partitioning of the cache, albeit imperfectly, that would
account for cloned confs that should be treated the same.  The onus is placed upon the caller
to explicitly change the key when needed, but it would be more transparent for existing code.
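
A rough sketch of this alternative, again with {{java.util.Properties}} standing in for {{Configuration}}.  The property name {{fs.cache-id}} is the one suggested above; the class {{CacheIdPartitionSketch}}, the string key format, and the empty-string default are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Sketch only: Properties stands in for Hadoop's Configuration. */
final class CacheIdPartitionSketch {

    /** Property name suggested in the comment; the empty default is an assumption. */
    static final String CACHE_ID_KEY = "fs.cache-id";

    // Values are plain Objects standing in for cached filesystem instances.
    private final Map<String, Object> cache = new HashMap<>();

    /** Builds a key from the URI parts plus whatever cache id the conf carries. */
    static String cacheKey(String scheme, String authority, Properties conf) {
        String cacheId = conf.getProperty(CACHE_ID_KEY, "");
        return scheme + "://" + authority + "#" + cacheId;
    }

    /** Confs with the same cache id share an instance; a different id gets its own. */
    Object get(String scheme, String authority, Properties conf) {
        return cache.computeIfAbsent(cacheKey(scheme, authority, conf), k -> new Object());
    }
}
```

A cloned conf inherits the id and keeps sharing the cached instance, so existing code sees no change.  A caller that needs isolation sets a distinct {{fs.cache-id}} on its own conf before asking for a filesystem.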

I'll wait for comments before proceeding.
> Add Configuration to FileSystem cache key
> -----------------------------------------
>                 Key: HADOOP-8490
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8490
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
> The {{FileSystem#get(URI, Configuration)}} does not take the given {{Configuration}} into
consideration before returning an existing fs instance from the cache with a possibly different
