hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
Date Wed, 11 Sep 2013 06:54:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764061#comment-13764061 ]

Aaron T. Myers commented on HDFS-5158:

bq. "CachedPath" would be a misleading name, since we may or may not actually be able to cache
the path entries in PathCache. Resources aren't infinite. Bear in mind that there are going
to be other caches that don't operate by path name-- one example is the LRU cache we've talked
about. PathCache, as well as PathCacheDirective, PathCacheEntry, etc. are named the way they
are to distinguish them from the (future) LruCacheDirective, etc. classes which don't exist

Even with this justification and the context of the future "Lru*" classes, there's just no
way you can get away from people interpreting "path cache" to mean "a cache of paths," which
I find to be vastly more misleading/confusing than "CachedPath" would be. How do you feel
about "CachePath" (no 'd') as I also suggested? That doesn't necessarily imply that anything
is already cached, and also appears to work with the future classes you mentioned here, e.g.
CacheLruDirective, CacheLruEntry, etc.

bq. For example, Impala or Hive may want to add many cache directives at once.

I'm a tad skeptical this will in fact be the case, given that the caching directive can potentially
provide a directory as the path, but it seems fairly harmless to leave in the RPCs that take
lists as arguments.

bq. We discussed this on HDFS-5052. The short summary is that paths don't uniquely identify
path cache directives. You can have multiple directives that apply to the same path.

Could you even have multiple directives for the same path within a single pool? Or would the
(pool, path) pair uniquely identify the cache directive?
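
To make the question concrete, here is a minimal Java sketch of the two identity models being weighed. Everything in it is hypothetical: DirectiveKeySketch, PoolPath, addByPoolPath and addById are illustrative names, not classes or methods from the patch or from HDFS. Under the first model a (pool, path) pair is the identity, so a second add of the same pair is rejected. Under the second model every directive gets its own numeric ID, so the same pair can appear more than once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch only; none of these names come from the HDFS-5158 patch.
class DirectiveKeySketch {
    // Value type holding a (pool, path) pair, with equality over both fields.
    static final class PoolPath {
        final String pool;
        final String path;

        PoolPath(String pool, String path) {
            this.pool = pool;
            this.path = path;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof PoolPath)) {
                return false;
            }
            PoolPath other = (PoolPath) o;
            return pool.equals(other.pool) && path.equals(other.path);
        }

        @Override
        public int hashCode() {
            return Objects.hash(pool, path);
        }
    }

    // Model A: the (pool, path) pair is the identity of a directive.
    private final Set<PoolPath> byPoolPath = new HashSet<>();

    // Model B: each directive gets a unique ID; pairs may repeat.
    private final Map<Long, PoolPath> byId = new HashMap<>();
    private long nextId = 1;

    // Returns false if this (pool, path) pair is already present.
    boolean addByPoolPath(String pool, String path) {
        return byPoolPath.add(new PoolPath(pool, path));
    }

    // Always succeeds and returns a fresh ID, even for a repeated pair.
    long addById(String pool, String path) {
        long id = nextId++;
        byId.put(id, new PoolPath(pool, path));
        return id;
    }

    int countById() {
        return byId.size();
    }

    public static void main(String[] args) {
        DirectiveKeySketch sketch = new DirectiveKeySketch();
        System.out.println(sketch.addByPoolPath("pool1", "/warehouse/t1"));
        System.out.println(sketch.addByPoolPath("pool1", "/warehouse/t1"));
        System.out.println(sketch.addById("pool1", "/warehouse/t1"));
        System.out.println(sketch.addById("pool1", "/warehouse/t1"));
    }
}
```

If the (pool, path) pair turns out to be unique, removal commands could address a directive by that pair; if not, they need something like the ID in the second model.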

bq. I did not change the prefix in this patch. Does it make sense to put the prefix change
stuff in another JIRA? It seems like it will be a bigger effort, if we're moving the -addCachePool,
etc. commands as well.

I'd personally do it in this JIRA; it really shouldn't be that much work. But if you feel
strongly about it, you can do it in a separate JIRA.

> add command-line support for manipulating cache directives
> ----------------------------------------------------------
>                 Key: HDFS-5158
>                 URL: https://issues.apache.org/jira/browse/HDFS-5158
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch,
> We should add command-line support for creating, removing, and listing cache directives.

