hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5120) add command-line support for manipulating cache pools
Date Mon, 09 Sep 2013 18:21:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762122#comment-13762122
] 

Colin Patrick McCabe commented on HDFS-5120:
--------------------------------------------

bq. listCachePools javadoc says CachePool, I think you meant CachePoolInfo

fixed

bq. Unnecessary StringUtils import

OK

bq. We moved DEFAULT_MODE from CachePool into the FsPermission class in HDFS-5163, looks like
a rebase error

fixed

bq. Same comment for MODE_FORMAT_STRING, let's just use FsPermission's toString.

fixed

bq. validateName is only called in one place, dubious if we need this as a separate function.

disagree; type validation belongs with the type

bq. Unnecessary Arrays and Iterator imports

fixed

bq. need javadoc on new methods

added

bq. What's the reason for terminating on "--"? It's usually used to separate sets of variadic
positional args, but I don't see that in any of the new methods.

The name of the cache pool is a positional argument in {{addCachePool}}.  I don't want mistyped
options being interpreted as the cache pool name here; if you type "-onwer", you should get
a warning, not a silent misinterpretation.  People who want to use cache pool names that begin
with dash can put them after the double-dash.  Finally, these functions are intended to be
generic, and the double-dash is a standard mechanism.

I considered many alternate solutions here.  CommandFormat only deals with flags, not options
that can take an argument, or positional arguments.  It's not very helpful for this use-case.
 Commons CLI is a bigger framework which replaces a lot of stuff that DFSAdmin.java is doing,
such as providing general and specific help.  The new snapshots stuff didn't extend DFSAdminCommand
either, probably because it's not a very useful base class at the moment.  If you want to
file a JIRA for converting DFSAdmin to using Commons-CLI, I'd be all for it.  That effort
needs to be made for the whole file, not just individual commands, to really make anything
better.

bq. can we split the packing and unpacking of CachePoolInfo in modifyCachePool into new methods?
Function is kind of long right now.

The problem is, those new methods would have to have lots and lots of arguments, since we're
dealing with mode, group, weight, name, owner, etc. etc.  I think it's better as-is.

bq. I think we should table-print like ls -l output instead of printing the field name for
each pool. Should look cleaner. This will require updating the doc too.

I agree that there should be a one (or maybe two) line format available, but it's not going
to be that easy to implement.  We'd need column headers to tell people what they're reading.
 People have been reading the output of ls -l for years, but knowing how to interpret a row
of numbers from dfsadmin will be more challenging.

ls -l seems simple, but it actually uses some tricks to make columns line up properly and
avoid ragged-looking output.  There may already be some code in Hadoop that does this, but
we'll have to track it down and see if it's general enough to be used here.

Also keep in mind that we're going to add a lot more things to the pool info before we're
done.  Statistics about how much data is cached, parameters for controlling resource allocation,
etc. etc.

So on the whole, let's file a JIRA for prettier verbose output, and do that once the main
stuff is out there.

bq. I also don't think we need the non-verbose mode, or it should be inverted as -quiet, since
we're fetching the extra data regardless (unlike ls -l).

It's provided for the convenience of shell scripts, mostly.  I guess a better solution would
be to allow the user to specify a format string, like ls does.  I guess let's remove non-verbose
mode for now and punt that to the same "prettier ls" JIRA.

bq. We can use a positional argument rather than a named one for the name, again like our
friend ls

OK.

bq. Please use one space after a period in the help text, again per convention.

I was midway through changing this when I noticed that actually, two spaces after periods
is the convention elsewhere in the file.  Look at 'String safemode'
{code}
...
      "\t\tpercentage of blocks satisfies the minimum replication\n" +
      "\t\tcondition.  Safe mode can also be entered manually, but then\n" +
      "\t\tit can only be turned off manually as well.\n";
{code}

Also, the ASF license header at the top of the file also uses two spaces after each period
:)  The only case where this isn't true is where a newline immediately follows a period.
                
> add command-line support for manipulating cache pools
> -----------------------------------------------------
>
>                 Key: HDFS-5120
>                 URL: https://issues.apache.org/jira/browse/HDFS-5120
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5120-caching.001.patch, HDFS-5120-caching.002.patch
>
>
> We should add command-line support for creating, removing, and listing cache directives
and manipulating cache pools.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message