hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9902) Shell script rewrite
Date Thu, 20 Feb 2014 17:46:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907228#comment-13907228 ]

Allen Wittenauer commented on HADOOP-9902:

Let's talk about the 'classpath' subcommand. 

Today, hadoop classpath returns the classpath of common, hdfs, and yarn.  To me, this seems
to be the wrong behavior for two major reasons:

* Common has to have knowledge about subsystems that rely upon it.  Ultimately, this is
a reverse dependency and (I hope) we can all agree those are bad.
* If I'm building an application that only needs access to (common|hdfs|yarn|mapreduce), my
classpath is polluted with extra garbage from the other subproject(s) that I may or may not
need.  (yarn does offer a classpath subcommand but it's essentially the same thing as the
hadoop classpath.  The mapreduce classpath is... yeah...)

On the plus side, it's one-stop shopping.  "Hooray! I get everything!", some developer likely
said somewhere.

So I'd like to throw out a proposal.

I want to re-implement the classpath subcommand such that (hadoop|hdfs|yarn) only return the
base classpath for their project.  This is (obviously) an incompatible change. Someone who
wanted to know what all the classpaths were for all the projects would be required to run
all the commands.
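
As a rough sketch of what "run all the commands" could look like for someone who still wants everything: the function below merges several colon-separated classpaths and drops duplicates. It assumes that hadoop, hdfs, and yarn each offer a classpath subcommand with the proposed per-project behavior, which is not how things work today; merge_classpaths is a hypothetical helper name, not part of any patch.

```shell
# Hypothetical helper: merge colon-separated classpaths given as arguments,
# keeping the first occurrence of each entry and preserving order.
merge_classpaths() (
  # The parentheses run the body in a subshell, so the IFS and globbing
  # changes below do not leak into the caller.
  set -f        # keep wildcard entries such as /some/lib/* literal
  IFS=':'
  merged=""
  for cp in "$@"; do
    for entry in $cp; do              # unquoted on purpose: split on ':'
      [ -z "$entry" ] && continue     # skip empty fields (e.g. "a::b")
      case ":${merged}:" in
        *":${entry}:"*) ;;            # already present, skip it
        *) merged="${merged:+${merged}:}${entry}" ;;
      esac
    done
  done
  printf '%s\n' "$merged"
)

# Possible usage, assuming the proposed per-project subcommands exist:
#   merge_classpaths "$(hadoop classpath)" "$(hdfs classpath)" "$(yarn classpath)"
```

Order is preserved on purpose, since classpath order can decide which copy of a class gets loaded.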

To make up for it, however, I believe I can *easily* introduce a classpath subcommand for
*every* command that uses the common framework.  For the non-major commands, I suspect this
would be a massive win for debugging.  "What the heck is start-dfs.sh using when it fires
up the namenode?", said myself many many times but using more curse words, some of which you
might not have heard before.  

Another choice might be to have some tricky logic to have subprojects 'register' into the
main project on install such that commands like 'hadoop classpath' now know about those
subprojects.  It won't solve the second bullet point, but it does fix the first.
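
To make the 'register' idea a bit more concrete, here is one way it might work, purely as an illustration. Each subproject would drop a small file into a shared directory at install time, and the main command would read whatever it finds there instead of hard-coding knowledge of its dependents. The directory layout, the .cp suffix, and the function name registered_classpath are all assumptions made up for this sketch.

```shell
# Hypothetical: build a classpath from registration files in a directory.
# Each *.cp file is assumed to hold one classpath entry per line.
registered_classpath() (
  dir=$1
  result=""
  for f in "$dir"/*.cp; do
    [ -f "$f" ] || continue           # nothing registered: pattern matched no file
    while IFS= read -r line || [ -n "$line" ]; do
      [ -z "$line" ] && continue      # ignore blank lines
      result="${result:+${result}:}${line}"
    done < "$f"
  done
  printf '%s\n' "$result"
)

# Possible usage (the path is made up for illustration):
#   registered_classpath /etc/hadoop/classpath.d
```

Common would then only need to know about the directory, not about hdfs or yarn themselves.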


> Shell script rewrite
> --------------------
>                 Key: HADOOP-9902
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9902
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Allen Wittenauer
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt
> Umbrella JIRA for shell script rewrite.  See more-info.txt for more details.

This message was sent by Atlassian JIRA