hadoop-common-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13335) Add an option to suppress the 'use yarn jar' warning or remove it
Date Wed, 06 Jul 2016 19:59:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364973#comment-15364973 ]

Siddharth Seth commented on HADOOP-13335:

bq. It's going to be hard to take yarn jar or hadoop jar away. It's doubtful they will ever
get removed. That said, we can at least make them act and work the same way. To me, that's
the ultimate goal and it's pretty close to what happens in trunk:
1. yarn command sucks in yarn-env.sh, hadoop-env.sh, yarn-config.sh and hadoop-config.sh in
a way that should be mostly conflict-free. (non-yarn commands do not pull in yarn-x.sh, obviously)
2. If YARN_OPTS is defined, yarn x (jar, rmadmin, etc) will use it but throw a deprecation warning
3. Otherwise use HADOOP_OPTS
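
The precedence in steps 2 and 3 above can be sketched roughly as follows. This is only an
illustration, not the code in trunk: the function name resolve_jvm_opts and the wording of
the warning are assumptions made for this example.

```shell
# Illustrative sketch only; resolve_jvm_opts is a hypothetical helper,
# not a function from hadoop-functions.sh.
resolve_jvm_opts() {
  if [ -n "${YARN_OPTS:-}" ]; then
    # The legacy variable is still honored, but flagged. The warning goes to
    # stderr so that stdout carries only the resolved options.
    echo "WARNING: YARN_OPTS is deprecated; use HADOOP_OPTS instead." >&2
    printf '%s\n' "${YARN_OPTS}"
  else
    # No YARN-specific override: fall back to the common variable.
    printf '%s\n' "${HADOOP_OPTS:-}"
  fi
}
```

With both variables set, the sketch prints the value of YARN_OPTS plus a warning on stderr;
with only HADOOP_OPTS set, it prints that value and stays silent.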

It is going to be hard to remove either of these. I don't know when something that has been
deprecated in a previous release has actually been removed.
If hadoop jar and yarn jar should behave the same - and 1) 'yarn jar' is not the preferred
usage, or 2) 'yarn jar' behaves as an alias - I'd be in favor of removing this warning altogether.
Don't encourage users to use yarn jar over hadoop jar, and don't advertise yarn jar.

In terms of removing the warning - I'm all for it, and it is the preferred approach to 'fix'
this jar (that's the first suggestion in the description).

This is arguable - but printing a new warning in the 2.7.0 release can be considered to be
an incompatible change. Besides being an annoyance to users, it breaks Hive silent mode,
since Hive was not written to work with output from the hadoop command.

Let's keep the change / detailed recommendation in the 'help' output - and remove it from the
hadoop jar invocation.
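
If the warning were kept rather than removed, the "option to suppress" from the issue title
might look something like the sketch below. Everything in it is hypothetical: the variable
HADOOP_SUPPRESS_YARN_JAR_WARNING and the function warn_use_yarn_jar are names made up for
this example and do not come from any of the attached patches, and the message text is
only illustrative.

```shell
# Hypothetical sketch: neither the function nor the variable name below
# is taken from a Hadoop patch.
warn_use_yarn_jar() {
  # Stay silent when the (hypothetical) opt-out variable is non-empty.
  if [ -n "${HADOOP_SUPPRESS_YARN_JAR_WARNING:-}" ]; then
    return 0
  fi
  # Write to stderr so the warning is never mixed into stdout.
  echo "WARNING: Use \"yarn jar\" to launch YARN applications." >&2
}
```

A wrapper such as hive could then export the variable once before calling hadoop jar,
instead of filtering the command's output.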

bq. Very long term (post-3.x), it would probably be better if hive called hadoop-config.sh
and/or hadoop-functions.sh directly. This would bypass the middleman and give much better
control. I'd be very interested to hear what sort of holes we have in the functionality here
that makes this hard/impossible. Off the top, I suspect we need to make one big function of
the series of function calls in hadoop-config.sh, but would love to hear your insight on this.

I don't know enough about the internals of the scripts to have an educated opinion on this.
The hadoop scripts, the required environment variables, and how they interact seem quite complicated.
Setting HADOOP_CLIENT_OPTS is apparently the way to change the hive client heap size. I would
expect hive to control this independently, and not depend on variables exported by hadoop
scripts. One possible approach is for Hadoop to provide basic information - CLASSPATH, configs.
Products like Hive would build on top of this information and define their own mechanism for
users to specify various environment variables, rather than trying to use the hadoop scripts.
hadoop jar is obviously useful for custom tools, which want a simple way to execute on a hadoop cluster.
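
As a rough sketch of that split, assuming Hadoop only supplies the classpath (for example via
the existing 'hadoop classpath' command) and the product owns its own settings: the names
build_hive_client_cmd and HIVE_CLIENT_HEAP_MB, the 256 MB default, and the main class are all
placeholders invented for this example, not anything Hive or Hadoop defines.

```shell
# Sketch only: build_hive_client_cmd, HIVE_CLIENT_HEAP_MB and org.example.Main
# are hypothetical placeholders.
build_hive_client_cmd() (
  # $1: command that prints the Hadoop classpath (defaults to `hadoop classpath`).
  classpath_cmd="${1:-hadoop classpath}"
  # The heap size is controlled by the product's own variable, not by HADOOP_CLIENT_OPTS.
  heap_mb="${HIVE_CLIENT_HEAP_MB:-256}"
  # Fail if the classpath cannot be obtained.
  hadoop_cp="$(${classpath_cmd})" || exit 1
  # Print the java command line instead of executing it, to keep the sketch inspectable.
  printf 'java -Xmx%sm -cp %s %s\n' "${heap_mb}" "${hadoop_cp}" "org.example.Main"
)
```

The function body runs in a subshell, so its variables do not leak into the caller, and the
classpath provider is a parameter so the sketch can be tried without a Hadoop installation.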

> Add an option to suppress the 'use yarn jar' warning or remove it
> -----------------------------------------------------------------
>                 Key: HADOOP-13335
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13335
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: HADOOP-13335.01.patch, HADOOP-13335.02.patch, HADOOP-13335.02_branch-2.patch,
> HADOOP-13335.03.patch, HADOOP-13335.03_branch-2.patch, HADOOP-13335.04.patch
> https://issues.apache.org/jira/browse/HADOOP-11257 added a 'deprecation' warning for
> 'hadoop jar'.
> hadoop jar is used for a lot more than starting jobs. As an example - hive uses it to
> start all its services (HiveServer2, the hive client, beeline etc).
> Using 'yarn jar' to start these services / tools doesn't make a lot of sense - there's
> no relation to yarn other than requiring the classpath to include yarn libraries.
> I'd propose reverting the changes where this message is printed if YARN variables are
> set (leave it in the help message), or adding a mechanism which would allow users to suppress

This message was sent by Atlassian JIRA
