hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-14976) Allow overriding HADOOP_SHELL_EXECNAME
Date Tue, 24 Oct 2017 21:11:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217705#comment-16217705 ]

Allen Wittenauer edited comment on HADOOP-14976 at 10/24/17 9:10 PM:
---------------------------------------------------------------------

bq. since the calling script always knows what is necessary? 

I'd need to be convinced this is true.  A lot of the work done in the shell script rewrite
and the follow-on work was to make the "front end" scripts as dumb as possible in order to
centralize the program logic.  This gave huge benefits in the form of script consistency,
testing, and more.

Besides, EXECNAME is used for *very* specific things:

e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported. 

.. and that's even before 3rd party add-ons that might expect HADOOP_SHELL_EXECNAME to work
as expected.


If distributions really are renaming the scripts (which is extremely problematic for lots
of reasons), there isn't much of a reason they couldn't just tuck them away in a non-PATH
directory and use the same names or even just rewrite the scripts directly.  (See above about
removing as much logic as possible.)
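
To illustrate the "tuck them away" option, here is a minimal sketch and not a recommendation for any particular distribution. The directory layout, the wrapper name foo.bar, and the stand-in hdfs script are all hypothetical; the only thing taken from the real scripts is the two-line name inference. The vendor-named wrapper does nothing except exec the unrenamed script, so the inferred execname stays hdfs:

```shell
#!/usr/bin/env bash
# Sketch only: all paths and the name "foo.bar" are hypothetical.
workdir="$(mktemp -d)"
mkdir -p "${workdir}/libexec-private" "${workdir}/bin"

# Stand-in for the real front-end script. It infers its own name the same
# way the hdfs script does, then just prints it.
cat > "${workdir}/libexec-private/hdfs" <<'EOF'
#!/usr/bin/env bash
MYNAME="${BASH_SOURCE-$0}"
HADOOP_SHELL_EXECNAME="${MYNAME##*/}"
echo "${HADOOP_SHELL_EXECNAME}"
EOF

# Vendor-named wrapper that would live on PATH. It carries no logic of its
# own and simply execs the unrenamed script in the non-PATH directory.
cat > "${workdir}/bin/foo.bar" <<EOF
#!/usr/bin/env bash
exec "${workdir}/libexec-private/hdfs" "\$@"
EOF

chmod +x "${workdir}/libexec-private/hdfs" "${workdir}/bin/foo.bar"

# Run the wrapper and capture the name the inner script inferred.
inferred="$("${workdir}/bin/foo.bar" datanode)"
echo "${inferred}"

rm -rf "${workdir}"
```

Because the inner script is still called hdfs, anything keyed off HADOOP_SHELL_EXECNAME would keep seeing the standard name in this arrangement.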

I've had in my head a "vendor" version of hadoop-user-functions.sh, but I'm not sure if even
that would help here.  It really depends upon why the bin scripts are getting renamed,
whether the problem being solved is actually more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME though.


was (Author: aw):
bq. since the calling script always knows what is necessary? 

I'd need to be convinced this is true.  A lot of the work done in the shell script rewrite
and the follow-on work was to make the "front end" scripts as dumb as possible in order to
centralize the program logic.  This gave huge benefits in the form of script consistency,
testing, and more.

Besides, CLASSNAME and EXECNAME are used for *very* different things and aren't guaranteed
to match.

e.g.:

https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L67
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh#L20

are great examples where the execname is exactly what needs to be reported. 

.. and that's even before 3rd party add-ons that might expect HADOOP_SHELL_EXECNAME to work
as expected.


If distributions really are renaming the scripts (which is extremely problematic for lots
of reasons), there isn't much of a reason they couldn't just tuck them away in a non-PATH
directory and use the same names or even just rewrite the scripts directly.  (See above about
removing as much logic as possible.)

I've had in my head a "vendor" version of hadoop-user-functions.sh, but I'm not sure if even
that would help here.  It really depends upon why the bin scripts are getting renamed,
whether the problem being solved is actually more appropriate for hadoop-layout.sh, etc.

I see nothing but pain and misfortune for mucking with HADOOP_SHELL_EXECNAME though.

> Allow overriding HADOOP_SHELL_EXECNAME
> --------------------------------------
>
>                 Key: HADOOP-14976
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14976
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Arpit Agarwal
>
> Some Hadoop shell scripts infer their own name using this bit of shell magic:
> {code}
> MYNAME="${BASH_SOURCE-$0}"
> HADOOP_SHELL_EXECNAME="${MYNAME##*/}"
> {code}
> e.g. see the [hdfs|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs#L18] script.
> The inferred shell script name is later passed to _hadoop-functions.sh_ which uses it
> to construct the names of some environment variables. E.g. when invoking _hdfs datanode_,
> the options variable name is inferred as follows:
> {code}
> # HDFS + DATANODE + OPTS -> HDFS_DATANODE_OPTS
> {code}
> This works well if the calling script name is the standard {{hdfs}} or {{yarn}}. If a distribution
> renames the script to something like {{foo.bar}}, then the variable names will be inferred as
> {{FOO.BAR_DATANODE_OPTS}}. This is not a valid bash variable name.
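
The inference described in the issue can be reproduced with a small, hedged sketch. The helpers build_opts_var and is_valid_var_name are hypothetical and are not functions from hadoop-functions.sh; they only mimic the EXECNAME + SUBCOMMAND + OPTS pattern and check the result against the usual shell naming rule (a letter or underscore first, then letters, digits, or underscores):

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the options variable name from a script
# path and a subcommand, following EXECNAME + SUBCOMMAND + OPTS.
build_opts_var() {
  # ${1##*/} strips any leading directory, like ${MYNAME##*/} in the scripts.
  printf '%s_%s_OPTS\n' "${1##*/}" "$2" | tr '[:lower:]' '[:upper:]'
}

# Hypothetical helper: succeed only if $1 is a valid shell variable name.
is_valid_var_name() {
  case "$1" in
    '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Compare the standard script name with a renamed one.
for script in /usr/bin/hdfs /usr/bin/foo.bar; do
  var="$(build_opts_var "${script}" datanode)"
  if is_valid_var_name "${var}"; then
    echo "${var}: valid"
  else
    echo "${var}: invalid"
  fi
done
```

With the standard name the result is HDFS_DATANODE_OPTS, which passes the check; with foo.bar the dot survives into FOO.BAR_DATANODE_OPTS, which fails it.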



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

