hadoop-common-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13341) Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS
Date Fri, 09 Sep 2016 10:59:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476796#comment-15476796 ]

ASF GitHub Bot commented on HADOOP-13341:

Github user aw-was-here commented on a diff in the pull request:

    --- Diff: hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md ---
    @@ -24,14 +24,26 @@ Apache Hadoop has many environment variables that control various aspects of the
    -This environment variable is used for almost all end-user operations.  It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
    +This environment variable is used for all end-user, non-daemon operations.  It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
     HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /tmp
     will increase the memory and send this command via a SOCKS proxy server.
    +### `(command)_(subcommand)_OPTS`
    +It is also possible to set options on a per subcommand basis.  This allows for one to create special options for particular cases.  The first part of the pattern is the command being used, but all uppercase.  The second part of the command is the subcommand being used.  Then finally followed by the string `_OPT`.
    +For example, to configure `mapred distcp` to use a 2GB heap, one would use:
    +These options will appear *after* `HADOOP_CLIENT_OPTS` during execution and will generally take precedence.
    --- End diff --
    If there is an Xmx in HADOOP_CLIENT_OPTS and an Xmx in MAPRED_DISTCP_OPTS, then the final HADOOP_OPTS for mapred distcp will definitely have two Xmx flags.  After HADOOP-13365, we'll be in a position to potentially de-dupe user-provided settings like we do for other things.  But until we de-dupe, you're correct that it's a JVM decision.  In the past, that decision has been "last one wins", and I doubt Oracle could change it if they wanted to at this point without major ramifications.
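
The ordering described in this comment can be sketched in a few lines of POSIX shell. This is only an illustration under stated assumptions: `build_opts` is a hypothetical helper, not the function Hadoop's shell scripts actually use, and it assumes the per-subcommand variable is simply appended after `HADOOP_CLIENT_OPTS` with no de-duplication.

```shell
# Hypothetical helper for illustration; not the function Hadoop's scripts use.
# The body runs in a subshell "( ... )" so its variables stay local.
build_opts() (
  # Derive the variable name: "mapred" "distcp" -> MAPRED_DISTCP_OPTS
  var=$(printf '%s_%s_OPTS' "$1" "$2" | tr '[:lower:]' '[:upper:]')
  # Read the variable whose name is stored in $var (portable indirection).
  eval "sub_opts=\${$var:-}"
  # Client options first, per-subcommand options last; nothing is de-duplicated.
  printf '%s\n' "${HADOOP_CLIENT_OPTS:-} ${sub_opts}"
)

HADOOP_CLIENT_OPTS="-Xmx1g"
MAPRED_DISTCP_OPTS="-Xmx2g"
build_opts mapred distcp   # prints: -Xmx1g -Xmx2g
```

Both `-Xmx` flags survive in the combined string, so which one takes effect is left to the JVM, as the comment notes.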

> Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS
> --------------------------------------------------------------------------
>                 Key: HADOOP-13341
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13341
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Allen Wittenauer
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-13341.00.patch
> Big features like YARN-2928 demonstrate that even senior-level Hadoop developers forget that daemons need a custom _OPTS env var.  We can replace all of the custom vars with generic handling just like we do for the username check.
> For example, with generic handling in place:
> || Old Var || New Var ||
> This makes it:
> a) consistent across the entire project
> b) consistent for every subcommand
> c) eliminates almost all of the custom appending in the case statements
> It's worth pointing out that subcommands like distcp that sometimes need a higher than normal client-side heapsize or custom options are a huge win.  Combined with .hadooprc and/or dynamic subcommands, it means users can easily do customizations based upon their needs without a lot of weirdo shell aliasing or one line shell scripts off to the side.
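
As a hedged illustration of the customization described above, a user's `.hadooprc` could carry a per-subcommand setting instead of a shell alias. The variable name follows the `(command)_(subcommand)_OPTS` pattern and matches the `MAPRED_DISTCP_OPTS` name used in the review comment; the heap value is only an example.

```shell
# Example .hadooprc fragment (illustrative value, not a recommendation):
# give `mapred distcp` a larger client-side heap than other commands.
MAPRED_DISTCP_OPTS="-Xmx2g"
```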

