hadoop-common-commits mailing list archives

From a.@apache.org
Subject [18/21] hadoop git commit: HADOOP-13360. Documentation for HADOOP_subcommand_OPTS
Date Fri, 09 Sep 2016 11:08:48 GMT
HADOOP-13360. Documentation for HADOOP_subcommand_OPTS

Signed-off-by: Allen Wittenauer <aw@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/b417ba32
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/b417ba32
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/b417ba32

Branch: refs/heads/HADOOP-13341
Commit: b417ba32d4d202bf0bc43ad5b6d17336b3b58ee6
Parents: d294200
Author: Allen Wittenauer <aw@apache.org>
Authored: Wed Aug 31 07:39:34 2016 -0700
Committer: Allen Wittenauer <aw@apache.org>
Committed: Fri Sep 9 04:07:43 2016 -0700

----------------------------------------------------------------------
 .../src/site/markdown/ClusterSetup.md           | 19 ++++-------
 .../src/site/markdown/UnixShellGuide.md         | 34 +++++++++++++++++---
 .../src/site/markdown/HdfsNfsGateway.md         |  2 +-
 3 files changed, 37 insertions(+), 18 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/b417ba32/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md b/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
index 0d551b1..f222769 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
@@ -64,17 +64,17 @@ Administrators can configure individual daemons using the configuration options
 
 | Daemon | Environment Variable |
 |:---- |:---- |
-| NameNode | HADOOP\_NAMENODE\_OPTS |
-| DataNode | HADOOP\_DATANODE\_OPTS |
-| Secondary NameNode | HADOOP\_SECONDARYNAMENODE\_OPTS |
+| NameNode | HDFS\_NAMENODE\_OPTS |
+| DataNode | HDFS\_DATANODE\_OPTS |
+| Secondary NameNode | HDFS\_SECONDARYNAMENODE\_OPTS |
 | ResourceManager | YARN\_RESOURCEMANAGER\_OPTS |
 | NodeManager | YARN\_NODEMANAGER\_OPTS |
 | WebAppProxy | YARN\_PROXYSERVER\_OPTS |
-| Map Reduce Job History Server | HADOOP\_JOB\_HISTORYSERVER\_OPTS |
+| Map Reduce Job History Server | MAPRED\_HISTORYSERVER\_OPTS |
 
-For example, To configure Namenode to use parallelGC, the following statement should be added in hadoop-env.sh :
+For example, To configure Namenode to use parallelGC and a 4GB Java Heap, the following statement should be added in hadoop-env.sh :
 
-      export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"
+      export HDFS_NAMENODE_OPTS="-XX:+UseParallelGC -Xmx4g"
 
 See `etc/hadoop/hadoop-env.sh` for other examples.
 
@@ -91,13 +91,6 @@ It is also traditional to configure `HADOOP_HOME` in the system-wide shell envir
       HADOOP_HOME=/path/to/hadoop
       export HADOOP_HOME
 
-| Daemon | Environment Variable |
-|:---- |:---- |
-| ResourceManager | YARN\_RESOURCEMANAGER\_HEAPSIZE |
-| NodeManager | YARN\_NODEMANAGER\_HEAPSIZE |
-| WebAppProxy | YARN\_PROXYSERVER\_HEAPSIZE |
-| Map Reduce Job History Server | HADOOP\_JOB\_HISTORYSERVER\_HEAPSIZE |
-
 ### Configuring the Hadoop Daemons
 
 This section deals with important parameters to be specified in the given configuration files:

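The renames documented in this message can be summarized as a lookup, which may help when auditing an existing `hadoop-env.sh`. The following is a minimal sketch only: `new_opts_name` is a hypothetical helper, not part of Apache Hadoop, and it covers just the names that this commit's documentation changes (including the NFS3 rename from the HdfsNfsGateway.md diff in this message).

```bash
# Hypothetical helper (an assumption for illustration, not a Hadoop function):
# print the documented replacement for a variable name that this commit's
# documentation renames; any other name is printed unchanged.
new_opts_name() {
  case "$1" in
    HADOOP_NAMENODE_OPTS)          echo "HDFS_NAMENODE_OPTS" ;;
    HADOOP_DATANODE_OPTS)          echo "HDFS_DATANODE_OPTS" ;;
    HADOOP_SECONDARYNAMENODE_OPTS) echo "HDFS_SECONDARYNAMENODE_OPTS" ;;
    HADOOP_JOB_HISTORYSERVER_OPTS) echo "MAPRED_HISTORYSERVER_OPTS" ;;
    HADOOP_NFS3_OPTS)              echo "HDFS_NFS3_OPTS" ;;
    *)                             echo "$1" ;;
  esac
}
```

For instance, `new_opts_name HADOOP_NAMENODE_OPTS` prints `HDFS_NAMENODE_OPTS`, while a name such as `YARN_NODEMANAGER_OPTS`, which the table leaves unchanged, is echoed back as-is.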
http://git-wip-us.apache.org/repos/asf/hadoop/blob/b417ba32/hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md b/hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md
index 940627d..b130f0f 100644
--- a/hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md
@@ -24,7 +24,7 @@ Apache Hadoop has many environment variables that control various aspects of the
 
 ### `HADOOP_CLIENT_OPTS`
 
-This environment variable is used for almost all end-user operations.  It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
+This environment variable is used for all end-user, non-daemon operations.  It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
 
 ```bash
 HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /tmp
@@ -32,6 +32,18 @@ HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /
 
 will increase the memory and send this command via a SOCKS proxy server.
 
+### `(command)_(subcommand)_OPTS`
+
+It is also possible to set options on a per subcommand basis.  This allows for one to create special options for particular cases.  The first part of the pattern is the command being used, but all uppercase.  The second part of the pattern is the subcommand being used, also in uppercase.  The pattern then ends with the string `_OPTS`.
+
+For example, to configure `mapred distcp` to use a 2GB heap, one would use:
+
+```bash
+MAPRED_DISTCP_OPTS="-Xmx2g"
+```
+
+These options will appear *after* `HADOOP_CLIENT_OPTS` during execution and will generally take precedence.
+
 ### `HADOOP_CLASSPATH`
 
   NOTE: Site-wide settings should be configured via a shellprofile entry and permanent user-wide settings should be configured via ${HOME}/.hadooprc using the `hadoop_add_classpath` function. See below for more information.
@@ -56,6 +68,8 @@ For example:
 #
 
 HADOOP_CLIENT_OPTS="-Xmx1g"
+MAPRED_DISTCP_OPTS="-Xmx2g"
+HADOOP_DISTCP_OPTS="-Xmx2g"
 ```
 
 The `.hadoop-env` file can also be used to extend functionality and teach Apache Hadoop new tricks.  For example, to run hadoop commands accessing the server referenced in the environment variable `${HADOOP_SERVER}`, the following in the `.hadoop-env` will do just that:
@@ -71,11 +85,23 @@ One word of warning:  not all of Unix Shell API routines are available or work c
 
 ## Administrator Environment
 
-There are many environment variables that impact how the system operates.  By far, the most important are the series of `_OPTS` variables that control how daemons work.  These variables should contain all of the relevant settings for those daemons.
+In addition to the various XML files, there are two key capabilities for administrators to configure Apache Hadoop when using the Unix Shell:
+
+  * Many environment variables that impact how the system operates.  This guide will only highlight some key ones.  There is generally more information in the various `*-env.sh` files.
+
+  * Supplement or do some platform-specific changes to the existing scripts.  Apache Hadoop provides the capabilities to do function overrides so that the existing code base may be changed in place without all of that work.  Replacing functions is covered later under the Shell API documentation.
+
+### `(command)_(subcommand)_OPTS`
+
+By far, the most important are the series of `_OPTS` variables that control how daemons work.  These variables should contain all of the relevant settings for those daemons.
+
+Similar to the user commands above, all daemons will honor the `(command)_(subcommand)_OPTS` pattern.  It is generally recommended that these be set in `hadoop-env.sh` to guarantee that the system will know which settings it should use on restart.  Unlike user-facing subcommands, daemons will *NOT* honor `HADOOP_CLIENT_OPTS`.
+
+In addition, daemons that run in an extra security mode also support `(command)_(subcommand)_SECURE_EXTRA_OPTS`.  These options are *supplemental* to the generic `*_OPTS` and will appear after, therefore generally taking precedence.
 
-More, detailed information is contained in `hadoop-env.sh` and the other env.sh files.
+### `(command)_(subcommand)_USER`
 
-Advanced administrators may wish to supplement or do some platform-specific fixes to the existing scripts.  In some systems, this means copying the errant script or creating a custom build with these changes.  Apache Hadoop provides the capabilities to do function overrides so that the existing code base may be changed in place without all of that work.  Replacing functions is covered later under the Shell API documentation.
+Apache Hadoop provides a way to do a user check per-subcommand.  While this method is easily circumvented and should not be considered a security feature, it does provide a mechanism by which to prevent accidents.  For example, setting `HDFS_NAMENODE_USER=hdfs` will make the `hdfs namenode` and `hdfs --daemon start namenode` commands verify that the user running the command is the hdfs user by checking the `USER` environment variable.  This also works for non-daemons.  Setting `HADOOP_DISTCP_USER=jane` will verify that `USER` is set to `jane` before being allowed to execute the `hadoop distcp` command.
 
 ## Developer and Advanced Administrator Environment
 
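The `(command)_(subcommand)_OPTS` naming and ordering described in the UnixShellGuide.md changes above can be illustrated with a short sketch. Both functions below are hypothetical helpers written for this note, not Hadoop's shell API, and they only model the documented behavior for user-facing subcommands: the variable name is the uppercased command and subcommand plus `_OPTS`, and its value is placed after `HADOOP_CLIENT_OPTS`. A later occurrence of a repeated JVM flag such as `-Xmx` generally wins, which is the likely reason the per-subcommand value takes precedence.

```bash
# Hypothetical helper (not a Hadoop function): build the variable name
# (COMMAND)_(SUBCOMMAND)_OPTS from a command and a subcommand.
subcommand_opts_var() {
  printf '%s_%s_OPTS\n' "$1" "$2" | tr '[:lower:]' '[:upper:]'
}

# Hypothetical helper: model the documented ordering for user commands.
# HADOOP_CLIENT_OPTS comes first; the per-subcommand options follow it.
# Assumes the command and subcommand contain only letters and digits,
# because the derived name is expanded with eval.
effective_client_opts() {
  local var sub_opts
  var=$(subcommand_opts_var "$1" "$2")
  eval "sub_opts=\${${var}:-}"
  printf '%s\n' "${HADOOP_CLIENT_OPTS:-} ${sub_opts}"
}
```

With `HADOOP_CLIENT_OPTS="-Xmx1g"` and `MAPRED_DISTCP_OPTS="-Xmx2g"` set, `effective_client_opts mapred distcp` prints `-Xmx1g -Xmx2g`, so the 2GB heap setting is the last one on the command line. Per the documentation above, daemons do not honor `HADOOP_CLIENT_OPTS`, so this sketch applies to non-daemon subcommands only.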

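The per-subcommand user check documented in the same diff can be sketched in a similar way. This is an illustration under stated assumptions rather than Hadoop's actual implementation: `verify_subcommand_user` is a hypothetical name, and the error message text is invented for the example. As the documentation notes, a check based on the `USER` environment variable is easy to circumvent and only guards against accidents.

```bash
# Hypothetical helper (not Hadoop's implementation): succeed when no
# (COMMAND)_(SUBCOMMAND)_USER variable is set, or when it matches ${USER}.
# Assumes the command and subcommand contain only letters and digits,
# because the derived name is expanded with eval.
verify_subcommand_user() {
  local var expected
  var=$(printf '%s_%s_USER' "$1" "$2" | tr '[:lower:]' '[:upper:]')
  eval "expected=\${${var}:-}"
  if [ -z "${expected}" ]; then
    return 0  # no restriction configured for this subcommand
  fi
  if [ "${USER:-}" != "${expected}" ]; then
    # Illustrative message only; the real scripts may word this differently.
    echo "ERROR: $1 $2 can only be executed by ${expected}." >&2
    return 1
  fi
  return 0
}
```

With `HDFS_NAMENODE_USER=hdfs` set, `verify_subcommand_user hdfs namenode` returns success only when `USER` is `hdfs`; a subcommand without a matching `_USER` variable is always allowed.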
http://git-wip-us.apache.org/repos/asf/hadoop/blob/b417ba32/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md
index 6731189..4742637 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md
@@ -183,7 +183,7 @@ It's strongly recommended for the users to update a few configuration properties
         </property>
 
 *   JVM and log settings. You can export JVM settings (e.g., heap size and GC log) in
-    HADOOP\_NFS3\_OPTS. More NFS related settings can be found in hadoop-env.sh.
+    HDFS\_NFS3\_OPTS. More NFS related settings can be found in hadoop-env.sh.
     To get NFS debug trace, you can edit the log4j.property file
     to add the following. Note, debug trace, especially for ONCRPC, can be very verbose.
 

