hadoop-common-commits mailing list archives

From: Apache Wiki <wikidi...@apache.org>
Subject: [Hadoop Wiki] Update of "ShellScriptProgrammingGuide" by SomeOtherAccount
Date: Fri, 15 Aug 2014 13:59:59 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ShellScriptProgrammingGuide" page has been changed by SomeOtherAccount:
https://wiki.apache.org/hadoop/ShellScriptProgrammingGuide?action=diff&rev1=12&rev2=13

  
   5. This is where the majority of your code goes.  Programs should process the rest of the arguments and do whatever the script is supposed to do.
  
-  6. Before executing a Java program or giving user output, call `hadoop_finalize`.  This
finishes up the configuration details: adds the user class path, fixes up any missing Java
properties, configures library paths, etc.
+  6. Before executing a Java program (preferably via `hadoop_java_exec`) or giving user output, call `hadoop_finalize`.  This finishes up the configuration details: adds the user class path, fixes up any missing Java properties, configures library paths, etc.
  
   7. End with either an `exit` or an `exec`.  This should return 0 for success and 1 or higher for failure.  A sketch of steps 5 through 7 follows.
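  A minimal sketch of steps 5 through 7, assuming the earlier argument processing chose a class to run.  The class name is made up, and `hadoop_java_exec` is assumed here to take the subcommand name, the class, and any remaining arguments:

{{{
# step 5 is done: argument processing picked the class to hand to the JVM
# (this class name is purely illustrative)
CLASS=org.apache.hadoop.example.ExampleTool

# step 6: finish the configuration details: user class path, missing
# Java properties, library paths, etc.
hadoop_finalize

# step 7: hadoop_java_exec exec's the JVM, so its exit status becomes
# the script's exit status
hadoop_java_exec "${COMMAND}" "${CLASS}" "$@"
}}}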
  
@@ -36, +36 @@

  
    c. For methods that can also be daemons, set `daemon=true`.  This allows the `--daemon` option to work.
  
-   d. For HDFS daemons, if it supports security, set `secure_service=true` and `secure_user`
equal to the user that should run the daemon.
+   d. If the daemon supports security, set `secure_service=true` and `secure_user` equal to the user that should run the daemon.  (A sketch combining (c) and (d) appears just before the Better Practices section.)
+ 
+  3. If a new subcommand needs one or more extra environment variables:
+ 
+   a. Add documentation and a '''commented-out''' example that shows the default setting.
+ 
+   b. Add the default(s) to that subproject's `hadoop_subproject_init`, or to `hadoop_basic_init` for common, using the current shell variables as a guide.  (Specifically, the default must allow overriding, as sketched below!)
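  As a sketch of (a) and (b), using a made-up variable name (`HADOOP_EXAMPLE_DIR`):

{{{
# in hadoop_basic_init (or a subproject's hadoop_subproject_init):
# keep whatever the user already exported, else fall back to the default
HADOOP_EXAMPLE_DIR=${HADOOP_EXAMPLE_DIR:-"/tmp/hadoop-example"}

# and in the matching *-env.sh, a commented-out example of the default:
# export HADOOP_EXAMPLE_DIR="/tmp/hadoop-example"
}}}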

+ 
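  Putting items (c) and (d) above together, a subcommand's case arm might look roughly like the following.  The DataNode is used purely as an illustration; the real scripts may differ in detail:

{{{
datanode)
  daemon="true"                        # makes the --daemon option work
  if [[ -n "${HADOOP_SECURE_DN_USER}" ]]; then
    secure_service="true"              # this daemon supports security
    secure_user="${HADOOP_SECURE_DN_USER}"
    CLASS="org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter"
  else
    CLASS="org.apache.hadoop.hdfs.server.datanode.DataNode"
  fi
;;
}}}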
  
  = Better Practices =
  
-  * Avoid adding more globals or project-specific globals and/or entries in *-env.sh.  In a lot of cases, there is pre-existing functionality that already does what you might need to do.  Additionally, every configuration option makes it that much harder for end users.  If you do need to add a new global variable for additional functionality, start it with HADOOP_ for common, HDFS_ for HDFS, YARN_ for YARN, and MAPRED_ for MapReduce.  It should be documented in either *-env.sh (for user-overridable parts) or hadoop-functions.sh (for internal-only globals).  This helps prevent our variables from clobbering other people's.
+  * Avoid adding more globals, project-specific globals, entries in *-env.sh, and/or entries in the list at the bottom of this page.  In a lot of cases, there is pre-existing functionality that already does what you might need to do.  Additionally, every configuration option makes things that much harder for end users.  If you do need to add a new global variable for additional functionality, start it with HADOOP_ for common, HDFS_ for HDFS, YARN_ for YARN, and MAPRED_ for MapReduce.  It should be documented in either *-env.sh (for user-overridable parts) or hadoop-functions.sh (for internal-only globals).  This helps prevent our variables from clobbering other people's.
  
-  * Remember that abc_xyz_OPTS can and should act as a catch-all for Java daemon options.
 Custom heap environment variables add unnecessary complexity for both the user and us.
+  * Remember that abc_xyz_OPTS can and should act as a catch-all for Java daemon options.
 Custom heap environment variables add unnecessary complexity for both the user and us.  They
should be avoided.
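  For instance, taking the NameNode's HADOOP_NAMENODE_OPTS as the abc_xyz_OPTS in question, heap and GC settings can be tuned without inventing any new variable (the flags shown are only an example):

{{{
# hadoop-env.sh: JVM flags ride in the existing catch-all; keeping the
# old value at the end preserves anything set earlier in the environment
export HADOOP_NAMENODE_OPTS="-Xms4g -Xmx4g -XX:+UseG1GC ${HADOOP_NAMENODE_OPTS}"
}}}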
  
   * Avoid multi-level `if`s where the comparisons are static strings.  Use `case` statements instead, as they are easier to read.
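  A small illustration, where `startcmd` and `stopcmd` stand in for real work:

{{{
# harder to read: stacked static-string comparisons
if [[ "${COMMAND}" = "start" ]]; then
  startcmd
elif [[ "${COMMAND}" = "stop" ]]; then
  stopcmd
fi

# easier to read: one case statement
case ${COMMAND} in
  start) startcmd ;;
  stop)  stopcmd ;;
esac
}}}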
  
-  * Avoid BSDisms, GNUisms, and SysVisms.  Double-check your esoteric commands and parameters on multiple operating systems.  (Usually a quick Google search will pull up man pages for other OSes.)
+  * Avoid BSDisms, GNUisms, and SysVisms.  Double-check your commands, parameters, and/or reading of OS files on multiple operating systems.  (Usually a quick Google search will pull up man pages for other OSes.)  In particular, check Linux, OS X, FreeBSD, Solaris, and AIX.  It's reasonable to expect code to work for approximately three to five years.  Also take note of `hadoop_os_tricks`, where OS-specific startup code can go, so long as a user would never want to change it.
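  The usual shape of such OS-specific handling, sketched here rather than quoted from `hadoop_os_tricks` itself:

{{{
case $(uname -s) in
  Darwin)
    # OS X-only startup settings go here
    ;;
  Linux)
    # Linux-only startup settings go here
    ;;
  FreeBSD|SunOS|AIX)
    # ...and so on for the other platforms worth supporting
    ;;
esac
}}}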
  
   * Output to the screen, especially for daemons, should be avoided.  No one wants to see a multitude of messages during startup.  Errors should go to STDERR instead of STDOUT; use the `hadoop_error` function to make this explicit in the code.
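  For example, `hadoop_error` simply prints its arguments to STDERR:

{{{
if [[ -z "${JAVA_HOME}" ]]; then
  hadoop_error "ERROR: JAVA_HOME is not set and could not be found."
  exit 1
fi
}}}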
  
@@ -56, +63 @@

  
   * The [[http://wiki.bash-hackers.org/scripting/style|Bash Hackers website]] and [[https://google-styleguide.googlecode.com/svn/trunk/shell.xml|Google]] have great general advice for style guidelines in bash.  Additionally, Paul Lutus's [[http://www.arachnoid.com/python/beautify_bash_program.html|Beautify Bash]] does a tremendously good job of reformatting bash.
  
-  * A decent shell lint is available at http://www.shellcheck.net .  Mac users can `brew
install shellcheck` to install it locally. Like lint, however, be aware that it will sometimes
flag things that are legitimate.   
+  * A decent shell lint is available at http://www.shellcheck.net .  Mac users can `brew install shellcheck` to install it locally.  Like lint, however, be aware that it will sometimes flag things that are legitimate; these can be marked with a 'shellcheck disable' comment.  (Usually, the warning about $HADOOP_OPTS being used without quotes is our biggest offense.  Our usage without quotes is correct for the current code base.  It is, however, a bad practice, and shellcheck is right to tell us about it.)
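  For example, a deliberately unquoted ${HADOOP_OPTS} can be annotated so that shellcheck skips only that check (SC2086 is shellcheck's code for the word-splitting warning; the launch line below is a sketch, not the exact code):

{{{
# HADOOP_OPTS is intentionally unquoted so it word-splits into
# separate JVM arguments
# shellcheck disable=SC2086
exec "${JAVA}" ${HADOOP_OPTS} "${CLASS}" "$@"
}}}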
  
  = Standard Environment Variables =
  
