hadoop-common-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10996) [post-HADOOP-9902] Stop violence in the *_HOME
Date Wed, 27 Aug 2014 03:28:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111776#comment-14111776 ]

Allen Wittenauer commented on HADOOP-10996:

TL;DR: The absolute best bet is to put the configs in one place and point HADOOP_CONF_DIR at it,
so that you know with certainty where Hadoop is pulling its settings from.

Longer story:

Currently, if HADOOP_CONF_DIR isn't defined, the shell code uses a bit of twisted logic to locate it:

1. Figure out where HADOOP_PREFIX is. Is HADOOP_PREFIX defined? If not, then let's assume
it's one level up from wherever we were called from ("what's called us/..").
2. Does HADOOP_PREFIX/conf/hadoop-env.sh exist? OK, then that must be HADOOP_CONF_DIR.
3. No? OK, then HADOOP_CONF_DIR must be HADOOP_PREFIX/etc/hadoop.
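
The three steps above can be sketched in bash roughly as follows. This is only an illustration of the described fallback, not the actual Hadoop shell code: the function name resolve_conf_dir and its argument (the directory of the calling script) are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch of the fallback lookup described above; NOT the real Hadoop code.
# resolve_conf_dir is a hypothetical helper. $1 is the directory of the
# calling script (for example, some .../bin directory).
resolve_conf_dir() {
  local caller_dir="$1"
  local prefix="${HADOOP_PREFIX:-}"

  # An explicitly defined HADOOP_CONF_DIR short-circuits the whole lookup.
  if [ -n "${HADOOP_CONF_DIR:-}" ]; then
    printf '%s\n' "${HADOOP_CONF_DIR}"
    return 0
  fi

  # Step 1: no HADOOP_PREFIX? Assume one level up from the caller.
  if [ -z "${prefix}" ]; then
    prefix="$(cd "${caller_dir}/.." && pwd)" || return 1
  fi

  if [ -e "${prefix}/conf/hadoop-env.sh" ]; then
    # Step 2: conf/hadoop-env.sh exists, so conf must be the config dir.
    printf '%s\n' "${prefix}/conf"
  else
    # Step 3: otherwise fall back to etc/hadoop, whether or not it exists.
    printf '%s\n' "${prefix}/etc/hadoop"
  fi
}
```

With HADOOP_PREFIX and HADOOP_CONF_DIR both unset, calling this sketch with two different bin directories yields two different answers, because the result depends entirely on the caller's location.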

What's fun about this and what you're doing is that HADOOP_CONF_DIR will get defined differently
depending upon which bin dir you are using. :D

Fine, you say!  Let's just treat all *_HOME/etc/hadoop and *_HOME/conf as potentially valid.
Now we have a very interesting problem:  how do you define HADOOP_CONF_DIR?  Other stuff
beyond Hadoop depends upon this being *one* directory.  We could pick the first one and then
just shove the rest in the classpath and no one would be the wiser!

Aha! But they would.  Which one takes precedence? What happens if there are conflicts? etc,
etc. It gets messy very very fast. So... ABORT! ABORT!

(BTW, this is pretty much the same logic from branch-2. It could be argued that there should
be a check to see if etc/hadoop is 'real' too and abort on it.  Here's the fun part: the shell
code now works perfectly fine if *-env.sh is empty... the NameNode (NN) will still crash though.  That
said, if HADOOP-10879 gets finished, this will almost certainly need to get revisited.  It's probably
better to look for core-site.xml, honestly, since all of the sub-projects depend upon
that.  In other words, we could run through all of the *_HOME, HADOOP_PREFIX, etc, and use
the first core-site.xml we find as the 'real' HADOOP_CONF_DIR.)
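
A minimal sketch of that "first core-site.xml wins" idea, under stated assumptions: find_conf_dir is a hypothetical function, the candidate homes are passed in priority order by the caller, and checking etc/hadoop before conf within each home is an arbitrary choice made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: return the first directory that contains core-site.xml.
# find_conf_dir is hypothetical. Arguments are candidate home directories
# in priority order, e.g. the values of HADOOP_PREFIX and the *_HOME vars.
find_conf_dir() {
  local home sub
  for home in "$@"; do
    # Skip candidates whose variable was unset or empty.
    [ -n "${home}" ] || continue
    # Assumption: etc/hadoop is checked before conf within each home.
    for sub in etc/hadoop conf; do
      if [ -f "${home}/${sub}/core-site.xml" ]; then
        printf '%s\n' "${home}/${sub}"
        return 0
      fi
    done
  done
  # No candidate had a core-site.xml; let the caller decide what to do.
  return 1
}

# Possible usage (variable choice and order are illustrative only):
#   HADOOP_CONF_DIR="$(find_conf_dir "${HADOOP_PREFIX:-}" \
#     "${HADOOP_COMMON_HOME:-}" "${HADOOP_HDFS_HOME:-}")" \
#     || echo "no core-site.xml found" >&2
```

Because the search stops at the first match, HADOOP_CONF_DIR stays a single directory, and the precedence question reduces to the order in which candidates are listed.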

> [post-HADOOP-9902] Stop violence in the *_HOME
> ----------------------------------------------
>                 Key: HADOOP-10996
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10996
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 3.0.0
>            Reporter: Allen Wittenauer
>            Assignee: Allen Wittenauer
>         Attachments: HADOOP-10996-01.patch, HADOOP-10996-02.patch, HADOOP-10996.patch
> (Updated from original description)
> There are various places where the various HOME directories are missing or mis-defined.

This message was sent by Atlassian JIRA
