hadoop-common-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-6605) Add JAVA_HOME detection to hadoop-config
Date Sun, 12 Jun 2011 06:03:51 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Eli Collins updated HADOOP-6605:

    Attachment: hadoop-6605-3.patch

Thanks for the feedback everyone. Popping the stack, Hadoop requires the user set JAVA_HOME
for two reasons:

# We want to add tools.jar to the classpath, and JAVA_HOME lets the user specify a base directory
to look in (other than the default java, which may be from a JRE and therefore not have tools.jar).
This is no longer an issue since HADOOP-7374 removed tools.jar from the classpath.
# We want to respect JAVA_HOME even if there is already a java in the path, i.e. users and admins
can easily configure Hadoop to use a java that's different from the default system java. This
makes sense given that Hadoop is picky. Therefore we should only auto-detect JAVA_HOME if it is
not set (which all versions of the patch do) and we can determine a reasonable value.

OSX provides an API (java_home(1)) that does this: it returns a path suitable for setting
JAVA_HOME, based on the enabled/preferred JVMs as set in Java Preferences. I think we agree it
makes sense to use this.
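
As a rough illustration of the OSX case (a sketch, not the contents of the attached patch), the
detection could look like the following. The function name detect_java_home_osx is hypothetical;
/usr/libexec/java_home is where the java_home(1) tool lives on OSX. An existing JAVA_HOME is
always left alone.

```shell
# Hypothetical helper, not taken from hadoop-6605-3.patch.
# Returns 0 if JAVA_HOME ends up set, 1 otherwise.
detect_java_home_osx() {
  # Respect an existing setting; only detect when JAVA_HOME is empty or unset.
  if [ -n "${JAVA_HOME:-}" ]; then
    return 0
  fi
  # Only consult java_home(1) on OSX, and only if the tool is present.
  if [ "$(uname -s)" = "Darwin" ] && [ -x /usr/libexec/java_home ]; then
    JAVA_HOME="$(/usr/libexec/java_home 2>/dev/null)" || JAVA_HOME=""
  fi
  if [ -n "${JAVA_HOME:-}" ]; then
    export JAVA_HOME
    return 0
  fi
  return 1
}
```

A script such as hadoop-config.sh could call this just before its existing JAVA_HOME check, so
that the current error is still reported whenever detection does not produce a value.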

On Linux, there is no single API that works across distributions. Even though alternatives
is widely available, it works differently on different distributions (also, it indicates where
the java binary lives, not where JAVA_HOME is, though you could determine that with readlink).
There are well-known locations where JDKs are installed that you can check to reasonably
detect JAVA_HOME. This is the approach taken by the previous patch. I've provided data showing
that checking a set of directories does not measurably impact the execution time (therefore
"too much work" sounds like a philosophical objection rather than a technical objection to
me). I've found that globbing is not an issue in practice because the glob does not match
more than one installation on a given system. This is because the JDK is resolved via a packaging
dependency and the package updates itself rather than having multiple versions installed.
People who manually install multiple JDKs typically set JAVA_HOME explicitly, so the detection
is not used. There are no alternative proposals for autodetecting JAVA_HOME on Linux, and I'm
not going to spend any more time on this part for now, so I'm dropping this case from the
patch.
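
For reference only, since the Linux case is not in the attached patch, here is a sketch of the
two ideas mentioned above: deriving JAVA_HOME from a java binary with readlink, and checking a
list of candidate directories. Both function names are hypothetical, readlink -f is assumed to
be available (it is in GNU coreutils), and the /usr/lib/jvm path in the usage note is only an
example location, not the list used by the earlier patch.

```shell
# Hypothetical helpers; the Linux case was dropped from the patch.

# Print the installation root for a java executable.
# $1: path to java, possibly a chain of symlinks (e.g. managed by alternatives).
java_home_from_binary() {
  java_bin="$(readlink -f "$1")" || return 1
  # Strip the trailing /bin/java to get the directory JAVA_HOME should name.
  case "$java_bin" in
    */bin/java) printf '%s\n' "${java_bin%/bin/java}" ;;
    *) return 1 ;;
  esac
}

# Print the first candidate directory that contains an executable bin/java.
# $@: candidate directories, already glob-expanded by the caller.
first_java_home_in() {
  for candidate in "$@"; do
    if [ -x "$candidate/bin/java" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

A caller might try first_java_home_in /usr/lib/jvm/* and fall back to
java_home_from_binary "$(command -v java)", in both cases only when JAVA_HOME is not already set.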

In any case (ha), there is consensus on the OSX approach so let's just go with this for now.
We can easily implement cases for other OS types in the future if there's an approach that's
acceptable. Patch attached.

> Add JAVA_HOME detection to hadoop-config
> ----------------------------------------
>                 Key: HADOOP-6605
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6605
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Chad Metcalf
>            Assignee: Eli Collins
>            Priority: Minor
>             Fix For: 0.22.0
>         Attachments: HADOOP-6605.patch, hadoop-6605-1.patch, hadoop-6605-2.patch, hadoop-6605-3.patch
> The commands that source hadoop-config.sh currently bail with an error if JAVA_HOME is
not set. Let's detect JAVA_HOME (from a list of locations on various OS types) if JAVA_HOME
is not already set by hadoop-env.sh or the environment. This way users don't have to manually
configure it.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

