hadoop-common-issues mailing list archives

From "Greg Roelofs (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-6997) 'hadoop' script should set LANG or LC_COLLATE explicitly for classpath order
Date Tue, 12 Oct 2010 04:29:34 GMT
'hadoop' script should set LANG or LC_COLLATE explicitly for classpath order
----------------------------------------------------------------------------

                 Key: HADOOP-6997
                 URL: https://issues.apache.org/jira/browse/HADOOP-6997
             Project: Hadoop Common
          Issue Type: Bug
          Components: scripts
    Affects Versions: 0.21.0, 0.20.2, 0.22.0
            Reporter: Greg Roelofs


The 'hadoop' script builds the classpath in pieces, including the following bit for the bulk
of it:

{noformat}
# add libs to CLASSPATH
for f in $HADOOP_HOME/lib/*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done
{noformat}

The ordering of "*.jar", i.e., the collation order, depends on LANG or LC_COLLATE on Linux
systems (LC_ALL, if set, overrides both).  Because the script sets neither one, it inherits
whatever the user's environment specifies; on Red Hat the default is "en_US", which is a
case-insensitive (and punctuation-insensitive?) ordering.  If LANG is set to "C" instead, the
ordering changes to plain byte order (ASCII/UTF-8).
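
To see which setting a given environment would actually apply, the standard POSIX precedence
(LC_ALL first, then LC_COLLATE, then LANG) can be spelled out as below. This is only a sketch:
the helper name effective_collate is made up for illustration, and the fallback when nothing
is set is implementation-defined, so it is reported as a label rather than a locale name.

```shell
#!/bin/sh
# Hypothetical helper: print the locale name that governs collation,
# following the POSIX precedence LC_ALL > LC_COLLATE > LANG.
effective_collate() {
  if [ -n "${LC_ALL:-}" ]; then
    printf '%s\n' "$LC_ALL"
  elif [ -n "${LC_COLLATE:-}" ]; then
    printf '%s\n' "$LC_COLLATE"
  elif [ -n "${LANG:-}" ]; then
    printf '%s\n' "$LANG"
  else
    # Nothing set: the system picks its own default (commonly "C"/"POSIX").
    printf '%s\n' "implementation-default"
  fi
}

echo "collation locale in this environment: $(effective_collate)"
```

Running `locale` with no arguments reports the same information directly on most systems.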

The key issue here is that $HADOOP_HOME/lib contains both upper- and lowercase jar names (e.g.,
"SimonTool.jar" and "commons-logging-1.1.1.jar", to pick a completely random pair), which
will have an inverted order depending on which setting is used.
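
The inversion is easy to reproduce in a scratch directory. The sketch below is illustrative
only: the directory and the list_jars helper are hypothetical, and the en_US.UTF-8 result
depends on that locale being installed and on a shell that honors the locale when sorting glob
results (bash does), so only the "C" ordering is stated with certainty.

```shell
#!/bin/sh
# Scratch directory holding the two jar names mentioned in the report.
demo_dir=$(mktemp -d)
touch "$demo_dir/SimonTool.jar" "$demo_dir/commons-logging-1.1.1.jar"

# Hypothetical helper: print the glob expansion order under the given locale.
# The subshell keeps the locale change from leaking into the caller.
list_jars() {
  (
    LC_ALL=$1
    export LC_ALL
    cd "$demo_dir" || exit 1
    echo *.jar
  )
}

c_order=$(list_jars C)
# Byte order puts uppercase 'S' (0x53) before lowercase 'c' (0x63):
#   SimonTool.jar commons-logging-1.1.1.jar
echo "C:     $c_order"

# With en_US installed, a locale-aware shell is expected to list commons-logging first.
echo "en_US: $(list_jars en_US.UTF-8 2>/dev/null)"

rm -rf "$demo_dir"
```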

'hadoop' should explicitly set LANG and/or LC_COLLATE to whatever setting it's implicitly
assuming.
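
One possible shape for such a change is sketched below. It is not a patch from this issue:
it assumes that "C" (byte order) is the ordering the script intends, which the report does not
establish, and it uses a scratch directory and a placeholder CLASSPATH value so that it runs
standalone. The helper name build_lib_classpath is made up for illustration.

```shell
#!/bin/sh
# Stand-ins for a real installation (assumptions for the demo only).
HADOOP_HOME=$(mktemp -d)
mkdir "$HADOOP_HOME/lib"
touch "$HADOOP_HOME/lib/SimonTool.jar" "$HADOOP_HOME/lib/commons-logging-1.1.1.jar"
CLASSPATH=conf   # placeholder for whatever the script has built so far

# Hypothetical helper: expand the glob under a pinned locale.
# LC_ALL is used because it overrides both LC_COLLATE and LANG; the
# subshell confines the change to the glob expansion.
build_lib_classpath() {
  (
    LC_ALL=C
    export LC_ALL
    cp=
    for f in "$HADOOP_HOME"/lib/*.jar; do
      cp=${cp}:$f
    done
    printf '%s\n' "$cp"
  )
}

# add libs to CLASSPATH, in an order that no longer depends on the caller's locale
CLASSPATH=${CLASSPATH}$(build_lib_classpath)
echo "$CLASSPATH"

rm -rf "$HADOOP_HOME"
```

Pinning the locale inside a subshell, rather than exporting it for the whole script, leaves
the user's locale in place for everything the script launches afterwards.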

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

