hadoop-common-issues mailing list archives

From "Philip Zeyliger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15019) Hadoop shell script classpath de-duping ignores HADOOP_USER_CLASSPATH_FIRST
Date Wed, 08 Nov 2017 18:37:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244495#comment-16244495 ]

Philip Zeyliger commented on HADOOP-15019:
------------------------------------------

bq. Or, just use 'hadoop classpath' ...

Yep, you're right. I had tried and failed, but I must have gotten something else wrong.

> Hadoop shell script classpath de-duping ignores HADOOP_USER_CLASSPATH_FIRST 
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-15019
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15019
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: bin
>            Reporter: Philip Zeyliger
>
> If a user sets {{HADOOP_USER_CLASSPATH_FIRST=true}} and uses {{HADOOP_CLASSPATH}} to add a
directory that is already in Hadoop's classpath, that directory appears later than it should
in the eventual $CLASSPATH. I believe this is because the de-duping at
https://github.com/apache/hadoop/blob/cbc632d9abf08c56a7fc02be51b2718af30bad28/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L1200
ignores the "before/after" parameter.
> To reproduce, first build the following trivial Java program:
> {code}
> $ cat Test.java
> public class Test {
>   public static void main(String[] args) {
>     // Print the CLASSPATH environment variable that the hadoop script exported.
>     System.out.println(System.getenv().get("CLASSPATH"));
>   }
> }
> $ javac Test.java
> $ jar cf test.jar Test.class
> {code}
> With that, if an entry in HADOOP_CLASSPATH matches an entry that Hadoop would add on its
own, the ordering is not honored. It's easiest to reproduce this with a match for
HADOOP_CONF_DIR, as in the second case below:
> {code}
> # As you'd expect, /usr/share is first!
> $ HADOOP_CONF_DIR=/etc HADOOP_USER_CLASSPATH_FIRST="true" HADOOP_CLASSPATH=/usr/share:/tmp:/bin \
>     bin/hadoop jar test.jar Test | tr ':' '\n' | grep -n . | grep '/usr/share'
> WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
> 1:/usr/share
> # Surprise! /usr/share is now on the 3rd line, even though it was first in HADOOP_CLASSPATH.
> $ HADOOP_CONF_DIR=/usr/share HADOOP_USER_CLASSPATH_FIRST="true" HADOOP_CLASSPATH=/usr/share:/tmp:/bin \
>     bin/hadoop jar test.jar Test | tr ':' '\n' | grep -n . | grep '/usr/share'
> WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
> 3:/usr/share
> {code}
> To reiterate, what's surprising is that an entry that is first in HADOOP_CLASSPATH can
end up not first in the resulting classpath.
> I ran into this configuring {{bin/hive}} with a confdir that was being used for both
HDFS and Hive, and flailing as to why my {{log4j2.properties}} wasn't being read. The one
in my conf dir was lower in my classpath than one bundled in some Hive jar.
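
For readers following along, here is a rough sketch of how a "drop duplicates first, then look at before/after" check could produce the ordering reported above. It is a simplified model, not the code in hadoop-functions.sh. The names add_cp_dedupe, simulate and CP are made up for illustration. It also assumes that the conf dir is added before the user entries and that, with HADOOP_USER_CLASSPATH_FIRST set, the user entries are prepended last-to-first.

```shell
# Hypothetical, simplified model of the behavior described in this issue.
# CP stands in for CLASSPATH; none of these names come from Hadoop itself.

add_cp_dedupe() {
  # $1 = entry, $2 = "before" or "after"
  case ":${CP}:" in
    "::")
      CP=$1                      # empty classpath: the entry becomes the classpath
      ;;
    *":$1:"*)
      :                          # duplicate: dropped, "before"/"after" is never consulted
      ;;
    *)
      if [ "$2" = "before" ]; then
        CP="$1:${CP}"            # prepend
      else
        CP="${CP}:$1"            # append
      fi
      ;;
  esac
}

simulate() {
  # $1 = conf dir, $2 = colon-separated user classpath
  CP=""
  add_cp_dedupe "$1" after       # assumption: the conf dir is added first
  rest=$2
  # Assumption: user entries are prepended last-to-first, so that the
  # first user entry would normally end up at the very front.
  while [ -n "$rest" ]; do
    entry=${rest##*:}            # take the last remaining entry
    case "$rest" in
      *:*) rest=${rest%:*} ;;    # strip it from the list
      *)   rest="" ;;
    esac
    add_cp_dedupe "$entry" before
  done
  printf '%s\n' "$CP"
}

simulate /etc /usr/share:/tmp:/bin        # prints /usr/share:/tmp:/bin:/etc
simulate /usr/share /usr/share:/tmp:/bin  # prints /tmp:/bin:/usr/share
```

In the second call, /usr/share is already present as the conf dir, so the request to put it "before" is discarded as a duplicate and it stays behind /tmp and /bin. That matches the 3rd-line position seen in the reproduction, under the assumptions stated above.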



