hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11245) HDFS ignores HADOOP_CONF_DIR
Date Thu, 15 Dec 2016 06:32:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750542#comment-15750542 ]

Allen Wittenauer commented on HDFS-11245:
-----------------------------------------

The problem is here:

{code}
export HADOOP_CLASSPATH=$(hadoop classpath):"$HADOOP_HOME"/share/hadoop/tools/lib/*
export HADOOP_USER_CLASSPATH_FIRST=true
{code}

IIRC, branch-2 didn't actually process the classpath correctly.  HADOOP\_USER\_CLASSPATH\_FIRST
was only "sorta" true and varied between the different projects.  3.x fixes that and makes
it always true.  With that knowledge, we can break this down a bit.
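To make the precedence concrete, here is a hypothetical sketch (plain bash, not real Hadoop code) that emulates how Java's ClassLoader returns the *first* hdfs-site.xml it finds on the classpath. It shows why forcing the user classpath first lets a jar-bundled default shadow the copy in HADOOP\_CONF\_DIR; all paths below are made up for illustration:

```shell
#!/bin/bash
# Hypothetical sketch: emulate ClassLoader resource lookup, which returns
# the FIRST classpath entry containing the requested resource.

first_match() {
  local resource=$1 classpath=$2 entry
  local IFS=':'
  for entry in $classpath; do
    # Pretend both conf dirs and hadoop share jars bundle the resource.
    if [[ $entry == *conf* || $entry == *share/hadoop* ]]; then
      echo "$entry/$resource"
      return 0
    fi
  done
  return 1
}

# Conf dir ahead of the jars: the user's file wins.
first_match hdfs-site.xml "/my/conf/nn:/opt/hadoop/share/hadoop/common.jar"
# User classpath (all the hadoop jars) forced first: the bundled default wins.
first_match hdfs-site.xml "/opt/hadoop/share/hadoop/common.jar:/my/conf/nn"
```

The first call prints the conf-dir copy, the second prints the jar-bundled one, which is exactly the inversion HADOOP\_USER\_CLASSPATH\_FIRST=true causes when the user classpath contains all of the hadoop jars.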

hadoop classpath gives you all of the hadoop jars.  But what you probably don't know is that
those jars also contain default versions of log4j.properties and the XML configuration files.  By forcing
the user classpath first, those bundled defaults override the HADOOP\_CONF\_DIR values.

So for 3.x, do not put hadoop classpath in the classpath list; it will always be present. Additionally,
there's not much reason to forcefully add the entire tools dir to the classpath.  This is now
configurable via the HADOOP\_OPTIONAL\_TOOLS variable in hadoop-env.sh, which adds only the
features you actually care about rather than the full kitchen sink of stuff.
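Concretely, a 3.x version of the reporter's env.sh along these lines avoids the shadowing. This is a sketch: the paths match the reporter's layout, and the HADOOP\_OPTIONAL\_TOOLS value is just an example; pick the tool modules you actually need:

```shell
#!/bin/bash
# Revised env.sh for Hadoop 3.x (sketch; adjust paths and values to your layout).
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME="$HOME"/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
export HADOOP_CONF_DIR="$(pwd)/conf/nn"
export HADOOP_LOG_DIR="$(pwd)/log"
PATH="$HADOOP_HOME"/bin:$PATH

# Do NOT re-add `hadoop classpath` to HADOOP_CLASSPATH: the launcher already
# includes it, and with HADOOP_USER_CLASSPATH_FIRST the jar-bundled default
# XML files would then shadow the ones in HADOOP_CONF_DIR.
#
# Instead of dumping all of share/hadoop/tools/lib onto the classpath, enable
# only the tool modules you need (example value; see hadoop-env.sh):
export HADOOP_OPTIONAL_TOOLS="hadoop-aws"
```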

> HDFS ignores HADOOP_CONF_DIR
> ----------------------------
>
>                 Key: HDFS-11245
>                 URL: https://issues.apache.org/jira/browse/HDFS-11245
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>         Environment: Linux
>            Reporter: Ewan Higgs
>
> It seems that HDFS on trunk is ignoring {{HADOOP_CONF_DIR}}. On {{branch-2}} I could
export {{HADOOP_CONF_DIR}} and use that to store my {{hdfs-site.xml}} and {{log4j.properties}}.
But on trunk it appears to ignore the environment variable.
> Also, even if hdfs can find the {{log4j.properties}}, it doesn't seem interested in opening
and loading it.
> On Ubuntu 16.10:
> {code}
> $ source env.sh
> $ cat env.sh 
> #!/bin/bash
> export JAVA_HOME=/usr/lib/jvm/java-8-oracle
> export HADOOP_HOME="$HOME"/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
> export HADOOP_LOG_DIR="$(pwd)/log"
> PATH="$HADOOP_HOME"/bin:$PATH
> export HADOOP_CLASSPATH=$(hadoop classpath):"$HADOOP_HOME"/share/hadoop/tools/lib/*
> export HADOOP_USER_CLASSPATH_FIRST=true
> {code}
> Then I set the HADOOP_CONF_DIR:
> {code}
> $ export HADOOP_CONF_DIR="$(pwd)/conf/nn"
> $ ls $HADOOP_CONF_DIR
> hadoop-env.sh  hdfs-site.xml  log4j.properties
> {code}
> Now, we try to run a namenode:
> {code}
> $ hdfs namenode
> 2016-12-14 14:04:51,193 ERROR [main] namenode.NameNode: Failed to start namenode.
> java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS):
file:/// has no authority.
>         at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddress(DFSUtilClient.java:648)
>         at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddressCheckLogical(DFSUtilClient.java:677)
>         at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddress(DFSUtilClient.java:639)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:556)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:687)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:707)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:916)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1633)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1701)
> {code}
> This is weird. We have the {{fs.defaultFS}} set:
> {code}
> $ grep -n2 fs.defaultFS $HADOOP_CONF_DIR/hdfs-site.xml
> 3-<configuration>
> 4-    <property>
> 5:        <name>fs.defaultFS</name>
> 6-        <value>hdfs://localhost:60010</value>
> 7-    </property>
> {code}
> So it isn't finding this config. Where is it looking, and where is it finding {{file:///}}?
> {code}
> $ strace -f -eopen,stat hdfs namenode 2>&1 | grep hdfs-site.xml
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/jdiff/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/lib/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/sources/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/templates/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/webapps/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/jdiff/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/lib/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/sources/hdfs-site.xml",
0x7f05eb6d21d0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] open("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
O_RDONLY) = 218
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/jdiff/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/lib/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/sources/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/templates/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/webapps/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/jdiff/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/lib/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/sources/hdfs-site.xml",
0x7f05eb6d17e0) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] open("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
O_RDONLY) = 218
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/jdiff/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/lib/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/sources/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/templates/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/webapps/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/jdiff/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/lib/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/sources/hdfs-site.xml",
0x7f05eb6d2070) = -1 ENOENT (No such file or directory)
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16271] open("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
O_RDONLY) = 218
> {code}
> So it's ignoring {{HADOOP_CONF_DIR}}. We can work around it using {{-conf $(pwd)/conf/nn/hdfs-site.xml}}:
> {code}
> $ strace -f -eopen,stat hdfs namenode -conf $(pwd)/conf/nn/hdfs-site.xml 2>&1
| grep hdfs-site.xml
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/jdiff/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/lib/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/sources/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/templates/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/common/webapps/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/jdiff/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/lib/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/sources/hdfs-site.xml",
0x7f9f60afb1d0) = -1 ENOENT (No such file or directory)
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16493] stat("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
{st_mode=S_IFREG|0664, st_size=775, ...}) = 0
> [pid 16493] open("/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/share/hadoop/hdfs/templates/hdfs-site.xml",
O_RDONLY) = 218
> [pid 16493] stat("/home/ehigg90120/src/hadoop-run/tutorial/conf/nn/hdfs-site.xml", {st_mode=S_IFREG|0644,
st_size=2107, ...}) = 0
> [pid 16493] open("/home/ehigg90120/src/hadoop-run/tutorial/conf/nn/hdfs-site.xml", O_RDONLY)
= 218
> ------8<------
> {code}
> Great! However, it's not finding my {{log4j.properties}} for some reason. This is annoying
because hdfs isn't printing anything or logging anywhere. Where is it looking?
> {code}
> $ strace -f hdfs namenode -conf $(pwd)/conf/nn/hdfs-site.xml 2>&1 | grep log4j.properties
> stat("/home/ehigg90120/src/hadoop-run/tutorial/conf/nn/log4j.properties", {st_mode=S_IFREG|0644,
st_size=13641, ...}) = 0
> {code}
> It found it, but it only statted it; it never opened it! So there seems to be at least
one bug here where {{log4j.properties}} is being ignored. But shouldn't {{HADOOP_OPTS}} be
set, configuring it to print to the console and to my log dir?
> {code}
> $ hdfs --debug namenode -conf $(pwd)/conf/nn/hdfs-site.xml 2>&1 | grep HADOOP_OPTS
> DEBUG: HADOOP_OPTS accepted -Dhdfs.audit.logger=INFO,NullAppender
> DEBUG: Appending HDFS_NAMENODE_OPTS onto HADOOP_OPTS
> DEBUG: HADOOP_OPTS accepted -Dyarn.log.dir=/home/ehigg90120/src/hadoop-run/tutorial/log
> DEBUG: HADOOP_OPTS accepted -Dyarn.log.file=hadoop.log
> DEBUG: HADOOP_OPTS accepted -Dyarn.home.dir=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
> DEBUG: HADOOP_OPTS accepted -Dyarn.root.logger=INFO,console
> DEBUG: HADOOP_OPTS accepted -Djava.library.path=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/lib/native
> DEBUG: HADOOP_OPTS accepted -Dhadoop.log.dir=/home/ehigg90120/src/hadoop-run/tutorial/log
> DEBUG: HADOOP_OPTS accepted -Dhadoop.log.file=hadoop.log
> DEBUG: HADOOP_OPTS accepted -Dhadoop.home.dir=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
> DEBUG: HADOOP_OPTS accepted -Dhadoop.id.str=ehigg90120
> DEBUG: HADOOP_OPTS accepted -Dhadoop.root.logger=INFO,console
> DEBUG: HADOOP_OPTS accepted -Dhadoop.policy.file=hadoop-policy.xml
> DEBUG: HADOOP_OPTS declined -Dhadoop.security.logger=INFO,NullAppender
> DEBUG: Final HADOOP_OPTS: -Djava.net.preferIPv4Stack=true -Dhdfs.audit.logger=INFO,NullAppender
-Dhadoop.security.logger=INFO,RFAS -Dyarn.log.dir=/home/ehigg90120/src/hadoop-run/tutorial/log
-Dyarn.log.file=hadoop.log -Dyarn.home.dir=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
-Dyarn.root.logger=INFO,console -Djava.library.path=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT/lib/native
-Dhadoop.log.dir=/home/ehigg90120/src/hadoop-run/tutorial/log -Dhadoop.log.file=hadoop.log
-Dhadoop.home.dir=/home/ehigg90120/src/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha2-SNAPSHOT
-Dhadoop.id.str=ehigg90120 -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
> {code}
> So it seems it is being configured and passed to the namenode. It's just not obeying
it as far as I can see. 
> So there may be two, possibly related, bugs:
> 1. {{HADOOP_CONF_DIR}} is ignored
> 2. The logger is not using {{log4j.properties}} or the command line. I would expect it
to use the {{log4j.properties}} in the {{HADOOP_CONF_DIR}}.
> I feel like I must be misunderstanding something since this seems like a pretty big issue
but I didn't find any open tickets about it or any tickets describing a new way of configuring
clusters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

