hadoop-common-dev mailing list archives

From Allen Wittenauer <...@effectivemachines.com>
Subject Re: hadoop 3 scripts & classpath setup
Date Tue, 22 Aug 2017 16:24:56 GMT

> On Aug 22, 2017, at 6:00 AM, Steve Loughran <stevel@hortonworks.com> wrote:
> 
> 
> I'm having problems getting the s3 classpath setup on the CLI & am trying to work out what I'm doing wrong.
> 
> 
> without setting things up, you can't expect to talk to blobstores
> 
> hadoop fs -ls wasb://something/
> hadoop fs -ls s3a://landsat-pds/
> 
> That's expected.

	Yup.

> but what I can't do is get the aws bits on the CP via HADOOP_OPTIONAL_TOOLS
> 
> export HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws,hadoop-adl,hadoop-openstack"
> 
> Once I do that the wasb:// ls works (or at least doesn't throw a CNFE), but the s3a URL still fails

	Hmm. So HOT is getting processed at least somewhat then...

> if I add the line to ~/.hadooprc all becomes well
> 
> hadoop_add_to_classpath_tools hadoop-aws
> 
> any ideas?

	Setting HOT should be calling the equivalent of hadoop_add_to_classpath_tools hadoop-aws
in the code path.  Luckily, we have debugging tools in 3.x[1]:

First, let’s duplicate the failure conditions, but only activate hadoop-aws, since it should
be standalone and that cuts our output down:

=======================
$ cat ~/.hadooprc
cat: /Users/aw/.hadooprc: No such file or directory
$ bin/hadoop envvars | grep CONF
HADOOP_CONF_DIR='/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/etc/hadoop'
$ pwd
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT
$ grep OPTIONAL_TOOLS etc/hadoop/hadoop-env.sh
# export HADOOP_OPTIONAL_TOOLS="hadoop-aliyun,hadoop-aws,hadoop-azure,hadoop-azure-datalake,hadoop-kafka,hadoop-openstack"
export HADOOP_OPTIONAL_TOOLS="hadoop-aws"
=======================
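
(Recapping the two approaches in play in this thread; either one on its own should be enough to pull the tools jars in:)

=======================
# Option 1: in etc/hadoop/hadoop-env.sh (or exported in the environment)
export HADOOP_OPTIONAL_TOOLS="hadoop-aws"

# Option 2: the workaround from the original mail, in ~/.hadooprc
hadoop_add_to_classpath_tools hadoop-aws
=======================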

Using --debug, let’s see what happens:

=======================
$ bin/hadoop --debug classpath 2>&1 | egrep '(tools|hadoop-aws)'
DEBUG: shellprofiles: /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aliyun.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-archive-logs.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-archives.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aws.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-azure-datalake.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-azure.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-distcp.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-extras.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-gridmix.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-hdfs.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-httpfs.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-kafka.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-kms.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-mapreduce.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-openstack.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-rumen.sh /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-streaming.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-yarn.sh
DEBUG: Adding hadoop-aws to HADOOP_TOOLS_OPTIONS
DEBUG: Profiles: importing /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aws.sh
DEBUG: HADOOP_SHELL_PROFILES accepted hadoop-aws
DEBUG: Profiles: hadoop-aws classpath
DEBUG: Append CLASSPATH: /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.134.jar
DEBUG: Append CLASSPATH: /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/java-xmlbuilder-0.4.jar
DEBUG: Append CLASSPATH: /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/jets3t-0.9.0.jar
DEBUG: Append CLASSPATH: /Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/hadoop-aws-3.0.0-beta1-SNAPSHOT.jar
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/etc/hadoop:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/common/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/common/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.134.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/java-xmlbuilder-0.4.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/jets3t-0.9.0.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/hadoop-aws-3.0.0-beta1-SNAPSHOT.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/mapreduce/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/yarn/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/yarn/*
=======================

OK, the “extra” bits are definitely getting added.  Per the debug lines:
* the hadoop-aws profile and tools hooks are getting executed
* the hadoop-aws classpath function is getting executed (aka hadoop_add_to_classpath_tools hadoop-aws)
* the classpath isn’t rejecting any jars
* the final line definitely has AWS there.
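
A quicker spot-check than reading the full --debug output, if all you want to know is whether the AWS jars landed on the classpath (the two tools/lib jars from the listing above should show up; exact names and versions will differ per build):

=======================
$ bin/hadoop classpath | tr ':' '\n' | grep aws
=======================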

So we should be good to go assuming the profile and supplemental tools code is correct.

=======================
$ bin/hadoop fs -ls s3a://landsat-pds/
ls: Interrupted
=======================

umm, ok?  No CNFE though.  If I disable the network:

=======================
$ bin/hadoop fs -ls s3a://landsat-pds/
ls: doesBucketExist on landsat-pds: com.amazonaws.AmazonClientException: No AWS Credentials
provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider
: com.amazonaws.SdkClientException: Unable to load credentials from service endpoint: No AWS
Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider
InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials
from service endpoint
=======================

Ugly error, but still no CNFE.  So at least out of the box with a build from last week, I guess
this is working?  At this point, it’d probably be worthwhile to make sure that the libexec/shellprofile.d/hadoop-aws.sh
on your system is in working order. In particular...

=======================
if hadoop_verify_entry HADOOP_TOOLS_OPTIONS "hadoop-aws"; then
  hadoop_add_profile "hadoop-aws"
fi
=======================

… is the magic code.  It (effectively[2]) says that if HADOOP_OPTIONAL_TOOLS has hadoop-aws
in it, then activate the hadoop-aws profile, which should end up calling hadoop_add_to_classpath_tools
hadoop-aws.  Might also be worthwhile to check simple stuff like permissions.
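
For the “simple stuff”, something like this is usually enough of a sanity check (a sketch; paths assume you’re sitting in the top of the install dir like the transcripts above):

=======================
# profile present, readable, and syntactically intact?
ls -l libexec/shellprofile.d/hadoop-aws.sh
bash -n libexec/shellprofile.d/hadoop-aws.sh

# and is the shell code actually accepting the profile?
bin/hadoop --debug classpath 2>&1 | grep 'accepted hadoop-aws'
=======================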

[1] It’s tempting to say “now”, but given that debug was added several years ago, it’s
more like branch-2 is just really ancient rather than 3.x being “current”.

[2] Yes, that variable is supposed to be HADOOP_TOOLS_OPTIONS.  HADOOP_OPTIONAL_TOOLS (HOT) gets
transformed into HADOOP_TOOLS_OPTIONS internally for “reasons”.  It’s a longer discussion that most
people aren’t interested in.
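
If you want to see that transformation happen, the --debug run from earlier already prints it; a one-liner to pull out just that piece (in the run above it’s this single line):

=======================
$ bin/hadoop --debug classpath 2>&1 | grep HADOOP_TOOLS_OPTIONS
DEBUG: Adding hadoop-aws to HADOOP_TOOLS_OPTIONS
=======================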


