hadoop-common-user mailing list archives

From "Phillips, Caleb" <Caleb.Phill...@nrel.gov>
Subject Re: fs.s3a.endpoint not working
Date Tue, 16 Feb 2016 22:08:13 GMT
Hi All,

Just wanted to follow up: we got this working with the help of the object storage vendor.
After running in circles for a bit, the issue turned out to be as simple as using the correct
FQDN in the endpoint fields and disabling SSL. We had also set the JetS3t properties, but it
turns out those aren't actually needed with recent Hadoop versions (?).

For anyone who might be having similar issues, here is the relevant configuration in core-site.xml
for S3A and S3N with Hadoop 2.7.1:

<configuration>

<!-- S3N Connector to Obsidian -->
<property>
 <name>fs.s3n.awsAccessKeyId</name>
 <description>AWS access key ID</description>
 <value>yourusername</value>
</property>

<property>
 <name>fs.s3n.awsSecretAccessKey</name>
 <description>AWS secret key</description>
 <value>sweetpassword</value>
</property>

<property>
 <name>fs.s3n.endpoint</name>
 <value>your.fqdn.here</value>
</property>

<property>
 <name>fs.s3n.ssl.enabled</name>
 <value>false</value>
</property>

<!-- S3A Connector to Obsidian -->

<property>
 <name>fs.s3a.access.key</name>
 <description>AWS access key ID. Omit for Role-based authentication.</description>
 <value>yourusername</value>
</property>

<property>
 <name>fs.s3a.secret.key</name>
 <description>AWS secret key. Omit for Role-based authentication.</description>
 <value>sweetpassword</value>
</property>

<property>
 <name>fs.s3a.connection.ssl.enabled</name>
 <value>false</value>
 <description>Enables or disables SSL connections to S3.</description>
</property>

<property>
 <name>fs.s3a.endpoint</name>
 <description>AWS S3 endpoint to connect to. An up-to-date list is
    provided in the AWS Documentation: regions and endpoints. Without this
    property, the standard region (s3.amazonaws.com) is assumed.
 </description>
 <value>your.fqdn.here</value>
</property>

</configuration>
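For a quick check before committing these settings to core-site.xml, the same properties can also be passed per-command via Hadoop's generic `-D` options (a sketch; the FQDN, credentials, and bucket name below are placeholders, same as in the config above):

```shell
# Override endpoint, SSL, and credentials for a single command only
# (placeholder values -- substitute your own)
hadoop fs \
  -Dfs.s3a.endpoint=your.fqdn.here \
  -Dfs.s3a.connection.ssl.enabled=false \
  -Dfs.s3a.access.key=yourusername \
  -Dfs.s3a.secret.key=sweetpassword \
  -ls s3a://some-bucket/
```

This only affects the one invocation, so it's handy for isolating whether a problem is in core-site.xml or elsewhere.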

Also, as mentioned previously in the thread, it's necessary to add the Hadoop tools jars (which contain the S3A/S3N filesystem classes) to your HADOOP_CLASSPATH:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/hadoop-2.7.1/share/hadoop/tools/lib/*
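If the filesystem classes still aren't found after that, it may help to confirm the jars actually made it onto the classpath (a sketch; the install path is the same placeholder as above):

```shell
# Show classpath entries contributed by the tools directory
hadoop classpath | tr ':' '\n' | grep tools/lib

# Confirm the S3-related jars are present in that directory
ls /path/to/hadoop-2.7.1/share/hadoop/tools/lib/ | grep -Ei 'aws|jets3t'
```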

You can test by:

s3cmd mb s3://some-bucket               # <- note: you have to create the bucket with s3cmd,
                                        #    not hadoop, at least with our object store
hadoop fs -ls s3n://some-bucket/
hadoop fs -ls s3a://some-bucket/
hadoop distcp /your/favorite/hdfs/data s3a://some-bucket/

HTH,

--
Caleb Phillips, Ph.D.
Data Scientist | Computational Science Center

National Renewable Energy Laboratory (NREL)
15013 Denver West Parkway | Golden, CO 80401
303-275-4297 | caleb.phillips@nrel.gov

From: Billy Watson <williamrwatson@gmail.com>
Date: Tuesday, January 19, 2016 at 8:41 AM
To: Alexander Pivovarov <apivovarov@gmail.com>
Cc: Caleb Phillips <caleb.phillips@nrel.gov>, "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: fs.s3a.endpoint not working

Stupid question, but I assume you're using a URL that starts with s3a:// and that your custom
endpoint supports S3A?

William Watson
Lead Software Engineer

On Thu, Jan 14, 2016 at 1:57 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:

http://www.jets3t.org/toolkit/configuration.html

On Jan 14, 2016 10:56 AM, "Alexander Pivovarov" <apivovarov@gmail.com> wrote:

Add a jets3t.properties file with s3service.s3-endpoint=<endpoint> to the /etc/hadoop/conf
folder.

The folder containing the file should be on the HADOOP_CLASSPATH.

The JetS3t library, which Hadoop uses, looks for this file.
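For reference, a minimal jets3t.properties along these lines might look like the following (hostname and port are placeholders; the JetS3t configuration page linked above has the full property list):

```properties
# /etc/hadoop/conf/jets3t.properties -- example values only
s3service.s3-endpoint=object-store.example.com
s3service.s3-endpoint-http-port=80
s3service.https-only=false
```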

On Dec 22, 2015 12:39 PM, "Phillips, Caleb" <Caleb.Phillips@nrel.gov> wrote:
Hi All,

New to this list. Looking for a bit of help:

I'm having trouble connecting Hadoop to an S3-compatible (non-AWS) object store.

This issue was discussed, but left unresolved, in this thread:

https://mail-archives.apache.org/mod_mbox/spark-user/201507.mbox/%3CCA+0W_Au5Es_fLUgZMGwkkgA3JyA1ASi3u+isJCuYmfnTvNkGuQ@mail.gmail.com%3E

And here, on Cloudera's forums (the second post is mine):

https://community.cloudera.com/t5/Data-Ingestion-Integration/fs-s3a-endpoint-ignored-in-hdfs-site-xml/m-p/33694#M1180

I'm running Hadoop 2.6.3 with Java 1.8 (update 65) on a Linux host. Using Hadoop, I'm able to
connect to S3 on AWS and, e.g., list/put/get files.

However, when I point the fs.s3a.endpoint configuration directive at my non-AWS S3-compatible
object storage, it appears to still point at (and authenticate against) AWS.

I've checked and double-checked my credentials and configuration using both Python's boto
library and the s3cmd tool, both of which connect to this non-AWS data store just fine.

Any help would be much appreciated. Thanks!

--
Caleb Phillips, Ph.D.
Data Scientist | Computational Science Center

National Renewable Energy Laboratory (NREL)
15013 Denver West Parkway | Golden, CO 80401
303-275-4297 | caleb.phillips@nrel.gov


