hadoop-mapreduce-user mailing list archives

From "Phillips, Caleb" <Caleb.Phill...@nrel.gov>
Subject Re: fs.s3a.endpoint not working
Date Thu, 14 Jan 2016 16:32:08 GMT
Hi Jonathan,


>I'm not totally following this thread from the beginning, but I might be
>able to help, as I have some experience with Amazon EMR (Elastic
>MapReduce) when working with custom jar files and S3.
>Are you using EMR, or something internal and offloading storage to S3?

We have an S3-compatible object store made by Scality
(http://www.scality.com/), a so-called ‘ring’. It’s basically a pile of
Linux boxes that behaves like our own internal S3 ‘cloud’. It lives in our
data center.

What I’d like to do is have Hadoop connect to that *instead* of Amazon
AWS S3.

Yet, no matter how I set the fs.s3a.endpoint directive, it still connects
to Amazon’s S3.
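
For reference, here is a minimal sketch of the kind of core-site.xml entry in question. The host name and port are placeholders (assumptions for illustration, not the real endpoint at this site), and disabling SSL is only an assumption about a store reached over plain HTTP:

```xml
<configuration>
  <!-- Placeholder host and port for an internal S3-compatible store (hypothetical) -->
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.example.internal:8080</value>
  </property>
  <!-- Assumption: the store speaks plain HTTP; omit this property to keep the default (HTTPS) -->
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
</configuration>
```

With a setting like this in place, the expectation is that s3a:// requests go to the internal host rather than to AWS.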

Hope that clarifies,

—
Caleb

>---
>Regards,
>Jonathan Aquilina
>Founder
>
>
> 
>On 2016-01-13 23:21, Phillips, Caleb wrote:
>
>Hi Billy (and others),
>
>One of the threads suggested using the core-site.xml. Did you try putting
>your configuration in there?
>
>Yes, I did try that. I've also tried setting it dynamically in, e.g.,
>Spark. I can verify that it is getting the configuration correctly:
>
>hadoop org.apache.hadoop.conf.Configuration
>
>Still, it never connects to our internal S3-compatible store and always
>connects to AWS.
>
>One thing I've noticed is that the AWS stuff is handled by an underlying
>library (I think jets3t in versions before 2.6; I forget what in 2.6+),
>and when I was trying to mess with stuff and spelunking through the
>Hadoop code, I kept running into blocks with that library.
>
>I started digging into the code. I found that the custom endpoint was
>introduced with this patch:
>
>https://issues.apache.org/jira/browse/HADOOP-11261
>
>It seems it was integrated in 2.7.0, so just to be sure I downloaded
>2.7.1, but the problem persists.
>
>That code calls this function in the AWS Java SDK:
>
>http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html#setEndpoint(java.lang.String)
>
>However, no matter what configuration I use, it still seems to connect to
>Amazon AWS. Is it possible that the AWS Java SDK cannot work with
>S3-compatible (non-AWS) stores? If so, it would seem there is currently no
>way to connect Hadoop to an S3-compatible, non-AWS store.
>
>If anyone else has any insight, particularly success using Hadoop with a
>non-AWS, S3-compatible store, please chime in!
>
>
>William Watson
>Software Engineer
>(904) 705-7056 PCS
>
>On Mon, Jan 11, 2016 at 10:39 AM, Phillips, Caleb
><Caleb.Phillips@nrel.gov> wrote:
>Hi All,
>
>Just wanted to send this out again since there was no response
>(admittedly, originally sent in the midst of the US holiday season) and it
>seems to be an issue that continues to come up (see e.g., the email from
>Han Ju on Jan 5).
>
>If anyone has successfully connected Hadoop to a non-AWS S3-compatible
>object store, it'd be very helpful to hear how you made it work. The
>fs.s3a.endpoint configuration directive appears non-functional at our site
>(with Hadoop 2.6.3).
>
>--
>Caleb Phillips, Ph.D.
>Data Scientist | Computational Science Center
>
>National Renewable Energy Laboratory (NREL)
>15013 Denver West Parkway | Golden, CO 80401
>303-275-4297 | caleb.phillips@nrel.gov
>
>On 12/22/15, 1:39 PM, "Phillips, Caleb" <Caleb.Phillips@nrel.gov> wrote:
>
>
>Hi All,
>
>New to this list. Looking for a bit of help:
>
>I'm having trouble connecting Hadoop to an S3-compatible (non-AWS) object
>store.
>
>This issue was discussed, but left unresolved, in this thread:
>
>https://mail-archives.apache.org/mod_mbox/spark-user/201507.mbox/%3CCA+0W_Au5Es_fLUgZMGwkkgA3JyA1ASi3u+isJCuYmfnTvNkGuQ@mail.gmail.com%3E
>
>And here, on Cloudera's forums (the second post is mine):
>
>https://community.cloudera.com/t5/Data-Ingestion-Integration/fs-s3a-endpoint-ignored-in-hdfs-site-xml/m-p/33694#M1180
>
>I'm running Hadoop 2.6.3 with Java 1.8 (update 65) on a Linux host. Using
>Hadoop, I'm able to connect to S3 on AWS and, e.g., list/put/get files.
>
>However, when I point the fs.s3a.endpoint configuration directive at my
>non-AWS S3-compatible object storage, it appears to still point at (and
>authenticate against) AWS.
>
>I've checked and double-checked my credentials and configuration using
>both Python's boto library and the s3cmd tool, both of which connect to
>this non-AWS data store just fine.
>
>Any help would be much appreciated. Thanks!
>
>--
>Caleb Phillips, Ph.D.
>Data Scientist | Computational Science Center
>
>National Renewable Energy Laboratory (NREL)
>15013 Denver West Parkway | Golden, CO 80401
>303-275-4297 | caleb.phillips@nrel.gov
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>For additional commands, e-mail: user-help@hadoop.apache.org
