flink-user mailing list archives

From Dominik Bruhn <domi...@dbruhn.de>
Subject Re: S3 Access in eu-central-1
Date Tue, 28 Nov 2017 21:00:52 GMT
Hey Stephan, Hey Steve,
that was the right hint: adding that option to the Java options fixed the 
problem. Maybe we should add this to the Flink wiki?
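[For the archives: the fix discussed here, as described in the Stack Overflow answer linked in the quoted mail, is to pass the AWS SDK's system property that enables V4 request signing to the Flink JVMs. A minimal sketch, assuming Flink 1.3.x configured via flink-conf.yaml:

```yaml
# Sketch, assuming Flink 1.3.x with the s3a filesystem and the
# aws-java-sdk 1.7.4 that ships alongside hadoop-aws 2.7.x: that SDK
# only signs requests with Signature Version 4 (required by
# eu-central-1) when this system property is set.
env.java.opts: -Dcom.amazonaws.services.s3.enableV4
```

The property name is the AWS SDK's `SDKGlobalConfiguration.ENABLE_S3_SIGV4_SYSTEM_PROPERTY`; the `fs.s3a.endpoint` in core-site.xml must still point at the regional endpoint, as in the configuration quoted below.]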

Thanks!

Dominik

On 28/11/17 11:55, Stephan Ewen wrote:
> Got a pointer from Steve that this is answered on Stack Overflow here: 
> https://stackoverflow.com/questions/36154484/aws-java-sdk-manually-set-signature-version
> 
> Flink 1.4 contains a specially bundled "fs-s3-hadoop" with a smaller 
> footprint, compatible across Hadoop versions, and based on a later s3a 
> and AWS SDK. With that connector, it should work out of the box because it 
> uses a later AWS SDK. You can also use it with earlier Hadoop versions 
> because the dependencies are relocated, so they should not clash or conflict.
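[For later readers, switching to the bundled connector amounts to the following sketch, assuming the standard Flink 1.4 binary distribution, where optional filesystem jars ship under opt/:

```shell
# Sketch, assuming a Flink 1.4 binary distribution: the bundled,
# relocated S3 filesystem ships as an optional jar under opt/ and is
# activated by copying it into lib/.
cp opt/flink-s3-fs-hadoop-1.4.0.jar lib/

# Checkpoints can then target the bucket directly, e.g. in
# flink-conf.yaml (hypothetical bucket name):
#   state.backend: filesystem
#   state.backend.fs.checkpointdir: s3://my-bucket/flink/checkpoints
```

Because the connector relocates its Hadoop and AWS SDK dependencies, it does not require matching the cluster's Hadoop version.]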
> 
> 
> 
> 
> On Mon, Nov 27, 2017 at 8:58 PM, Stephan Ewen <sewen@apache.org> wrote:
> 
>     Hi!
> 
>     The endpoint config entry looks correct.
>     I was looking at this issue to see if there are pointers to anything
>     else, but it looks like the explicit endpoint entry is the most
>     important thing: https://issues.apache.org/jira/browse/HADOOP-13324
> 
>     I cc-ed Steve Loughran, who is Hadoop's S3 expert (sorry Steve for
>     pulling you in again - still listening and learning about the subtle
>     bits and pieces of S3).
>     @Steve: are S3 V4 endpoints already supported in Hadoop 2.7.x, or
>     only in Hadoop 2.8?
> 
>     Best,
>     Stephan
> 
> 
>     On Mon, Nov 27, 2017 at 9:47 AM, Dominik Bruhn <dominik@dbruhn.de> wrote:
> 
>         Hey,
>         can anyone give a hint? Does anyone have Flink running with an
>         S3 bucket in Frankfurt/eu-central-1 and can share their config
>         and setup?
> 
>         Thanks,
>         Dominik
> 
>         On 22. Nov 2017, at 17:52, dominik@dbruhn.de wrote:
> 
>>         Hey everyone,
>>         I've been trying for hours to get Flink 1.3.2 (downloaded for
>>         Hadoop 2.7) to snapshot/checkpoint to an S3 bucket hosted in
>>         the eu-central-1 region. Everything works fine for other
>>         regions. I'm running my job on a JobTracker in local mode. I
>>         searched the internet and found several hints, most of them
>>         saying that setting `fs.s3a.endpoint` should solve it. It
>>         doesn't. I'm also sure that the core-site.xml (see below) is
>>         picked up: if I put garbage into the endpoint, I get a
>>         hostname-not-found error.
>>
>>         The exception I'm getting is:
>>         com.amazonaws.services.s3.model.AmazonS3Exception: Status
>>         Code: 400, AWS Service: Amazon S3, AWS Request ID:
>>         432415098B0994BC, AWS Error Code: null, AWS Error Message: Bad
>>         Request, S3 Extended Request ID:
>>         1PSDe4EOh7zvfNPdWrwoBKKOtsS/gf9atn5movRzcpvIH2WsR+ptXvXyFyEHXjDb3F9AniXgsBQ=
>>
>>         I read the AWS FAQ but I don't think that
>>         https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#ioexception-400-bad-request
>>         applies to me as I'm not running the NativeFileSystem.
>>
>>         I suspect this is related to the V4 signing protocol, which
>>         is required for S3 in Frankfurt. Could it be that the aws-sdk
>>         version is just too old? I tried to play around with it, but
>>         the Hadoop adapter is incompatible with newer versions.
>>
>>         I have the following core-site.xml:
>>
>>         <?xml version="1.0"?>
>>         <configuration>
>>          <property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3a.S3AFileSystem</value></property>
>>          <property><name>fs.s3a.buffer.dir</name><value>/tmp</value></property>
>>          <property><name>fs.s3a.access.key</name><value>something</value></property>
>>          <property><name>fs.s3a.secret.key</name><value>wont-tell</value></property>
>>          <property><name>fs.s3a.endpoint</name><value>s3.eu-central-1.amazonaws.com</value></property>
>>         </configuration>
>>
>>         Here is my lib folder with the versions of the aws-sdk and the
>>         hadoop-aws integration:
>>         -rw-------    1 root     root       11.4M Mar 20  2014
>>         aws-java-sdk-1.7.4.jar
>>         -rw-r--r--    1 1005     1006       70.0M Aug  3 12:10
>>         flink-dist_2.11-1.3.2.jar
>>         -rw-rw-r--    1 1005     1006       98.3K Aug  3 12:07
>>         flink-python_2.11-1.3.2.jar
>>         -rw-r--r--    1 1005     1006       34.9M Aug  3 11:58
>>         flink-shaded-hadoop2-uber-1.3.2.jar
>>         -rw-------    1 root     root      100.7K Jan 14  2016
>>         hadoop-aws-2.7.2.jar
>>         -rw-------    1 root     root      414.7K May 17  2012
>>         httpclient-4.2.jar
>>         -rw-------    1 root     root      218.0K May  1  2012
>>         httpcore-4.2.jar
>>         -rw-rw-r--    1 1005     1006      478.4K Jul 28 14:50
>>         log4j-1.2.17.jar
>>         -rw-rw-r--    1 1005     1006        8.7K Jul 28 14:50
>>         slf4j-log4j12-1.7.7.jar
>>
>>         Can anyone give me any hints?
>>
>>         Thanks,
>>         Dominik
> 
> 
> 
