flink-user mailing list archives

From Ayush Verma <ayushver...@gmail.com>
Subject Re: Using S3 as a sink (StreamingFileSink)
Date Sun, 18 Aug 2019 17:06:30 GMT
I would suggest you upgrade Flink to 1.7.x and flink-s3-fs-hadoop to 1.7.2.
You might be facing one of these issues:

   - https://issues.apache.org/jira/browse/FLINK-11496
   - https://issues.apache.org/jira/browse/FLINK-11302

Kind regards
Ayush Verma

On Sun, Aug 18, 2019 at 6:02 PM taher koitawala <taherk77@gmail.com> wrote:

> We used EMR version 5.20, which ships Flink 1.6.2, and all other libraries
> matched that version. So flink-s3-fs-hadoop was 1.6.2 as well.
> On Sun, Aug 18, 2019, 9:55 PM Ayush Verma <ayushverma5@gmail.com> wrote:
>> Hello, could you tell us the version of the flink-s3-fs-hadoop library that
>> you are using?
>> On Sun 18 Aug 2019 at 16:24, taher koitawala <taherk77@gmail.com> wrote:
>>> Hi Swapnil,
>>>        We faced this problem once. I think changing the checkpoint dir to
>>> HDFS and keeping the sink dir on S3, with EMRFS S3 consistency enabled, solves
>>> this problem. If you are not using EMR, then I don't know how else it can be
>>> solved. In a nutshell, EMRFS S3 consistency uses DynamoDB in
>>> the back end to track all files being written to S3. That effectively makes
>>> S3 consistent, and StreamingFileSink works just fine.
>>> On Sat, Aug 17, 2019, 3:32 AM Swapnil Kumar <swkumar@zendesk.com> wrote:
>>>> Hello, we are using Flink to process input events, aggregate them, and
>>>> write the output of our streaming job to S3 using StreamingFileSink. Whenever
>>>> we try to restore the job from a savepoint, the restoration fails with a
>>>> missing part files error. As I understand it, S3 deletes those
>>>> part (intermediate) files, so they can no longer be found on S3. Is there a
>>>> workaround for this, so that we can use S3 as a sink?
>>>> --
>>>> Thanks,
>>>> Swapnil Kumar
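
For readers following the workaround described earlier in the thread (checkpoints on HDFS, sink output on S3), the checkpoint side might look roughly like the flink-conf.yaml fragment below. This is only a sketch: the HDFS paths are placeholders that do not come from the thread, and the choice of the filesystem state backend is an assumption. EMRFS consistent view is configured separately on the EMR cluster and is not shown here.

```yaml
# flink-conf.yaml (sketch; paths are hypothetical placeholders)

# Assumption: filesystem state backend; the thread does not say which backend was used.
state.backend: filesystem

# Keep checkpoint and savepoint data on HDFS rather than S3,
# as suggested in the thread.
state.checkpoints.dir: hdfs:///flink/checkpoints
state.savepoints.dir: hdfs:///flink/savepoints
```

The StreamingFileSink output path in the job itself would still point at the S3 bucket; only the checkpoint and savepoint locations move to HDFS.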
