flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Dealing with Multiple sinks in Flink
Date Wed, 24 Aug 2016 10:49:27 GMT
Hi Vinay,

Does this only happen with the S3 file system or also with your local
file system? Could you share some example code or log output of your
running job?

Best,
Max

On Wed, Aug 24, 2016 at 4:20 AM, Vinay Patil <vinay18.patil@gmail.com> wrote:
> Hi,
>
> In our Flink pipeline we currently write data to multiple S3
> objects/folders based on some conditions. The issue I am facing is as
> follows:
>
> Consider these S3 folders:
> temp_bucket/processedData/20160823/
> temp_bucket/rawData/20160822/
> temp_bucket/errorData/20160821/
>
> When the parallelism is set to 1, the data gets written to all of the S3
> folders above, but when I set it to a larger value the data is written only
> to the first folder and not the others.
>
> I am testing the Flink job on EMR with 4 task managers having 16 slots. Even
> if I set the parallelism to 4, I face the same issue.
> (Running from the IDE gives the same result. Tested with Flink 1.0.3
> and 1.1.1.)
>
> I do not understand why this is happening.
>
>
> Regards,
> Vinay Patil
