flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Parallelism question
Date Tue, 14 Apr 2015 09:58:24 GMT
Hi Giacomo,

If I understand you correctly, you want your Flink job to execute with a
parallelism of 5. Just call setDegreeOfParallelism(5) on your
ExecutionEnvironment. This sets the default parallelism for the job, so
every operation that can run in parallel will be performed by 5 parallel
instances. The same applies to the DataSink, which will produce 5 files,
one per parallel instance, each containing that instance's output.
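
To make the idea concrete, here is a rough plain-Java sketch of "5 blocks,
5 fixed workers, results read back in block order". It is only an analogy
built on java.util.concurrent, not Flink code and not a description of how
Flink schedules work internally. The class name ParallelBlocks, the method
processInBlocks, and the doubling step are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java analogy, not Flink code. All names here are hypothetical.
class ParallelBlocks {

    // Splits data into `parallelism` contiguous blocks, processes each block
    // on its own worker thread, and returns the results in block order.
    static List<List<Integer>> processInBlocks(List<Integer> data, int parallelism)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Ceiling division so that every element lands in some block.
            int blockSize = (data.size() + parallelism - 1) / parallelism;
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int i = 0; i < parallelism; i++) {
                int from = Math.min(i * blockSize, data.size());
                int to = Math.min(from + blockSize, data.size());
                List<Integer> block = new ArrayList<>(data.subList(from, to));
                futures.add(pool.submit(() -> {
                    List<Integer> out = new ArrayList<>();
                    for (int x : block) {
                        out.add(x * 2); // stand-in for the real per-record work
                    }
                    return out;
                }));
            }
            // Workers may finish in any order. Reading the futures by index
            // restores the "block 1, block 2, ..." order.
            List<List<Integer>> results = new ArrayList<>();
            for (Future<List<Integer>> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> data = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            data.add(i);
        }
        List<List<Integer>> results = processInBlocks(data, 5);
        for (int i = 0; i < results.size(); i++) {
            System.out.println("data from process " + (i + 1) + ": " + results.get(i));
        }
    }
}
```

The point of the sketch is that the order comes from the block index, not
from the time at which each worker finishes. With 5 output files, the same
reasoning would apply to reading them back one instance at a time.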

Best,
Max


On Tue, Apr 14, 2015 at 10:38 AM, Giacomo Licari <giacomo.licari@gmail.com>
wrote:

> Hi guys,
> I have a question about how parallelism works.
>
> If I have a large dataset and I would like to divide it into 5 blocks, can
> I pass each block of data to a fixed parallel process (for example, if I
> set up 5 processes)?
>
> And if the result data from each process does not arrive at the output in
> order, can I order it? For example:
>
> data from process 1
> data from process 2
> and so on
>
> Thank you guys!
>
