flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Execution graph
Date Tue, 30 Jun 2015 08:41:21 GMT
In addition, some operators can only run with a parallelism of 1, for
example data sources based on collections and (un-grouped) all-reduces. In
some cases, the parallelism of the operators that follow is also set to 1
to avoid a network shuffle.

If you do:

env.fromCollection(myCollection)
   .map(new MyMapper())
   .groupBy(0)
   .reduce(new MyReduce())
   .writeToFile();

the data source and mapper will run with a parallelism of 1, while the
reducer and sink will run with the default parallelism.
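
To make the rule concrete, here is a small toy model in plain Java. It is
not Flink code and does not reflect how the optimizer is implemented; the
class ParallelismSketch, the Op type and its two flags are hypothetical
names invented for this sketch. It also assumes that an operator without a
shuffle always keeps its predecessor's parallelism, whereas Flink does this
only in some cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (NOT Flink code) of the parallelism rule described above. */
class ParallelismSketch {

    /** One operator in a linear pipeline. All names here are hypothetical. */
    static final class Op {
        final String name;
        final boolean forcedSingle; // can only run with parallelism 1 (e.g. collection source)
        final boolean needsShuffle; // input is repartitioned anyway (e.g. after groupBy)

        Op(String name, boolean forcedSingle, boolean needsShuffle) {
            this.name = name;
            this.forcedSingle = forcedSingle;
            this.needsShuffle = needsShuffle;
        }
    }

    /** Returns one parallelism value per operator, in pipeline order. */
    static List<Integer> assign(List<Op> pipeline, int defaultParallelism) {
        List<Integer> result = new ArrayList<>();
        int previous = defaultParallelism;
        for (Op op : pipeline) {
            int parallelism;
            if (op.forcedSingle) {
                parallelism = 1;
            } else if (op.needsShuffle) {
                // Data crosses the network anyway, so the default can be used.
                parallelism = defaultParallelism;
            } else {
                // Simplifying assumption: keep the predecessor's parallelism
                // so that no extra shuffle is introduced.
                parallelism = previous;
            }
            result.add(parallelism);
            previous = parallelism;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Op> pipeline = Arrays.asList(
                new Op("fromCollection", true, false),
                new Op("map", false, false),
                new Op("groupBy + reduce", false, true),
                new Op("sink", false, false));
        List<Integer> assigned = assign(pipeline, 8);
        for (int i = 0; i < pipeline.size(); i++) {
            System.out.println(pipeline.get(i).name + " -> parallelism " + assigned.get(i));
        }
    }
}
```

With a default parallelism of 8, the model assigns 1 to the source and the
mapper and 8 to the reducer and the sink, which matches the description
above.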

Best, Fabian

2015-06-30 10:25 GMT+02:00 Maximilian Michels <mxm@apache.org>:

> Hi Michele,
>
> If you don't set the parallelism, the default parallelism is used. For the
> visualization in the web client, a parallelism of one is used. When you run
> your example from your IDE, the default parallelism is set to the number of
> (virtual) cores of your CPU.
>
> Moreover, Flink will currently not automatically set the parallelism in a
> cluster environment. It will use the default parallelism or the user-set
> parallelism. In your example, if you set the parallelism explicitly then it
> will also show up in the visualization.
>
> Best,
> Max
>
> On Tue, Jun 30, 2015 at 7:11 AM, Michele Bertoni <
> michele1.bertoni@mail.polimi.it> wrote:
>
>> Hi, I was trying to run my program in the Flink web environment (the
>> local one).
>> When I run it, I get the graph of the planned execution, but each node
>> shows "parallelism = 1". Instead, I think it runs with parallelism 8
>> (8 cores, I always get 8 outputs).
>>
>> What does that mean?
>> Is that wrong, or is it really running with a parallelism of 1?
>>
>> Just a note: I never call setParallelism(); I leave it
>> automatic.
>>
>> thanks
>> Best
>> Michele
