flink-user mailing list archives

From Fabian Hueske <fhue...@apache.org>
Subject Re: long runtime
Date Wed, 24 Sep 2014 20:45:43 GMT
Hi,

How did you specify the degree of parallelism (DOP) for your program?
Via the command-line client, the system configuration, or some other way?

The JobManager log file (./log/*jobManager*.log) shows the DOP of
each task.

Best, Fabian

2014-09-24 18:41 GMT+02:00 Stephan Ewen <sewen@apache.org>:

> Hi!
>
> Ad-hoc, that is not easy to say. It depends on your algorithm, how much
> data replication it does...
>
> We'd need a bit of time to look into the code. It would help if you could
> roughly sketch the algorithm for us and give us a breakdown of how much
> time is spent in which operator (like a screenshot of the runtime web
> monitor).
>
> Greetings,
> Stephan
>
>
> On Wed, Sep 24, 2014 at 6:18 PM, Florian Hönicke <rockstarflo@gmail.com>
> wrote:
>
>> Hello :)
>>
>> my Flink program is extremely slow.
>> I implemented a set similarity join in Flink (Mass-Join).
>> Furthermore, I implemented a local version in Java.
>> I compared both implementations.
>> The local version needs one minute to process a 500 MB dataset.
>> My Flink program needs 5 minutes (cluster: 11 nodes, 20 000 MB RAM).
>> I use Flink version 0.6.
>> What could be the cause?
>>
>> I would welcome your response,
>> Florian Hönicke
>>
>
>
