flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Sequential/ordered map
Date Thu, 05 Jan 2017 15:18:56 GMT
Please avoid collecting the data to the client with collect(). The
operation looks convenient, but it is only meant for very small data sets.
Even if it worked for large data sets, it would be much slower and less
robust. Instead, set the parallelism of the operator to 1.
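
To illustrate why a single operator instance helps here, below is a rough
sketch in plain Java (not Flink code) of a map that depends on element order.
The class name FirstOccurrence and the method firstPositions are hypothetical
and chosen only for this example. The function keeps state across elements, so
its result is only well-defined when one instance sees every element in
order, which is roughly what an operator with setParallelism(1) would give you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FirstOccurrence {
    // For each word, return the position at which that word was first seen.
    // The HashMap is state shared across elements, so this only works if a
    // single instance processes all elements sequentially.
    // Assumption: in Flink, comparable logic would live inside a map
    // operator whose parallelism is set to 1 via setParallelism(1).
    static List<Integer> firstPositions(List<String> words) {
        Map<String, Integer> firstSeen = new HashMap<>();
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            // putIfAbsent returns null when the word has not been seen yet.
            Integer prev = firstSeen.putIfAbsent(words.get(i), i);
            out.add(prev == null ? i : prev);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [0, 1, 0, 3, 1]
        System.out.println(firstPositions(Arrays.asList("a", "b", "a", "c", "b")));
    }
}
```

If the same input were split across several parallel instances, each one
would keep its own map and could report a different "first" position for the
same word, which is why the sketch assumes a single sequential pass.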

Fabian

2017-01-05 13:18 GMT+01:00 Sebastian Neef <gehaxelt@mailbox.tu-berlin.de>:

> Hi Chesnay,
>
> thanks for the input. Finding a word's first occurrence is part of the
> algorithm.
>
> To be exact, I'm trying to implement Adler's text authorship tracking in
> Flink (http://www2007.org/papers/paper692.pdf, page 266).
>
> Thanks,
> Sebastian
>
