flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Sequential/ordered map
Date Thu, 05 Jan 2017 15:18:56 GMT
Please avoid collecting the data to the client using collect(). This
operation looks convenient, but it is only meant for very small data sets and
would be much slower and less robust even if it worked for large ones.
Instead, set the parallelism of the operator to 1.
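
A minimal sketch of what that could look like with the Java DataSet API.
The class name, the FirstOccurrence mapper, the sample words, and the output
path are made up for illustration and are not taken from the original
program. The mapper keeps its state in plain instance fields, which is only
meaningful because setParallelism(1) makes a single instance see every
record. The order in which records reach that instance still depends on the
upstream operators.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

import java.util.HashMap;
import java.util.Map;

public class SequentialMapSketch {

    // Hypothetical mapper: pairs each word with the position at which
    // that word was first seen by this mapper instance.
    public static class FirstOccurrence
            implements MapFunction<String, Tuple2<String, Long>> {

        private final Map<String, Long> firstSeen = new HashMap<>();
        private long position = 0;

        @Override
        public Tuple2<String, Long> map(String word) {
            // putIfAbsent returns null if the word was not recorded yet.
            Long earlier = firstSeen.putIfAbsent(word, position);
            long first = (earlier == null) ? position : earlier;
            position++;
            return new Tuple2<>(word, first);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Assumed input; a real job would read the text from a file.
        DataSet<String> words = env.fromElements("a", "b", "a", "c", "b");

        words.map(new FirstOccurrence())
             .setParallelism(1)                      // one instance processes all records
             .writeAsText("/tmp/first-occurrence")   // assumed output path
             .setParallelism(1);                     // write a single output file

        // Results are written by the job itself, nothing is collected to the client.
        env.execute("sequential map sketch");
    }
}
```

Only the map operator and the sink run with parallelism 1 here, so the rest
of the job can keep its default parallelism.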


2017-01-05 13:18 GMT+01:00 Sebastian Neef <gehaxelt@mailbox.tu-berlin.de>:

> Hi Chesnay,
> thanks for the input. Finding a word's first occurrence is part of the
> algorithm.
> To be exact I'm trying to implement Adler's Text authorship tracking in
> flink (http://www2007.org/papers/paper692.pdf, page 266).
> Thanks,
> Sebastian
