flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Possible use case: Simulating iterative batch processing by rewinding source
Date Mon, 11 Apr 2016 09:15:34 GMT
Flink's DataStream API can also read files from disk (local, HDFS,
etc.), so you don't have to set up Kafka to make this work. (If you
already have it, you can of course use it.)

On Mon, Apr 11, 2016 at 11:08 AM, Ufuk Celebi <uce@apache.org> wrote:

> On Mon, Apr 11, 2016 at 10:26 AM, Raul Kripalani <raulk@apache.org> wrote:
> > Would appreciate the feedback of the community. Even if it's to inform
> > that currently this iterative, batch, windowed approach is not
> > possible, that's ok!
>
> Hey Raul!
>
> What you describe should work with Flink. This is actually the way to
> go to replace your batch processor with a stream processor ;-). Rewind
> your stream and re-run the streaming job. To get accurate and
> repeatable results you have to work with event time. Otherwise, the
> results for windows will vary from run to run.
>
> The only problems I see concern how to control the different
> iterations of your program. Do the iterations have to proceed one
> after the other, e.g. first finish 1, then start 2, etc.?
>
> – Ufuk
>
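
To illustrate Ufuk's point about event time, here is a minimal sketch in
plain Java. It deliberately does not use the Flink API: the class
EventTimeWindows, the Event type and the sumPerWindow helper are all
hypothetical names made up for this example, and watermarks and late data
are ignored. It only shows the principle that, when each record is assigned
to a window by its own timestamp, replaying the same input gives the same
per-window results regardless of arrival order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Flink.
public class EventTimeWindows {

    /** A record that carries its own timestamp (event time) in milliseconds. */
    public static final class Event {
        final long timestampMs;
        final int value;

        public Event(long timestampMs, int value) {
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /**
     * Sums values per tumbling window. The window is derived from the
     * event's own timestamp, never from the wall clock at processing time.
     * The result maps window start (ms) to the sum of values in that window.
     */
    public static Map<Long, Integer> sumPerWindow(List<Event> events, long windowSizeMs) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (Event e : events) {
            long windowStart = e.timestampMs - Math.floorMod(e.timestampMs, windowSizeMs);
            sums.merge(windowStart, e.value, Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Event> firstRun = new ArrayList<>();
        firstRun.add(new Event(1_000L, 1));
        firstRun.add(new Event(4_500L, 2));
        firstRun.add(new Event(5_200L, 3));
        firstRun.add(new Event(9_999L, 4));
        firstRun.add(new Event(10_000L, 5));

        // "Rewind" the source: same events, but arriving in a different order.
        List<Event> secondRun = new ArrayList<>(firstRun);
        Collections.reverse(secondRun);

        Map<Long, Integer> a = sumPerWindow(firstRun, 5_000L);
        Map<Long, Integer> b = sumPerWindow(secondRun, 5_000L);

        System.out.println(a);           // {0=3, 5000=7, 10000=5}
        System.out.println(a.equals(b)); // true
    }
}
```

If the windows were instead keyed by the time at which each record happens
to be processed, the two runs above could place records in different
windows, which is the run-to-run variation Ufuk mentions.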
