flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anton Polyakov <polyakov.an...@gmail.com>
Subject Watermarks as "process completion" flags
Date Sun, 22 Nov 2015 19:53:15 GMT
Hi

I am very new to Flink and in fact never used it. My task (which I currently solve using home
grown Redis-based solution) is quite simple - I have a system which produces some events (trades,
it is a financial system) and computational chain which computes some measure accumulatively
over these events. Those events form a long but finite stream, they are produced as a result
of end of day flow. Computational logic forms a processing DAG which computes some measure
over these events (VaR). Each trade is processed through DAG and at different stages might
produce different set of subsequent events (like return vectors), eventually they all arrive
into some aggregator which computes accumulated measure (reducer).

Ideally I would like to process trades as they appear (i.e. stream them) and once producer
reaches end of portfolio (there will be no more trades), I need to write final resulting measure
and mark it as “end of day record”. Of course I also could use a classical batch - i.e.
wait until all trades are produced and then batch process them, but this will be too inefficient.

If I use Flink, I will need a sort of watermark saying - “done, no more trades” and once
this watermark reaches end of DAG, final measure can be saved. More generally would be cool
to have an indication at the end of DAG telling to which input stream position current measure
corresponds. 

I feel my problem is very typical yet I can’t find any solution. All examples operate either
on infinite streams where nobody cares about completion or classical batch examples which
rely on fact all input data is ready.

Can you please hint me.

Thank you vm
Anton 
Mime
View raw message