flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Stream conversion
Date Thu, 04 Feb 2016 10:33:18 GMT

If I understand you correctly, what you are looking for is a kind of
periodic batch job, where the input data for each batch is a large window.

We have actually thought about this kind of application before. It is not
on the short-term road map that we shared a few weeks ago, but since it is
asked for quite frequently, I think it will come to Flink in the mid-term
(that is, within some months).

Implementing this as a core feature is a bit of effort. A mock that writes
out the windows and triggers a batch job sounds not too difficult, actually.
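That hand-off can be sketched without Flink at all. The following is a minimal, self-contained illustration of the pattern (all class and method names are hypothetical, and plain java.nio stands in for what would really be a window function's sink and a DataSet source): one window's elements are written to a file, and a separate "batch job" later reads the file back and applies a batch-style transformation.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch of the "persist the window, then batch-process it" idea.
// In a real Flink job, writeWindow would be the output of a window function
// and batchJob would be a DataSet program reading the same files.
public class WindowToBatchSketch {

    // Stand-in for a window function's sink: persist one window's elements
    // so a later batch job can pick them up.
    static Path writeWindow(List<String> windowElements, Path dir) throws IOException {
        Path out = dir.resolve("window-" + System.nanoTime() + ".txt");
        Files.write(out, windowElements);
        return out;
    }

    // Stand-in for the batch job: read the persisted window back and apply
    // a DataSet-style transformation (here, a simple word count).
    static Map<String, Long> batchJob(Path windowFile) throws IOException {
        return Files.readAllLines(windowFile).stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("windows");
        Path windowFile = writeWindow(List.of("a b", "b c"), dir);
        System.out.println(batchJob(windowFile)); // word counts for this window
    }
}
```

The key design point is that the file is the only coupling between the two jobs, so the batch side can be scheduled independently, once per completed window.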


On Thu, Feb 4, 2016 at 10:30 AM, Sane Lee <leesane8@gmail.com> wrote:

> I also have a similar scenario. Any suggestions would be appreciated.
> On Thu, Feb 4, 2016 at 10:29 AM Jeyhun Karimov <je.karimov@gmail.com>
> wrote:
>> Hi Matthias,
>> This does not necessarily need to be in the API functions. I just want a
>> roadmap for adding this functionality. Should I save each window's data to
>> disk and create a new DataSet environment in parallel? Or maybe change the
>> trigger functionality?
>> I have large windows. As I asked in a previous question, the problem in
>> Flink with large windows (that the data inside a window may not fit in
>> memory) will be solved. So, after getting a window's data, I want to do
>> more than what the streaming API functions offer. Therefore I need to
>> convert it to a DataSet. Any roadmap would be appreciated.
>> On Thu, Feb 4, 2016 at 10:23 AM Matthias J. Sax <mjsax@apache.org> wrote:
>>> Hi Sane,
>>> Currently, the DataSet and DataStream APIs are strictly separated. Thus,
>>> this is not possible at the moment.
>>> What kind of operation do you want to perform on the data of a window?
>>> Why do you want to convert the data into a data set?
>>> -Matthias
>>> On 02/04/2016 10:11 AM, Sane Lee wrote:
>>> > Dear all,
>>> >
>>> > I want to convert the data from each window of a stream to a DataSet.
>>> > What is the best way to do that? So, while streaming, at the end of
>>> > each window I want to convert that data to a DataSet and possibly
>>> > apply DataSet transformations to it.
>>> > Any suggestions?
>>> >
>>> > -best
>>> > -sane
