flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2550) Rework DataStream API
Date Thu, 20 Aug 2015 12:20:45 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704756#comment-14704756

Aljoscha Krettek commented on FLINK-2550:

One possibility I see is that the result DataStream of a window operation is of type {{DataStream<WindowResult<T,
W>>}} (where {{W}} is a type that depends on the window being used) instead of {{DataStream<T>}}.
This would allow downstream operations to retrieve the meta-information for the window:

DataStream input = ...
DataStram result = input
  .window(10 Sec).every(2 sec)
  .map( (WindowResult<TupleX, TimeWindow> r) -> "got aggregate r.result in time window
r.window" )

Another option would be to extends {{StreamRecord}} to also hold information about the window.
This would, however, mean that we introduce overhead in every element sent, not just window
result elements.

> Rework DataStream API
> ---------------------
>                 Key: FLINK-2550
>                 URL: https://issues.apache.org/jira/browse/FLINK-2550
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>             Fix For: 0.10
> After discussions on the mailing list we arrived at a consensus to rework the streaming
API to make it more fool-proof and easier to use. The resulting design document is available
here: https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams

This message was sent by Atlassian JIRA

View raw message