flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1450) Add Fold operator to the Streaming api
Date Thu, 29 Jan 2015 15:01:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296968#comment-14296968

Aljoscha Krettek commented on FLINK-1450:

I see the necessity in the streaming API. Also, in the streaming API we don't have to worry
about making this "combinable" the way we can make a GroupReduce operation combinable in the
batch API.

When we want to add this to the batch API we have to think about how we can make this combinable.
I just thought about a similar problem: I want to have an "aggregation" that combines elements
into a list of elements, i.e:

DataSet<Tuple2<String, Integer>> in = ...
DataSet<Tuple2<String, List<Integer>> result = in.groupBy(0).aggregate(1, COLLECT)

The problem here, is that the output of the combiner would be pre-aggregated lists, this would
not match the expected reducer input of single items. I thought about providing a local combiner
as an explicit operation and then the program could be expressed as a local combine and than
a reduce with a different input type. Here, the local combiner would be a fold while the reducer
is a classic shuffled reduce.

I hope I'm making sense to you somehow. :D

> Add Fold operator to the Streaming api
> --------------------------------------
>                 Key: FLINK-1450
>                 URL: https://issues.apache.org/jira/browse/FLINK-1450
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Gyula Fora
>            Priority: Minor
>              Labels: starter
> The streaming API currently doesn't support a fold operator.
> This operator would work as the foldLeft method in Scala. This would allow effective
implementations in a lot of cases where a the simple reduce is inappropriate due to different
return types.

This message was sent by Atlassian JIRA

View raw message