flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gyula Fora <gyf...@apache.org>
Subject Re: Streaming groupby and aggregation by field expressions
Date Wed, 05 Nov 2014 18:40:21 GMT
Hi Fabian,

The link you sent is broken but I think you are referring the conversation with Viktor.

I agree that we should provide the same api for aggregations and I don’t think there is
any reason why the proposed batch approach wouldn’t work on streaming :)

When there is an agreement on the new aggregation api we will implement it for streaming.

Cheers,
Gyula

> On 05 Nov 2014, at 19:26, Fabian Hueske <fhueske@apache.org> wrote:
> 
> Hi Gyula,
> 
> great to see so much progress with Flink Streaming!
> 
> Regarding the improved aggregations, there is another effort to improve the
> aggregations of the batch API, which was recently discussed on the mailing
> list [1].
> I think it would make sense to try to keep the streaming and batch APIs
> close together. Would the proposed batch approach also work for the
> streaming case and if not what is missing?
> 
> Cheers, Fabian
> 
> [1]
> http://mail-archives.apache.org/mod_mbox/flink-dev/201410.mbox/%3C44B1AB07-F993-436F-AE23-8CC4CCC08A54%40tu-berlin.de%3E
> 
> 2014-11-05 18:37 GMT+01:00 Gyula Fóra <gyfora@apache.org>:
> 
>> Hey guys,
>> 
>> Just a quick note on some upcoming API updates for the Streaming api.
>> 
>> Now it will be possible to use field expressions for both grouping and
>> aggregations in the streaming api. You can check it out here
>> <
>> https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102
>>> 
>> .
>> 
>> Or in a concise form:
>> DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word")
>> .sum("frequency");
>> 
>> I will still do some more testing before it will be available in the master
>> branch.
>> 
>> I am also planning to extend aggregations to more fields at the same time
>> like
>> sum(1,2,2) or max("a","c").
>> 
>> Regards,
>> Gyula
>> 


Mime
View raw message