flink-user mailing list archives

From Le Xu <sharonx...@gmail.com>
Subject Re: Local combiner on each mapper in Flink
Date Wed, 25 Oct 2017 21:29:00 GMT
Thanks Kurt, I'm trying out WindowedStream aggregate right now. Just
wondering, is there any way for me to preserve the window after
aggregation? More specifically, originally I have something like:

WindowedStream<Tuple2<String, Long>, Tuple, TimeWindow> windowStream = dataStream
                .keyBy(0) //id
                .timeWindow(Time.of(windowSize, TimeUnit.MILLISECONDS));

and then for the reducer I can do:

windowStream.apply(...)

and expect the window information to be preserved.

If I were to use aggregate on the windowed stream, I would end up with
something like:

DataStream<Tuple2<String, Long>> windowStream = dataStream
                .keyBy(0) //id
                .timeWindow(Time.of(windowSize, TimeUnit.MILLISECONDS))
                .aggregate(new AggregateFunction<Tuple2<String, Long>, Accumulator, Tuple2<String, Long>>() {
                    @Override
                    public Accumulator createAccumulator() {
                        return null;
                    }

                    @Override
                    public void add(Tuple2<String, Long> stringLong, Accumulator o) {

                    }

                    @Override
                    public Tuple2<String, Long> getResult(Accumulator o) {
                        return null;
                    }

                    @Override
                    public Accumulator merge(Accumulator o, Accumulator acc1) {
                        return null;
                    }
                });

It looks like aggregate only transforms a WindowedStream into a
DataStream, so the window is gone afterwards. For a global aggregation
phase (a reducer), should I extract the window again?
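
To make the question concrete, here is a minimal plain-Java sketch (no Flink
API involved) of the two phases I have in mind: a per-key sum that tags each
result with its window start, followed by a global phase that groups on that
tag instead of windowing again. The class and method names
(WindowedPreAggregationSketch, perKeySums, perWindowTotals) and the sample
data are made up for illustration, and I am assuming tumbling windows with
non-negative millisecond timestamps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WindowedPreAggregationSketch {

    // Hypothetical input event: (id, value, timestamp in ms).
    static final class Event {
        final String id;
        final long value;
        final long timestampMs;

        Event(String id, long value, long timestampMs) {
            this.id = id;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // Hypothetical output of the per-key phase: the sum plus the window it came from.
    static final class WindowedSum {
        final String id;
        final long windowStartMs;
        final long sum;

        WindowedSum(String id, long windowStartMs, long sum) {
            this.id = id;
            this.windowStartMs = windowStartMs;
            this.sum = sum;
        }
    }

    // Phase 1: sum values per (window, id) over tumbling windows.
    // Assumes non-negative timestamps so the modulo gives the window start.
    static List<WindowedSum> perKeySums(List<Event> events, long windowSizeMs) {
        Map<Long, Map<String, Long>> sums = new TreeMap<>();
        for (Event e : events) {
            long start = e.timestampMs - (e.timestampMs % windowSizeMs);
            sums.computeIfAbsent(start, k -> new TreeMap<>()).merge(e.id, e.value, Long::sum);
        }
        List<WindowedSum> out = new ArrayList<>();
        for (Map.Entry<Long, Map<String, Long>> w : sums.entrySet()) {
            for (Map.Entry<String, Long> kv : w.getValue().entrySet()) {
                out.add(new WindowedSum(kv.getKey(), w.getKey(), kv.getValue()));
            }
        }
        return out;
    }

    // Phase 2: global total per window, grouping on the window start
    // carried in each partial result rather than assigning windows again.
    static Map<Long, Long> perWindowTotals(List<WindowedSum> partials) {
        Map<Long, Long> totals = new TreeMap<>();
        for (WindowedSum p : partials) {
            totals.merge(p.windowStartMs, p.sum, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event("a", 1, 100));
        events.add(new Event("a", 2, 900));
        events.add(new Event("b", 5, 500));
        events.add(new Event("a", 7, 1200));
        List<WindowedSum> partials = perKeySums(events, 1000);
        System.out.println(perWindowTotals(partials)); // prints {0=8, 1000=7}
    }
}
```

The point of the sketch is only that the second phase needs the window
identity to travel with each partial result; whether Flink can attach it for
me after aggregate is exactly what I am asking.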


Thanks! I apologize if this sounds like a very basic question.


Le






On Tue, Oct 24, 2017 at 4:14 AM, Kurt Young <ykt836@gmail.com> wrote:

> I think you can use WindowedStream.aggregate
>
> Best,
> Kurt
>
> On Tue, Oct 24, 2017 at 1:45 PM, Le Xu <sharonxu65@gmail.com> wrote:
>
>> Thanks Kurt. Maybe I wasn't clear before: I was wondering if Flink has an
>> implementation of a combiner in DataStream (to use after keyBy and windowing).
>>
>> Thanks again!
>>
>> Le
>>
>> On Sun, Oct 22, 2017 at 8:52 PM, Kurt Young <ykt836@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> The document you are looking at is pretty old; you can check the newest
>>> version here:
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/dataset_transformations.html
>>>
>>> Regarding your question, you can use combineGroup.
>>>
>>> Best,
>>> Kurt
>>>
>>> On Mon, Oct 23, 2017 at 5:22 AM, Le Xu <sharonxu65@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> I'm new to Flink and I'm wondering if there is an explicit local
>>>> combiner for each mapper that I can use to perform a local reduce on
>>>> each mapper. I looked at
>>>> https://ci.apache.org/projects/flink/flink-docs-release-0.8/dataset_transformations.html
>>>> but couldn't find anything that matches.
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> Le
>>>>
>>>
>>>
>>
>
