flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6649) Improve Non-window group aggregate with configurable `earlyFire`.
Date Sat, 20 May 2017 21:04:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018621#comment-16018621
] 

Fabian Hueske commented on FLINK-6649:
--------------------------------------

I agree that we need to provide a mechanism to reduce the number of rows emitted by non-windowed
aggregates.

However, I have two concerns about the proposed method:

1. IMO, the mechanism should not be called {{Early Fire}}. In my understanding, early firing
refers to enabling the emission of incomplete results from a certain point in time, for example
emitting early results for an hourly tumbling window after the first 15 minutes have passed.
In this example, {{Early Fire = -45 Minutes}} would specify that the first results for the
window may be emitted after 15 minutes (60 minutes - 45 minutes). By definition, a non-windowed
aggregation is never complete: whenever a new record for the group arrives, it has to be added
to the result. So non-windowed aggregates are always early firing, because the result never
completes and we would otherwise never emit a result. Instead of {{Early Fire}}, I would call
this mechanism {{Update Rate}} to specify how often a result is emitted.
2. The {{Update Rate}} should not be defined in terms of record count, but only in terms of time.
Given a non-windowed grouped aggregate with {{Update Rate = 10 Rows}}, we would never emit
a result for a group that only received 9 rows. By defining the update rate as a time interval,
we can effectively reduce the number of outgoing records and ensure that updates are eventually
propagated.
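
To make the second concern concrete, here is a minimal sketch in plain Python (not Flink code) that simulates a grouped COUNT under both policies. The function names {{emit_by_count}} and {{emit_by_interval}}, the timestamps in seconds, and the sample data are all assumptions made up for illustration; nothing here reflects an existing or proposed Flink API.

```python
from collections import defaultdict


def emit_by_count(events, every_n_rows):
    """Hypothetical count-based policy: emit a group's running count
    only when that group has received a multiple of N rows."""
    counts = defaultdict(int)
    emitted = []
    for ts, key in sorted(events):
        counts[key] += 1
        if counts[key] % every_n_rows == 0:
            emitted.append((ts, key, counts[key]))
    return emitted


def emit_by_interval(events, interval, end_time):
    """Hypothetical time-based policy: at every interval boundary, emit the
    running count of each group that changed since the previous boundary."""
    counts = defaultdict(int)
    dirty = set()          # groups updated since the last emission
    emitted = []
    next_fire = interval

    def fire_until(limit):
        nonlocal next_fire
        while next_fire <= limit:
            for key in sorted(dirty):
                emitted.append((next_fire, key, counts[key]))
            dirty.clear()
            next_fire += interval

    for ts, key in sorted(events):
        fire_until(ts)     # fire all boundaries at or before this row
        counts[key] += 1
        dirty.add(key)
    fire_until(end_time)   # fire the remaining boundaries
    return emitted


# Group "a" receives 20 rows (t = 0..19), group "b" only 9 rows (t = 0..8).
events = [(t, "a") for t in range(20)] + [(t, "b") for t in range(9)]

by_count = emit_by_count(events, every_n_rows=10)
by_interval = emit_by_interval(events, interval=10, end_time=30)

print(by_count)     # group "b" never appears
print(by_interval)  # group "b" is emitted at the first boundary
```

In this simulation both policies emit far fewer than the 29 results that per-row emission would produce, but only the time-based policy ever emits a result for group "b".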

What do you think [~sunjincheng121]?

> Improve Non-window group aggregate with configurable `earlyFire`.
> -----------------------------------------------------------------
>
>                 Key: FLINK-6649
>                 URL: https://issues.apache.org/jira/browse/FLINK-6649
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>
> Currently, the non-windowed group aggregate is early firing at count(1), that is, every row
> emits an aggregate result. But sometimes users want to configure the count (`early firing
> with count[N]`) to reduce the downstream pressure. This JIRA will enable the configuration of
> `earlyFiring` for non-windowed group aggregates.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
