flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6373) Add runtime support for distinct aggregation over grouped windows
Date Fri, 28 Apr 2017 07:51:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988376#comment-15988376 ]

ASF GitHub Bot commented on FLINK-6373:

Github user fhueske commented on the issue:

    Hi @haohui, 
    I suggested before that PR #3771 might be used for DISTINCT group window functions. However,
this does not work because we cannot register state for an AggregateFunction. The benefit
of the approach in #3771 would have been that it does not need to deserialize the whole Map every
time a record is accumulated (or retracted). Instead, the distinct values are kept in a MapState
that can be accessed (and deserialized) per lookup key. But this approach does not work with
the AggregateFunction that we use for early aggregation.
    To be honest, I'm a bit concerned about the performance of the approach in this PR because
the state of the DistinctAccumulator (i.e., the complete map) will be de/serialized
every time we access it.
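
    To make the concern concrete, here is a minimal sketch of a map-backed distinct accumulator with accumulate and retract. It is an illustration only, not the DistinctAccumulator from this PR: the class name DistinctCountAccumulator and its methods are hypothetical, and it uses plain Java collections rather than Flink state. The point is that the whole `seen` map is the accumulator, so if the accumulator is stored as a single state value, the full map has to be read and written on every access.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Flink's DistinctAccumulator and not tied to any Flink API.
class DistinctCountAccumulator<T> {
    // value -> number of times the value is currently present in the window
    private final Map<T, Long> seen = new HashMap<>();

    // Returns true if the value was not present before, i.e. the wrapped
    // aggregate should accumulate it.
    boolean accumulate(T value) {
        return seen.merge(value, 1L, Long::sum) == 1L;
    }

    // Returns true if the last occurrence of the value was removed, i.e. the
    // wrapped aggregate should retract it.
    boolean retract(T value) {
        Long count = seen.get(value);
        if (count == null) {
            return false; // value was never accumulated
        }
        if (count == 1L) {
            seen.remove(value);
            return true;
        }
        seen.put(value, count - 1);
        return false;
    }

    // Number of distinct values currently held.
    long distinctCount() {
        return seen.size();
    }
}
```

    With a MapState, each accumulate or retract call would touch only the entry for one key. With the accumulator above, the cost of each access grows with the number of distinct values held in the map.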
    I think we can use this approach for now, but we should check whether we can use an approach
similar to the batch side, where distinct aggregations (on different keys) are translated into
multiple aggregations that are later joined together (the join would be rather cheap because
it's a 1-to-1 join).
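
    The batch-style alternative can be sketched as follows. This is a simplified illustration on in-memory rows, assuming each row is a String array of the form {groupKey, a, b}. The class DistinctByJoin and its methods are hypothetical and do not correspond to any planner rule or Flink API. Each distinct column gets its own aggregation per group key, and the results are then joined on the group key.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of "one aggregation per distinct column, then join".
class DistinctByJoin {
    // Computes COUNT(DISTINCT column) per group key (row[0] is the group key).
    static Map<String, Long> distinctCount(List<String[]> rows, int column) {
        Map<String, Set<String>> valuesPerKey = new HashMap<>();
        for (String[] row : rows) {
            valuesPerKey.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[column]);
        }
        Map<String, Long> result = new TreeMap<>();
        valuesPerKey.forEach((key, values) -> result.put(key, (long) values.size()));
        return result;
    }

    // Joins the two per-key results. Both sides were computed from the same
    // rows, so each key appears exactly once on each side (a 1-to-1 join).
    static Map<String, long[]> join(Map<String, Long> left, Map<String, Long> right) {
        Map<String, long[]> joined = new TreeMap<>();
        for (Map.Entry<String, Long> entry : left.entrySet()) {
            long rightValue = right.get(entry.getKey());
            joined.put(entry.getKey(), new long[] {entry.getValue(), rightValue});
        }
        return joined;
    }
}
```

    Calling distinctCount(rows, 1) and distinctCount(rows, 2) and passing both results to join yields one output row per group key with both distinct counts. Because every key matches exactly one row on the other side, the join does no fan-out.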
    I'll have a look at this PR later today.
    Thanks, Fabian

> Add runtime support for distinct aggregation over grouped windows
> -----------------------------------------------------------------
>                 Key: FLINK-6373
>                 URL: https://issues.apache.org/jira/browse/FLINK-6373
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
> This is a follow-up task for FLINK-6335. FLINK-6335 enables parsing of distinct aggregations
over grouped windows. This JIRA tracks the effort of adding runtime support for the query.

This message was sent by Atlassian JIRA
