flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-7799) Improve performance of windowed joins
Date Thu, 02 Aug 2018 13:16:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fabian Hueske updated FLINK-7799:
---------------------------------
    Priority: Minor  (was: Critical)

> Improve performance of windowed joins
> -------------------------------------
>
>                 Key: FLINK-7799
>                 URL: https://issues.apache.org/jira/browse/FLINK-7799
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: Fabian Hueske
>            Assignee: Xingcan Cui
>            Priority: Minor
>
> The performance of windowed joins can be improved by changing the state access patterns.
> Right now, rows are inserted into a MapState with their timestamp as key. Since we use
> a time resolution of 1ms, the full key space of the state must be iterated and many map
> entries must be accessed when joining or evicting rows.
> A better strategy would be to divide time into larger intervals and register each row
> in its respective interval. Another benefit is that we can access the state entries
> directly, because we know exactly which timestamps to look up. Hence, we can limit state
> access to the relevant section during joining and state eviction.
> A good interval size still needs to be identified and might depend on the size of the
> window.
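
The interval idea described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the implementation in Flink:
a plain HashMap stands in for MapState, rows are reduced to a timestamp and a
String payload, and every name (BucketedRowCache, bucketStart, rowsBetween,
evictBefore) is hypothetical. The bucket size is passed in as a parameter,
since the issue leaves the right value open.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: a HashMap stands in for Flink's MapState; all names are hypothetical. */
final class BucketedRowCache {

    /** A row reduced to its timestamp and an opaque payload (assumption for this sketch). */
    private static final class Row {
        final long timestamp;
        final String payload;

        Row(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    private final long bucketSizeMs;
    // Key: start timestamp of the interval; value: all rows that fall into it.
    private final Map<Long, List<Row>> buckets = new HashMap<>();
    // Start of the earliest interval that may still hold rows.
    private long earliestBucket = Long.MAX_VALUE;

    BucketedRowCache(long bucketSizeMs) {
        if (bucketSizeMs <= 0) {
            throw new IllegalArgumentException("bucketSizeMs must be positive");
        }
        this.bucketSizeMs = bucketSizeMs;
    }

    /** Rounds a timestamp down to the start of its interval (also for negative values). */
    long bucketStart(long timestamp) {
        return Math.floorDiv(timestamp, bucketSizeMs) * bucketSizeMs;
    }

    void add(long timestamp, String payload) {
        long key = bucketStart(timestamp);
        buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(new Row(timestamp, payload));
        earliestBucket = Math.min(earliestBucket, key);
    }

    /** Returns payloads with lower <= timestamp <= upper, probing only the relevant keys. */
    List<String> rowsBetween(long lower, long upper) {
        List<String> result = new ArrayList<>();
        for (long key = bucketStart(lower); key <= upper; key += bucketSizeMs) {
            List<Row> bucket = buckets.get(key);
            if (bucket == null) {
                continue;
            }
            // Boundary intervals can hold rows outside the range, so filter exactly.
            for (Row row : bucket) {
                if (row.timestamp >= lower && row.timestamp <= upper) {
                    result.add(row.payload);
                }
            }
        }
        return result;
    }

    /** Drops every interval that lies entirely before the threshold. */
    void evictBefore(long threshold) {
        if (earliestBucket == Long.MAX_VALUE) {
            return; // nothing stored yet
        }
        long key = earliestBucket;
        while (key + bucketSizeMs <= threshold) {
            buckets.remove(key);
            key += bucketSizeMs;
        }
        earliestBucket = buckets.isEmpty() ? Long.MAX_VALUE : key;
    }

    int bucketCount() {
        return buckets.size();
    }

    public static void main(String[] args) {
        BucketedRowCache cache = new BucketedRowCache(1000L);
        cache.add(1500L, "a");
        cache.add(2000L, "c");
        System.out.println(cache.rowsBetween(1500L, 2000L));
    }
}
```

With a 1000 ms bucket, a range lookup over two seconds probes at most three
keys instead of iterating every stored 1 ms timestamp. The trade-off is that
an interval straddling the eviction threshold is kept whole until it has
expired completely, which is one reason the interval size may need to depend
on the window size.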



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
