beam-commits mailing list archives

From "Mark Shields (JIRA)" <>
Subject [jira] [Commented] (BEAM-175) Leak garbage collection timers in GlobalWindow
Date Wed, 20 Apr 2016 20:53:25 GMT


Mark Shields commented on BEAM-175:

Getting back to this now.

Realized that customers who run simple pubsub-to-bigquery pipelines with dynamic table names
(e.g. one per day) will also hit this issue, as we accumulate timers for each (table spec, random
shard) key in the Reshuffle's GBK.

> Leak garbage collection timers in GlobalWindow
> ----------------------------------------------
>                 Key: BEAM-175
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-core
>            Reporter: Mark Shields
>            Assignee: Mark Shields
> Consider the transform:
>   Window
>     .into(new GlobalWindows())
>     .triggering(
>       Repeatedly.forever(
>         AfterProcessingTime.pastFirstElementInPane().plusDelayOf(...)))
>     .discardingFiredPanes()
> This is a common idiom for 'process elements bunched by arrival time'.
> Currently we create an end-of-window timer per key, which clearly will only fire if the pipeline is drained.
> Better would be to avoid creating end-of-window timers if there's no state which needs to be processed at end-of-window (i.e. at drain, in the case of the Global window).
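
The leak described above can be illustrated with a small self-contained toy model. This is only a sketch and not Beam's runner-core code: the class GcTimerSketch, its methods, and the key format are hypothetical, and the "lazy" policy (keep an end-of-window timer only while a key still has buffered state) is one assumed reading of the proposal. Shard numbers are deterministic here rather than random, to keep the result reproducible.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of per-key end-of-window timers in the global window.
// It does not use or mirror any Beam API.
final class GcTimerSketch {
    // true: keep one end-of-window timer per key forever (behaviour reported in the issue).
    // false: keep the timer only while the key has buffered state (assumed fix).
    private final boolean eagerTimers;
    // Buffered element count per key, standing in for per-key window state.
    private final Map<String, Integer> buffered = new HashMap<>();
    // Keys that currently hold a pending end-of-window timer.
    private final Set<String> endOfWindowTimers = new HashSet<>();

    GcTimerSketch(boolean eagerTimers) {
        this.eagerTimers = eagerTimers;
    }

    // An element arrives: state becomes non-empty and a timer is registered.
    void onElement(String key) {
        buffered.merge(key, 1, Integer::sum);
        endOfWindowTimers.add(key);
    }

    // The processing-time trigger fires with discarding panes: state is cleared.
    void onPaneFired(String key) {
        buffered.remove(key);
        if (!eagerTimers) {
            // Nothing is left to process at end-of-window, so drop the timer.
            endOfWindowTimers.remove(key);
        }
    }

    int pendingTimers() {
        return endOfWindowTimers.size();
    }

    // One table per day, several shards per table, one pane per (table, shard) key.
    static int simulate(boolean eagerTimers, int days, int shardsPerTable) {
        GcTimerSketch model = new GcTimerSketch(eagerTimers);
        for (int day = 0; day < days; day++) {
            for (int shard = 0; shard < shardsPerTable; shard++) {
                String key = "table_day" + day + "#" + shard;
                model.onElement(key);
                model.onPaneFired(key);
            }
        }
        return model.pendingTimers();
    }
}
```

In this model, simulate(true, 30, 10) leaves 300 pending timers, one per (table, shard) key ever seen, while simulate(false, 30, 10) leaves none, because every key's state has been discarded by the time its pane has fired.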

This message was sent by Atlassian JIRA
