beam-commits mailing list archives

From "Aviem Zur (JIRA)" <>
Subject [jira] [Assigned] (BEAM-1294) Long running UnboundedSource Readers via Broadcasts
Date Mon, 13 Feb 2017 05:19:41 GMT


Aviem Zur reassigned BEAM-1294:

    Assignee: Aviem Zur  (was: Amit Sela)

> Long running UnboundedSource Readers via Broadcasts
> ---------------------------------------------------
>                 Key: BEAM-1294
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Aviem Zur
> When reading from an UnboundedSource, the current implementation causes each split to
> create a new Reader in every micro-batch.
> As long as the overhead of creating a reader is relatively low, this is reasonable (though
> I'd still be happy to get rid of it), but when the creation overhead is large it becomes
> unreasonable, forcing large batches.
> One way to solve this could be to create a pool of lazy-init readers to serve each executor,
> maybe via Broadcast variables.
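
The pool of lazy-init readers proposed above could look roughly like the sketch below. It is only an illustration in plain Java, not the Spark runner's code: LazyReaderPool, readerFor, size and closeAll are hypothetical names, and the reader type is reduced to AutoCloseable so the example stays self-contained. It assumes one pool instance lives in each executor JVM and that a split is identified by some stable key; how the reader factory reaches the executors (for example via a Broadcast variable) is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical per-executor pool: creates one reader per split on first use
 * and returns the same instance in later micro-batches.
 */
final class LazyReaderPool<K, R extends AutoCloseable> {
    // One entry per split that has been read at least once in this JVM.
    private final Map<K, R> readers = new ConcurrentHashMap<>();
    // Assumed to be the expensive step (opening connections, seeking, ...).
    private final Function<K, R> factory;

    LazyReaderPool(Function<K, R> factory) {
        this.factory = factory;
    }

    /** Returns the cached reader for the split, creating it only on first request. */
    R readerFor(K splitId) {
        return readers.computeIfAbsent(splitId, factory);
    }

    /** Number of readers currently held by the pool. */
    int size() {
        return readers.size();
    }

    /** Closes and evicts every reader, e.g. when the executor shuts down. */
    void closeAll() throws Exception {
        for (K key : readers.keySet()) {
            R reader = readers.remove(key);
            if (reader != null) {
                reader.close();
            }
        }
    }
}
```

With this shape, each micro-batch would call readerFor(splitId) instead of constructing a reader, so the creation cost is paid once per split per executor rather than once per batch. Restoring a reader from a checkpoint after a failure is not handled here and would need separate treatment.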

This message was sent by Atlassian JIRA
