beam-commits mailing list archives

From "Adam Levy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2715) Expose PubsubSource to create UnboundedSource and utilize withMaxNumRecords from BoundedReadFromUnboundedSource
Date Wed, 16 Aug 2017 21:13:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129425#comment-16129425 ]

Adam Levy commented on BEAM-2715:
---------------------------------

In the previous version of Beam, we used maxNumRecords with the DirectRunner to do end-to-end
testing locally against a small sample of production data. We cannot simply run a streaming
pipeline locally because our data volume is extremely high, so an unbounded source quickly
runs out of memory. TestStream does not read from Pubsub at all, so it is not useful for
end-to-end testing.

> Expose PubsubSource to create UnboundedSource and utilize withMaxNumRecords from BoundedReadFromUnboundedSource
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-2715
>                 URL: https://issues.apache.org/jira/browse/BEAM-2715
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-direct
>            Reporter: Adam Levy
>            Assignee: Thomas Groh
>              Labels: pubsub
>
> We are ingesting from a Pubsub Read and are attempting to mimic the maxNumRecords that
> was available in 0.6.0. In order to do this we would need to utilize withMaxNumRecords from
> the BoundedReadFromUnboundedSource class. We would need to utilize the PubsubSource class
> to create the UnboundedSource from Pubsub. Would it be possible to expose PubsubSource? Currently
> what is the recommended way to create a bounded read from Pubsub with a set number of records?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
