beam-user mailing list archives

From Robin Qiu <robi...@google.com>
Subject Re: Pipeline design including (sub / nested)-pipelines
Date Thu, 16 Aug 2018 16:20:06 GMT
Hi Pascal,

As far as I know, you can't create a sub-pipeline within a DoFn, i.e. nested
pipelines are not supported.

Best,
Robin

On Thu, Aug 16, 2018 at 7:03 AM Pascal Gula <pascal@plantix.net> wrote:

> As a bonus, here is a simplified diagram view of the use-case:
>
> Cheers,
> Pascal
>
>
> On Thu, Aug 16, 2018 at 3:12 PM, Pascal Gula <pascal@plantix.net> wrote:
>
>> Hello,
>> I am currently evaluating Apache Beam (to be executed later on Google
>> Dataflow), and for the first use case I am working on, I have a design
>> question, in case any of you has already faced a similar one.
>> Namely, we have a DB describing dashboard views, and for each view we
>> would like to perform some aggregation transform.
>> My first approach would be to create a higher-level pipeline that
>> fetches all view configurations from our MongoDB (BTW, we released a
>> MongoDB IO connector here: https://pypi.org/project/beam-extended/). With
>> this views PCollection, the idea is to have a ParDo with a DoFn that
>> creates a sub-pipeline to perform the aggregation on data from our plant
>> database, using a query derived from the view configuration. Afterwards,
>> the idea is to save, in the higher-level pipeline, some performance/data
>> metrics related to the execution of the array of sub-pipelines.
>> The main question is: are nested pipelines supported by the runner?
>> I hope that my description was clear enough. I will work on a diagram
>> view in the meantime.
>> Very best regards,
>> Pascal
>>
>> --
>>
>> Pascal Gula
>> Senior Data Engineer / Scientist
>> +49 (0)176 34232684
>> www.plantix.net <http://plantix.net/>
>>  PEAT GmbH
>> Kastanienallee 4
>> 10435 Berlin // Germany
>> Download the App! <https://play.google.com/store/apps/details?id=com.peat.GartenBank>
>>
>>
>
>
>
>
