From: Sourabh Bajaj
Date: Fri, 21 Jul 2017 21:10:53 +0000
Subject: Re: [Python] Replace ParDo's DoFn after both are constructed
To: user@beam.apache.org

Hi,

Is it possible to create

class ErrorSieve(PTransform):
    def __init__(self, dofn):
        self.dofn = dofn

    def expand(self, pcoll):
        # modifiedDoFn stands for self.dofn with the extra logic added
        return pcoll | ParDo(modifiedDoFn)

That way your pipeline just looks like p | ErrorSieve(DoFn()) and you don't expose the ParDo to the user.

Will this work for your use case?

-Sourabh

On Fri, Jul 21, 2017 at 2:06 PM Dmitry Demeshchuk wrote:

> Hi list,
>
> I'm trying to make a transformation function (let's call it ErrorSieve)
> that would take a ParDo object as input and modify its underlying DoFn
> object, basically adding extra logic on top of an underlying process()
> method.
>
> Ideally for me, the example usage would be:
>
> ```python
> p | ErrorSieve(beam.ParDo(MyDoFn()))
>
> # or
>
> p | ErrorSieve(beam.FlatMap(lambda x: x + 1))
> ```
>
> However, this would require me to butcher the internals of ParDo
> mechanisms, especially since ParDo's make_fn() method gets called during
> its transformation. My other thought was to make it a fair and square DoFn:
>
> ```python
> p | beam.ParDo(ErrorSieve(MyDoFn()))
> ```
>
> The only problem with this is that I can't use it with transforms like
> FlatMap, which is a bit unfortunate.
>
> Do you think it's worth investigating how to implement the first approach,
> or should I just settle for the second approach, using only custom
> DoFns?
>
> Thank you.
>
> --
> Best regards,
> Dmitry Demeshchuk.
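
The core idea in this thread, adding extra logic on top of an underlying process() method, can be illustrated without Beam itself. What follows is a minimal sketch in plain Python, not Beam's API: MyDoFn, ErrorSieveFn, run, and the ("ok", ...) / ("error", ...) tagging convention are all hypothetical names chosen for illustration. In a real pipeline the wrapper would presumably subclass beam.DoFn and be applied with beam.ParDo inside a PTransform's expand(), along the lines suggested in the reply above.

```python
class MyDoFn:
    """Hypothetical stand-in for a user DoFn: any object with process(element)."""

    def process(self, element):
        # Raises ZeroDivisionError when element == 0.
        yield 10 // element


class ErrorSieveFn:
    """Hypothetical wrapper that adds error handling around an inner process()."""

    def __init__(self, inner):
        self._inner = inner

    def process(self, element):
        try:
            # Materialize the outputs so that an exception raised lazily by a
            # generator is caught here rather than by the caller.
            results = list(self._inner.process(element) or [])
        except Exception as exc:
            # Assumed convention: tag the failure instead of re-raising it.
            yield ("error", element, type(exc).__name__)
            return
        for result in results:
            yield ("ok", result)


def run(fn, elements):
    """Tiny driver standing in for a runner: feeds each element to fn.process."""
    out = []
    for element in elements:
        out.extend(fn.process(element))
    return out


outputs = run(ErrorSieveFn(MyDoFn()), [1, 0, 5])
print(outputs)
```

Because the wrapper only relies on the inner object having a process() method, it composes with any DoFn-like class. It does not, by itself, address the FlatMap case raised in the original question, since that would need access to the function held inside the transform.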