From: Romain Manni-Bucau <rmannibucau@gmail.com>
Date: Sun, 18 Feb 2018 19:24:49 +0100
Subject: Re: @TearDown guarantees
To: dev@beam.apache.org
Reply-To: dev@beam.apache.org
References: <02c86b31-70c9-a574-81ca-b1e2a000ac8d@nanthrax.net>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="f403043842b491e1c8056580b276"

--f403043842b491e1c8056580b276
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

2018-02-18 19:19 GMT+01:00 Eugene Kirpichov <kirpichov@google.com>:

> FinishBundle has a stronger guarantee: if the pipeline succeeded, then it
> has been called for every succeeded bundle, and succeeded bundles together
> cover the entire input PCollection. Of course, it may not have been called
> for failed bundles.
> To anticipate a possible objection "why not also keep retrying Teardown
> until it succeeds" - because if Teardown wasn't called on a DoFn instance,
> it's because the instance no longer exists and there's nothing to call it
> on.
>
> Please take a look at implementations of WriteFiles and BigQueryIO.read()
> and write() to see how cleanup of heavyweight resources (large number of
> temp files, temporary BigQuery datasets) can be achieved reliably to the
> extent possible.
>

Do you mean passing state across the fn and having a fn responsible for the
cleanup? Kind of making the teardown a processElement? This is a nice
workaround but it is not always possible, as mentioned. Ismael even has a
nice case where this just fails and teardown would work - it was with AWS,
not a BigQuery bug, but the same design.

>
> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rmannibucau@gmail.com>
> wrote:
>
>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <kirpichov@google.com>:
>>
>>> "Machine state" is overly low-level because many of the possible reasons
>>> can happen on a perfectly fine machine.
>>> If you'd like to rephrase it to "it will be called except in various
>>> situations where it's logically impossible or impractical to guarantee that
>>> it's called", that's fine. Or you can list some of the examples above.
>>>
>>
>> Sounds ok to me
>>
>>
>>>
>>> The main point for the user is, you *will* see non-preventable
>>> situations where it couldn't be called - it's not just intergalactic
>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>> amount of temporary files, shutting down a large number of VMs you started
>>> etc), you have to express it using one of the other methods that have
>>> stricter guarantees (which obviously come at a cost, e.g. no
>>> pass-by-reference).
>>>
>>
>> FinishBundle has the exact same guarantee, sadly, so I am not sure which other
>> method you are speaking about. Concretely, if you make it really unreliable -
>> this is what best effort sounds like to me - then users can't use it to clean
>> anything, but if you make it "can happen but it is unexpected and means
>> something happened" then it is fine to have a manual - or auto if fancy -
>> recovery procedure. This is where it makes all the difference and impacts the
>> developers, ops (all users basically).
>>
>>
>>>
>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> Agree Eugene except that "best effort" means that. It is also often
>>>> used to say "at will" and this is what triggered this thread.
>>>>
>>>> I'm fine using "except if the machine state prevents it" but "best
>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>> (like I did).
>>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau | Blog
>>>> | Old Blog
>>>> | Github
>>>> | LinkedIn
>>>> | Book
>>>>
>>>>
>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <kirpichov@google.com>:
>>>>
>>>>> It will not be called if it's impossible to call it: in the example
>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>> cases: e.g. in case the worker container has crashed (e.g. user code in a
>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>> crash due to user code OOM, in case the worker has lost network
>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>> busy with other stuff (e.g. calling other Teardown functions) until the
>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>
>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>> call Teardown in a situation where it was possible to call it but the
>>>>> runner made insufficient effort.
>>>>>
>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <kirpichov@google.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <klk@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>> be better able to help you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Take a simple example of a transform requiring a connection. Using
>>>>>>>>> bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>> doesn't allow you to release the connection since it is a best effort thing.
>>>>>>>>> Not releasing the connection makes you pay a lot - AWS ;) - or prevents you
>>>>>>>>> from launching other processing - concurrency limit.
>>>>>>>>>
>>>>>>>>
>>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>>> a long time when everything at the other end has died?
>>>>>>>>
>>>>>>>>
>>>>>>>> You assume connections are kind of stateless but some (proprietary)
>>>>>>>> protocols require some closing exchanges which are not only "I'm leaving".
>>>>>>>>
>>>>>>>> For AWS I was thinking about starting some services - machines - on
>>>>>>>> the fly at pipeline startup and closing them at the end. If teardown is
>>>>>>>> not called you leak machines and money. You can say it can be done another
>>>>>>>> way...as the full pipeline ;).
>>>>>>>>
>>>>>>>> I don't want to be picky but if Beam can't handle its components'
>>>>>>>> lifecycle it can't be used at scale for generic pipelines and is bound to
>>>>>>>> some particular IO.
>>>>>>>>
>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>> crash case which can't be handled by any human system? Nothing technically.
>>>>>>>> Why do you push to not handle it? Is it due to some legacy code on Dataflow
>>>>>>>> or something else?
>>>>>>>>
>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>
>>>>>>
>>>>>> Remove "best effort" from the javadoc. If it is not called then it is a
>>>>>> bug and we are done :).
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Also what does it mean for the users? Direct runner does it so if a
>>>>>>>> user uses the RI in test, he will get a different behavior in prod? Also
>>>>>>>> don't forget the user doesn't know what the IOs he composes use, so this has
>>>>>>>> such an impact on the whole product that it must be handled IMHO.
>>>>>>>>
>>>>>>>> I understand the portability culture is new in the big data world but
>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>> before doing it right ;).
>>>>>>>>
>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>> experimental/unsupported in the API. I know Spark and Flink can, any
>>>>>>>> unknown blocker for other runners?
>>>>>>>>
>>>>>>>> Technical note: even a kill should go through Java shutdown hooks,
>>>>>>>> otherwise your environment (the software enclosing Beam) is fully unhandled and
>>>>>>>> your overall system is uncontrolled. The only case where this is not true is when
>>>>>>>> the software is always owned by a vendor and never installed in a customer
>>>>>>>> environment. In this case it is up to the vendor to handle the Beam API and
>>>>>>>> not up to Beam to adjust its API for a vendor - otherwise all features
>>>>>>>> unsupported by one runner should be made optional, right?
>>>>>>>>
>>>>>>>> Not all state is about the network, even in distributed systems, so it
>>>>>>>> is key to have an explicit and defined lifecycle.
>>>>>>>>
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>

--f403043842b491e1c8056580b276--
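
The disagreement in this thread over "best effort" versus a guaranteed teardown can be illustrated with a small plain-Java sketch. It does not use the Beam API: `TeardownSketch`, `ConnectionWorker` and `run` are hypothetical names introduced only for illustration. The sketch assumes a runner-like loop that calls teardown in a `finally` block, which covers failures in user code, and it registers a JVM shutdown hook, which runs on normal exit, `System.exit` or SIGTERM (a plain `kill`) but not after SIGKILL (`kill -9`), `Runtime.halt` or a JVM crash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TeardownSketch {

    // Hypothetical stand-in for a DoFn-like component; this is NOT the Beam API.
    static class ConnectionWorker {
        final List<String> log = new ArrayList<>();

        void setup() {
            log.add("setup: open connection");
        }

        void process(String element) {
            if (element.equals("boom")) {
                throw new IllegalStateException("user code failed");
            }
            log.add("process: " + element);
        }

        void teardown() {
            log.add("teardown: close connection");
        }
    }

    // Runs setup, processes the elements, and calls teardown in a finally block,
    // so teardown runs after success and after an exception thrown by user code.
    // It cannot run if the process itself dies before reaching the finally block.
    public static List<String> run(List<String> elements) {
        ConnectionWorker worker = new ConnectionWorker();
        worker.setup();
        try {
            for (String element : elements) {
                worker.process(element);
            }
        } catch (RuntimeException e) {
            worker.log.add("failure: " + e.getMessage());
        } finally {
            worker.teardown();
        }
        return worker.log;
    }

    public static void main(String[] args) {
        // Shutdown hooks run on normal exit, System.exit, or SIGTERM.
        // They do not run after SIGKILL, Runtime.halt, or a JVM crash.
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> System.out.println("shutdown hook ran")));

        System.out.println(run(Arrays.asList("a", "b")));
        System.out.println(run(Arrays.asList("a", "boom", "c")));
    }
}
```

In this sketch the failing run still ends with the teardown entry, which matches the case where the instance survives a user-code failure. The segfault, OOM, preemption and hardware failure cases listed in the thread correspond to the process dying outside the reach of both the `finally` block and the shutdown hook.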