From: Pramod Immaneni
Date: Mon, 28 Sep 2015 10:30:13 -0700
Subject: Re: dynamic application properties proposal
To: Timothy Farkas
Cc: dev@apex.incubator.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c0bab889318c0520d20e07

--001a11c0bab889318c0520d20e07
Content-Type: text/plain; charset=UTF-8

One possible optimization: the steps below are needed only when there is
more than one input operator. In the more common case of a single input
operator, the property change tuple can be inserted at the next possible
window without temporarily pausing the flow.

On Mon, Sep 28, 2015 at 10:27 AM, Timothy Farkas wrote:

> Furthermore, this approach is not limited to DAGs with a single input
> operator. In the case where a DAG has multiple input operators, property
> changes can be set within the same window across all input operators by
> enforcing some synchronization at the input operator level when setting the
> property. This synchronization would look like the following:
>
>    1. When receiving a property change request, ask all input operators to
>    stop and send their current window.
>    2. Take the max window + 1 (not technically correct but you get the
>    idea).
>    3. Send the property change request to all the input operators and tell
>    them to apply the change at the maximum window ID + 1.
>    4. Resume the input operators.
>
> This ensures that the change is applied at the same window ID and also
> ensures that the change is applied at a window ID that the input operators
> have never played before. Therefore property changes will not interfere
> with the idempotence of operators.
>
>
> On Mon, Sep 28, 2015 at 9:17 AM, Pramod Immaneni
> wrote:
>
>> Apex supports modification of operator properties at runtime, but the
>> current implementation has the following shortcomings.
>>
>> 1. The property is not set across all partitions on the same window, as
>> individual partitions can be on different windows when the property change
>> is initiated from the client, resulting in inconsistent data for those
>> windows. I am being generous in using the word inconsistent.
>> 2. Sometimes properties need to be set on more than one logical operator
>> at the same time to achieve the change the user is seeking. Today these
>> will be two separate changes happening on two different windows, again
>> resulting in inconsistent data for some windows. These would need to happen
>> as a single transaction.
>> 3. If there is an operator failure before a committed checkpoint after an
>> operator property is dynamically changed, the operator will restart with
>> the old property and the change will not be re-applied.
>>
>> Tim and I did some brainstorming and we have a proposal to overcome
>> these shortcomings. The main problem in all the above cases is that the
>> property changes are happening out-of-band of data flow and hence
>> independent of windowing.
>> The proposal is to bring the property change
>> requests into the in-band dataflow so that they are handled consistently
>> with windowing and in a distributed manner.
>>
>> The idea is to inject a special property change tuple, containing the
>> property changes and the identification information of the operators they
>> affect, into the dataflow at the input operator. The tuple will be injected
>> at a window boundary, after end window and before begin window, and as this
>> tuple flows through the DAG the intended operators' properties will be
>> modified. They will all be modified consistently at the same window. The
>> tuple can contain more than one property change for more than one logical
>> operator, and the changes will be applied consistently to the different
>> logical operators at the same window. In case of failure, the replay of
>> tuples will ensure that the property change gets reapplied at the correct
>> window.
>>
>> Please give your feedback and input on what you think about this proposal.
>>
>> Thanks
>>
>
>
--001a11c0bab889318c0520d20e07--
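
The four synchronization steps quoted in the thread (pause the input
operators, take the max window + 1, schedule the change, resume) can be
sketched as a small simulation. This is only an illustration under stated
assumptions: `InputOperator`, `apply_property_change`, and the `threshold`
property are hypothetical names invented for this sketch and are not Apex
APIs. It also inherits the simplification the thread itself flags, namely
that "max window + 1" is not technically correct.

```python
from dataclasses import dataclass, field


@dataclass
class InputOperator:
    """Hypothetical stand-in for an input operator; not an Apex class."""
    name: str
    current_window: int
    paused: bool = False
    # window id -> property changes scheduled to take effect at that window
    pending: dict = field(default_factory=dict)
    properties: dict = field(default_factory=dict)

    def pause(self) -> int:
        """Stop emitting and report the window currently being played."""
        self.paused = True
        return self.current_window

    def schedule(self, window_id: int, changes: dict) -> None:
        """Remember a change to apply when window_id begins."""
        self.pending[window_id] = dict(changes)

    def resume(self) -> None:
        self.paused = False

    def advance(self) -> None:
        """Move to the next window. A change scheduled for that window is
        applied at the boundary, before the new window begins."""
        if self.paused:
            return
        self.current_window += 1
        self.properties.update(self.pending.pop(self.current_window, {}))


def apply_property_change(operators, changes) -> int:
    """Steps 1-4 from the thread; returns the window the change targets."""
    windows = [op.pause() for op in operators]   # 1. stop, collect windows
    target = max(windows) + 1                    # 2. max window + 1
    for op in operators:                         # 3. schedule at target
        op.schedule(target, changes)
    for op in operators:                         # 4. resume
        op.resume()
    return target


if __name__ == "__main__":
    ops = [InputOperator("a", 10), InputOperator("b", 12)]
    target = apply_property_change(ops, {"threshold": 5})
    while any(op.current_window < target for op in ops):
        for op in ops:
            if op.current_window < target:
                op.advance()
    print(target, [op.properties for op in ops])
```

Because the target window is strictly greater than every reported window,
no operator has played it yet, which is the property the thread relies on
for idempotence.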