Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id D18B2200B82 for ; Fri, 16 Sep 2016 19:02:47 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id D02BA160AC4; Fri, 16 Sep 2016 17:02:47 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 20F91160AB7 for ; Fri, 16 Sep 2016 19:02:46 +0200 (CEST) Received: (qmail 67127 invoked by uid 500); 16 Sep 2016 17:02:46 -0000 Mailing-List: contact dev-help@apex.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@apex.apache.org Delivered-To: mailing list dev@apex.apache.org Received: (qmail 67115 invoked by uid 99); 16 Sep 2016 17:02:46 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 16 Sep 2016 17:02:46 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 9C9F3C0B4C for ; Fri, 16 Sep 2016 17:02:45 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.279 X-Spam-Level: * X-Spam-Status: No, score=1.279 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=datatorrent-com.20150623.gappssmtp.com Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id KYcaIaaTWHBX for ; Fri, 16 Sep 2016 17:02:43 +0000 (UTC) Received: from mail-qk0-f173.google.com (mail-qk0-f173.google.com [209.85.220.173]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id 5EAF35FB83 for ; Fri, 16 Sep 2016 17:02:43 +0000 (UTC) Received: by mail-qk0-f173.google.com with SMTP id h8so88709881qka.1 for ; Fri, 16 Sep 2016 10:02:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=datatorrent-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=Tz7jFk/TNfHrB6tE/eBau1yM7AvZQnYvrUlj2cuJTZM=; b=1ahP0EzDOklGTlFrp/eP4S2VcqyldJvZOnD4/AGKruFlOljd/h/xnrNEAC2ca8iI87 gquq9msSa9kETzpLAzmqqG6pIpW2ipnFhNEaCOkx7CakWGLfLwVojSJqhGzh/+figcm+ j471i6zF27X63k7nCacc4Brr5Vy8JxOISeEV78q7GquUN3kWJ/MaAW/mbFnjUXUM5oLO mklplRyVYjty8Y/WLPbVsIn7HiB1rbNMf+IA8pD0KHwCdfxWZRqedxc6sUjPjsWJVrl5 tnyVnFMerydrX41IJ/l9Y+po5oqb/XRnaVSRLPbOtdKcHWvwyyb0SRCMrU/jklZ28fu/ tWiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=Tz7jFk/TNfHrB6tE/eBau1yM7AvZQnYvrUlj2cuJTZM=; b=fGzj+KTf0cJA2JiJPvzggsHPigHSAM0yuJK3I+JrMd10V51uLXTbtCgji6zVagqZyc ionsPy13q/Vy+JmAapIVgDFnTzaqvzfOJxyvj1V47ggvjqmyQRv33pKzta/S2dFI1yfi xXuirWDWRbvWqnoprxFnu3BbrsfyRs90qLmHdUAEfr0URdnFErBYk2izlKCLverB3wu4 nmw8hs6+QSjCgwKutczglnCwyOMaVnGzXduIfbprBwfyVjcokCHPrK/niRTwAE1ZYijK 3YBJ485qGMh1NnXrXbysbM2CAjyDn9ecmMJB8Dr1nHREWNBTocgoRtJ2tBmYEwxs0M0y 4z3w== X-Gm-Message-State: AE9vXwMQprpuMjWl7+yx6ZBMk9XrGwkKErd5pUGLxgjBEVRTUUa4j11HK8sSYKMeBYW0p5FJaL8K9BXFklQa5Rpc X-Received: by 10.55.195.152 with SMTP id r24mr17705922qkl.28.1474045362727; Fri, 16 Sep 2016 10:02:42 -0700 (PDT) MIME-Version: 1.0 Received: by 10.55.45.129 with HTTP; Fri, 16 Sep 2016 10:02:42 -0700 (PDT) In-Reply-To: References: From: David Yan Date: Fri, 16 Sep 2016 10:02:42 -0700 Message-ID: Subject: Re: Watermark generation in Windowed Operators To: dev@apex.apache.org Content-Type: multipart/alternative; boundary=001a1147a1a23a89f3053ca2ee40 archived-at: Fri, 16 Sep 2016 17:02:48 -0000 --001a1147a1a23a89f3053ca2ee40 Content-Type: text/plain; charset=UTF-8 I think in theory, the watermark should be sent by the input operator since the input should have the knowledge of the criteria of lateness since it can depend on many factors like the time of the day, the source of the data (e.g. mobile data), that the WindowedOperator should in general make no assumption about. However, I think it's possible to implement some kind of watermark generation in the WindowedOperator itself if that knowledge is not available from the input. It's actually already doing that if you call the setFixedWatermark method, which will generate a watermark tuple, with a timestamp that is based on the derived time from the streaming window id, downstream for each streaming window. It's possible to add the support of heuristic watermark generation as well and you're welcome to take that up. For the Windowed Join operator, the watermark generated for downstream depends on the watermark arriving from each input stream, and it's not just a simple propagate. Shunxin can comment more on this. David On Thu, Sep 15, 2016 at 11:21 PM, Chinmay Kolhatkar wrote: > Hi All, > > I was looking at Windowed Operator APIs and have to mention they're pretty > nicely done. > > I have a question related to watermark generation. > > What I understood is that for completion of processing of an event window > one has provision for sending of watermark tuple from some previous stage > in the DAG. I want to know who should be doing that and when should be it > done. > > For e.g. I saw a PR of Windows Join Operator in apex-malhar and I would > like to use it in my application. Can someone give me an example of how a > DAG will look like with this operator which has a stage which generates > watermark? And how should that stage decide on when to generate a watermark > tuple? > > -Chinmay. > --001a1147a1a23a89f3053ca2ee40--