Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 18CA4200C36 for ; Fri, 10 Mar 2017 13:21:27 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 1488C160B79; Fri, 10 Mar 2017 12:21:27 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 5B5C8160B69 for ; Fri, 10 Mar 2017 13:21:26 +0100 (CET) Received: (qmail 56798 invoked by uid 500); 10 Mar 2017 12:21:25 -0000 Mailing-List: contact dev-help@apex.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@apex.apache.org Delivered-To: mailing list dev@apex.apache.org Received: (qmail 56785 invoked by uid 99); 10 Mar 2017 12:21:25 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 10 Mar 2017 12:21:24 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 8D6C21A085B for ; Fri, 10 Mar 2017 12:21:24 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.629 X-Spam-Level: ** X-Spam-Status: No, score=2.629 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id ohm1XHKt9ajr for ; Fri, 10 Mar 2017 12:21:22 +0000 (UTC) Received: from mail-ua0-f178.google.com (mail-ua0-f178.google.com [209.85.217.178]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id B11F15F1EE for ; Fri, 10 Mar 2017 12:21:21 +0000 (UTC) Received: by mail-ua0-f178.google.com with SMTP id 72so111142606uaf.3 for ; Fri, 10 Mar 2017 04:21:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=eSKYp4EKVBaAgV6PHTLsyNpm7iXk/UndfHqnGbxlx5A=; b=MgYiToiVoiTdIi1qyJEAo8pbT5y8RFAf5vlK7BKzhDTi4IBy9KBjtnHcGKG50nyh8X vEDocWo/Letpenm9bnC1HgitN7wndLuNd6XQe7oWaAskIl6Ute4ZfMZGusPv0sjPqGO7 8pAQeM3AVAEdU8434LJKEJ+OLgL8euiHHR0YpYO4GTlneS406oj2NOt07Ulgiag9cMjB TrZWFhB9v2OZXHRwNQW3djapZPjn/ffgTSuaGlgto8w/nLikOWatMOMpzvx3EXz/Arch BNp31WKBeoatOm1a/ziRLm8xP8jISaQqAWqE5GxZ7C3g6RqfR51Hi1vsicHy2TqR4y/c 54qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=eSKYp4EKVBaAgV6PHTLsyNpm7iXk/UndfHqnGbxlx5A=; b=uBR3cbPlEDcobFwqpz0Rd2l2IX1upfJAicHUNpS5WCG79qUQ9yrz/GthCUxTZ1tORC gFWPV8ymCoxqjUFsQWJcPnGLyQPamFTIAGeYlgZuMCzgFGdEscxvWsMyBNUv7Jl0EYoh SI2Q3BTeT6LcXbO3/FSiB2E/cxrPdMvKw7RxQKrHZmDyZL2vZfluJkvpuC8FYGS4nLp1 lXL3PGc9yKwIM4zZzMVHhZoGMdTbZDlVd7DkO5IOsXpUld8E8417PgFE2lGt3vRuGz0U jiBzvkVx/VXajftiKt7aq95NrIk0InoYzzhjIc7HYi/x3GZbjf7eC72Lom3nonO7YV/5 ePyg== X-Gm-Message-State: AMke39nwChy8Oi0NPvQVs6VPQq/DrwZ82rkmdYlQJjIr+OXTyvZOxJNKwO5xxZK0Nbsn0GGCdeC9tRwuak5SPA== X-Received: by 10.31.98.66 with SMTP id w63mr7897325vkb.30.1489148480567; Fri, 10 Mar 2017 04:21:20 -0800 (PST) MIME-Version: 1.0 Received: by 10.103.121.131 with HTTP; Fri, 10 Mar 2017 04:21:20 -0800 (PST) In-Reply-To: References: From: AJAY GUPTA Date: Fri, 10 Mar 2017 17:51:20 +0530 Message-ID: Subject: Re: Watermark tuples in Apex To: dev@apex.apache.org Content-Type: multipart/alternative; boundary=94eb2c091dc233c514054a5f662a archived-at: Fri, 10 Mar 2017 12:21:27 -0000 --94eb2c091dc233c514054a5f662a Content-Type: text/plain; charset=UTF-8 Hi Bhupesh, For point 1, cant we make use of implicitWatermarkGenerator? Ajay On Wed, Mar 8, 2017 at 12:16 PM, Bhupesh Chawda wrote: > Hi All, > > Watermark tuples in Apex are very tightly coupled to event time processing. > For this reason, usually they are modeled as having a timestamp. > > public interface WatermarkTuple > { > long getTimestamp(); > } > > Even though, watermarks are meant for such time related processing, I think > we should expand the concept of watermarks for the following types: > > 1. Labelled watermarks > This could be useful in scenarios where instead of a timestamp (which is an > ordered field), we have categorical values. For example, consider tuples > which are labeled by city names. For each city, we want to have separate > windows and isolate the processing. If the watermark returns a different > city name, we end the previous window and start a new one. Or, in this case > we could make use of both high and low watermarks indicating the start and > end of a city's data. This could mean having multiple windows' data > incoming at the same time. > > 2. Ordered watermarks > Instead of having the ordered field as time, why not consider something > like an Ordered Watermark. TimeBased Watermarks could extend from that. > An ordered watermark could be used in case we have a sequence of data > tuples and we need to demarcate every n tuples. Even though we can say that > every n tuples the window is definitely closed, but the decision is made > only when the upstream sends the watermark tuple. The windowed operator > does not have any clue about it. It blindly opens and closes windows based > on watermarks received from upstream. This could mean different windows may > have different values of n. > > Please let me know your thoughts on this. > > ~ Bhupesh > --94eb2c091dc233c514054a5f662a--