From: Tyler Akidau
Date: Fri, 21 Apr 2017 00:57:42 +0000
Subject: Towards a spec for robust streaming SQL, Part 1
To: dev@beam.apache.org, dev@flink.apache.org, dev@calcite.apache.org

Hello Beam, Calcite, and Flink dev lists!

Apologies for the big cross post, but I thought this might be something all three communities would find relevant.

Beam is finally making progress on a SQL DSL utilizing Calcite, thanks to Mingmin Xu. As you can imagine, we need to come to some conclusion about how to elegantly support the full suite of streaming functionality in the Beam model via Calcite SQL.
You folks in the Flink community have been pushing on this (e.g., adding windowing constructs, amongst others, thank you! :-), but from my understanding we still don't have a full spec for how to support robust streaming in SQL (including but not limited to, e.g., a triggers analogue such as EMIT).

I've been spending a lot of time thinking about this and have some opinions about how I think it should look that I've already written down, so I volunteered to try to drive forward agreement on a general streaming SQL spec between our three communities (well, technically I volunteered to do that w/ Beam and Calcite, but I figured you Flink folks might want to join in since you're going that direction already anyway and will have useful insights :-).

My plan was to do this by sharing two docs:

1. The Beam Model : Streams & Tables - This one is for context, and really only mentions SQL in passing. But it describes the relationship between the Beam Model and the "streams & tables" way of thinking, which turns out to be useful in understanding what robust streaming in SQL might look like. Many of you probably already know some or all of what's in here, but I felt it was necessary to have it all written down in order to justify some of the proposals I wanted to make in the second doc.

2. A streaming SQL spec for Calcite - The goal for this doc is that it would become a general specification for what robust streaming SQL in Calcite should look like. It would start out as a basic proposal of what things *could* look like (combining both what things look like now as well as a set of proposed changes for the future), and we could all iterate on it together until we get to something we're happy with.

At this point, I have doc #1 ready, and it's a bit of a monster, so I figured I'd share it and let folks hack at it with comments if they have any, while I try to get the second doc ready in the meantime.
As part of getting doc #2 ready, I'll be starting a separate thread to try to gather input on what things are already in flight for streaming SQL across the various communities, to make sure the proposal captures everything that's going on as accurately as it can.

If you have any questions or comments, I'm interested to hear them. Otherwise, here's doc #1, "The Beam Model : Streams & Tables":

http://s.apache.org/beam-streams-tables

-Tyler