Mailing-List: contact dev-help@flink.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@flink.apache.org
MIME-Version: 1.0
From: Robert Metzger
Date: Thu, 5 Feb 2015 18:18:44 +0100
Subject: Fwd: [jira] [Created] (MRQL-66) Add support for MRQL streaming in Flink streaming mode
To: dev@flink.apache.org
Content-Type: multipart/alternative; boundary=001a11c25f724c1ffd050e5a8082

--001a11c25f724c1ffd050e5a8082
Content-Type: text/plain; charset=UTF-8

Just FYI from the MRQL mailing list. Maybe someone from the streaming folks wants to give some advice or help.

---------- Forwarded message ----------
From: Leonidas Fegaras (JIRA)
Date: Thu, Feb 5, 2015 at 5:56 PM
Subject: [jira] [Created] (MRQL-66) Add support for MRQL streaming in Flink streaming mode
To: dev@mrql.incubator.apache.org


Leonidas Fegaras created MRQL-66:
------------------------------------

             Summary: Add support for MRQL streaming in Flink streaming mode
                 Key: MRQL-66
                 URL: https://issues.apache.org/jira/browse/MRQL-66
             Project: MRQL
          Issue Type: New Feature
          Components: Run-Time/Flink, Streaming
    Affects Versions: 0.9.6
            Reporter: Leonidas Fegaras
            Priority: Critical


The new extension, MRQL Streaming, works fine with Spark Streaming (see MRQL-63), but it would be nice if we made it work with Flink Streaming too.

It was easy to make it work with Spark Streaming: the data in one sliding window of a Spark DStream is viewed as an RDD, so a DStream can be viewed as a continuous sequence of RDDs. A DStream has a method foreachRDD that applies a function to each RDD in the stream. So, to implement MRQL Streaming, we just had to pass the MRQL Spark evaluator (a function from RDD to RDD) as the argument to foreachRDD.

For Flink Streaming, the implementation will be more complicated. A Flink Streaming DataStream doesn't provide a hook to a DataSet object. I am guessing that this is because Flink Streaming is far more general than Spark Streaming (it's not just sliding windows) and because Flink Streaming needs to do special optimizations. So we need to copy the FlinkEvaluator class into a new class, FlinkStreaming, and change all methods to operate on DataStream instead of DataSet. Many DataSet methods have an equivalent in DataStream, but some are missing.

I have already provided the input formats for streaming (method FlinkStreaming.stream_source), but we need to write a stream evaluator for MRQL plans. Any volunteers?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

--001a11c25f724c1ffd050e5a8082--
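
The micro-batch idea described for Spark Streaming in the forwarded message can be sketched as a toy model in plain Python. This is only an illustration under stated assumptions: it does not use Spark, Flink, or MRQL, and every name in it (Batch, sliding_windows, evaluate_batch, for_each_batch) is hypothetical. A plain list stands in for one RDD, and the "evaluator" is an arbitrary batch function chosen for the example. Results are collected into a list purely so they can be inspected.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-in for one RDD: the contents of a single window.
Batch = List[int]


def sliding_windows(events: Iterable[int], size: int, slide: int) -> Iterator[Batch]:
    """Yield the contents of each full sliding window over a finite event sequence."""
    buf = list(events)
    for start in range(0, len(buf) - size + 1, slide):
        yield buf[start:start + size]


def evaluate_batch(batch: Batch) -> Batch:
    """Hypothetical batch evaluator: keep the even values and double them."""
    return [2 * x for x in batch if x % 2 == 0]


def for_each_batch(windows: Iterable[Batch],
                   evaluator: Callable[[Batch], Batch]) -> List[Batch]:
    """Apply the same batch evaluator to every window, one window at a time."""
    return [evaluator(window) for window in windows]


# Six events, windows of four events that advance by two events.
results = for_each_batch(sliding_windows(range(1, 7), size=4, slide=2), evaluate_batch)
print(results)
```

The point of the sketch is that an unchanged batch function can be reused per window when the stream is exposed as a sequence of batches. Where the stream API offers no such per-window batch handle, as the message says of DataStream, the evaluator itself has to be rewritten against stream operators.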