From: Srikanth Sundarrajan
To: "dev@falcon.apache.org"
Subject: RE: Streaming Feed
Date: Thu, 12 Feb 2015 21:20:57 +0530

This has been an idea lingering in my mind for a while. I will be very supportive of any effort to create a stream abstraction along the lines of feed (or this may not be required, if we do a major overhaul with respect to orchestration in Falcon, where the tight requirement of a feed having a frequency is done away with) and have processes work with these streams. In that case, the orchestration should happen through Nimbus or Spark Master instead of Oozie. In other words:

* Feed/Stream to be a primitive entity in Falcon which declares that there is a continuous flow of data as per a schema and is not bound to any arrival periodicity
* Replication/Mirroring on this would essentially use standard data transport mechanisms to ship data, also in a streaming fashion
* Processes that are defined over these continuous streams are to be orchestrated over an appropriate engine such as Nimbus (in the case of Storm) or a similar system. Processes that are defined in this way also don't have periodicity and are continuous.

This topic requires more conversation before we figure out the way forward. I am assuming more than one of us is thinking about this.

Regards
Srikanth Sundarrajan

> Date: Wed, 11 Feb 2015 15:30:47 +0530
> Subject: Re: Streaming Feed
> From: sharad@apache.org
> To: dev@falcon.apache.org
>
> Thanks Jean, this will be quite useful. I am wondering if this will require
> a new partitioning construct in the feed as well like micro-batches, etc.
>
> Sharad
>
> On Wed, Feb 11, 2015 at 2:34 PM, Jean-Baptiste Onofré wrote:
>
> > Hi Sharad,
> >
> > I sent an e-mail last week about support of Spark (SparkStreaming) in
> > workflow/process. It's basically very close to what you propose.
> >
> > IMHO, it should be a new impl of workflow or at least the support of a new
> > kind of processes (it's what I have in mind).
> >
> > Regards
> > JB
> >
> >
> > On 02/11/2015 09:38 AM, Sharad Agarwal wrote:
> >
> >> I am looking for a generic schema aware feed construct for streaming
> >> workflow. The schema can be managed by a catalog service like HCatalog.
> >> The streaming workflow executor would be a system like
> >> Storm/SparkStreaming/Samza.
> >>
> >> I want to know if this is the right thing to be supported in Falcon and if
> >> yes what is the plugging interface for that. Would this be a new
> >> implementation of workflow engine ?
> >>
> >> Thanks
> >> Sharad
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com