falcon-dev mailing list archives

From Srikanth Sundarrajan <srik...@hotmail.com>
Subject RE: Streaming Feed
Date Thu, 12 Feb 2015 15:50:57 GMT
This has been an idea lingering in my mind for a while. I would be very supportive of any effort
to create a stream abstraction along the lines of a feed (or this may not be required if we
do a major overhaul of orchestration in Falcon, where the tight requirement that a feed must
have a frequency is done away with) and have processes work with these streams. In that case
the orchestration should happen through Nimbus or the Spark Master instead of Oozie.
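To make the stream abstraction concrete, here is a minimal sketch of what such an entity might
carry. This is purely illustrative; the names (StreamEntity, schemaRef, transport) are my own
assumptions and not part of Falcon today:

// Purely illustrative sketch: StreamEntity is NOT an existing Falcon type.
// The point is what a stream declares (schema, cluster, transport) and
// what it deliberately omits (frequency, validity window, late-arrival policy).
class StreamEntity {
    final String name;
    final String cluster;    // cluster on which the stream is available
    final String schemaRef;  // e.g. a reference to an HCatalog table holding the schema
    final String transport;  // assumed transport for streaming replication/mirroring, e.g. Kafka

    StreamEntity(String name, String cluster, String schemaRef, String transport) {
        this.name = name;
        this.cluster = cluster;
        this.schemaRef = schemaRef;
        this.transport = transport;
    }
}

The key point is the absence of any frequency attribute.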

In other words:
* Feed/Stream becomes a primitive entity in Falcon which declares that there is a continuous
flow of data conforming to a schema, not bound to any arrival periodicity
* Replication/Mirroring of such a stream would essentially use standard data transport mechanisms to
ship data in a streaming fashion as well
* Processes defined over these continuous streams are orchestrated on an
appropriate engine such as Nimbus (in the case of Storm) or a similar system. Processes
defined in this way also don't have a periodicity and are continuous (a rough sketch follows below).
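A minimal sketch of what such a pluggable orchestrator could look like, assuming a hypothetical
SPI; none of these names (StreamingOrchestrator, StreamingProcess) exist in Falcon today, and
the real integration point would need to be worked out against the current workflow engine
abstraction:

// Hypothetical SPI sketch, not Falcon's actual workflow engine interface.
// A Storm-backed implementation would talk to Nimbus; a Spark-backed one
// would submit a long-running streaming job to the Spark master.
public interface StreamingOrchestrator {

    /** Submits a continuous process; returns an engine-specific handle (e.g. a Storm topology id). */
    String submit(StreamingProcess process);

    /** Stops the running topology/job identified by the handle. */
    void kill(String handle);

    /** Reports whether the engine still considers the process to be running. */
    boolean isAlive(String handle);
}

/** Minimal description of a process over continuous streams: note there is no frequency or schedule. */
class StreamingProcess {
    final String name;
    final String inputStream;   // name of the stream entity it consumes
    final String outputStream;  // name of the stream entity it produces
    final String engine;        // e.g. "storm" or "spark-streaming"

    StreamingProcess(String name, String inputStream, String outputStream, String engine) {
        this.name = name;
        this.inputStream = inputStream;
        this.outputStream = outputStream;
        this.engine = engine;
    }
}

Submitting a StreamingProcess once would hand it over to the engine, which then keeps it
running continuously; there is no recurring scheduling as with Oozie coordinators.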

This topic requires more conversation before we figure out the way forward. I'm assuming more
than one of us is thinking about this.

Regards
Srikanth Sundarrajan

> Date: Wed, 11 Feb 2015 15:30:47 +0530
> Subject: Re: Streaming Feed
> From: sharad@apache.org
> To: dev@falcon.apache.org
> 
> Thanks Jean, this will be quite useful. I am wondering if this will require
> a new partitioning construct in the feed as well, like micro-batches, etc.
> 
> Sharad
> 
> On Wed, Feb 11, 2015 at 2:34 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> wrote:
> 
> > Hi Sharad,
> >
> > I sent an e-mail last week about support for Spark (SparkStreaming) in
> > workflow/process. It's basically very close to what you propose.
> >
> > IMHO, it should be a new implementation of the workflow engine, or at least
> > support for a new kind of process (that's what I have in mind).
> >
> > Regards
> > JB
> >
> >
> > On 02/11/2015 09:38 AM, Sharad Agarwal wrote:
> >
> >> I am looking for a generic schema-aware feed construct for streaming
> >> workflows. The schema can be managed by a catalog service like HCatalog.
> >> The streaming workflow executor would be a system like
> >> Storm/SparkStreaming/Samza.
> >>
> >> I want to know if this is the right thing to be supported in Falcon and,
> >> if yes, what the plugin interface for that would be. Would this be a new
> >> implementation of the workflow engine?
> >>
> >> Thanks
> >> Sharad
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > jbonofre@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >