falcon-dev mailing list archives

From Sharad Agarwal <sha...@apache.org>
Subject Re: Streaming Feed
Date Fri, 13 Feb 2015 03:53:14 GMT
Sounds good. I think we should talk more about this in our next
contributors' meetup.


On Thu, Feb 12, 2015 at 9:20 PM, Srikanth Sundarrajan <sriksun@hotmail.com>
wrote:

> This has been an idea lingering in my mind for a while. I will be very
> supportive of any effort to create a stream abstraction along the lines of
> feed (this may not be required if we do a major overhaul of orchestration
> in Falcon that does away with the tight requirement of a feed having a
> frequency) and have processes work with these streams. In that case the
> orchestration should happen through Nimbus or Spark Master instead of
> Oozie.
>
> In other words:
> * Feed/Stream to be a primitive entity in Falcon which declares that there
> is a continuous flow of data as per a schema and is not bound to any arrival
> periodicity
> * Replication/Mirroring on this would essentially use standard data
> transport mechanisms to ship data, also in a streaming fashion
> * Processes that are defined over these continuous streams are to be
> orchestrated over an appropriate engine such as Nimbus (in case of Storm)
> or a similar system. Processes that are defined in this way also don't have
> periodicity and are continuous.
>
> This topic requires more conversation before we figure out the way forward.
> I am assuming more than one of us is thinking about this.
>
> Regards
> Srikanth Sundarrajan
>
> > Date: Wed, 11 Feb 2015 15:30:47 +0530
> > Subject: Re: Streaming Feed
> > From: sharad@apache.org
> > To: dev@falcon.apache.org
> >
> > Thanks Jean, this will be quite useful. I am wondering if this will
> > require a new partitioning construct in the feed as well like
> > micro-batches, etc.
> >
> > Sharad
> >
> > On Wed, Feb 11, 2015 at 2:34 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> > wrote:
> >
> > > Hi Sharad,
> > >
> > > I sent an e-mail last week about support of Spark (SparkStreaming) in
> > > workflow/process. It's basically very close to what you propose.
> > >
> > > IMHO, it should be a new impl of workflow or at least the support of a
> > > new kind of processes (it's what I have in mind).
> > >
> > > Regards
> > > JB
> > >
> > >
> > > On 02/11/2015 09:38 AM, Sharad Agarwal wrote:
> > >
> > >> I am looking for a generic schema aware feed construct for streaming
> > >> workflow. The schema can be managed by a catalog service like HCatalog.
> > >> The streaming workflow executor would be a system like
> > >> Storm/SparkStreaming/Samza.
> > >>
> > >> I want to know if this is the right thing to be supported in Falcon and
> > >> if yes what is the plugging interface for that. Would this be a new
> > >> implementation of workflow engine ?
> > >>
> > >> Thanks
> > >> Sharad
> > >>
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > jbonofre@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
>
>
