falcon-dev mailing list archives

From Srikanth Sundarrajan <srik...@hotmail.com>
Subject RE: [DISCUSS] Orchestration in Falcon
Date Mon, 22 Dec 2014 04:05:16 GMT
@Ajay,

> Do we intend to write our own scheduler from scratch or extend/move to
> another one?

This, I guess, is something for all of us to discuss later, if there is wide consensus
and willingness to move away from the current implementation.

Regards
Srikanth Sundarrajan

> From: ajaynsit@gmail.com
> Date: Mon, 22 Dec 2014 08:44:24 +0530
> Subject: Re: [DISCUSS] Orchestration in Falcon
> To: dev@falcon.incubator.apache.org
> 
> +1
> 
> This gives us more flexibility and control. I have also seen the following
> capabilities in other scheduling systems. We can see if any of these make
> sense for us.
> 
>    1. NOT scheduling on certain days listed in a registered calendar (e.g.
>    Business Holidays)
>    2. Repeated with delayed intervals
> 
> 
> Do we intend to write our own scheduler from scratch or extend/move to
> another one?
> 
> 
> On Mon, Dec 22, 2014 at 8:10 AM, Srikanth Sundarrajan <sriksun@hotmail.com>
> wrote:
> 
> >
> >
> >
> > @Shaik, Items 1 & 6 are solved cleanly by Oozie scheduling capabilities
> > (and in fact the only ones solved).  Am not too sure if 2 and 7 are solved
> > well.
> >
> > To cite a few examples for 2 & 7 which are hard to achieve:
> >
> > 1. First Monday of the month
> > 2. Closing day of the month
> > 3. Fixed day of every month (say 7th) {key thing to note here is the
> > non-uniform spacing in the monthly cycle}
> > 4. At least 6 instances in a day (which is different from saying 1-6 are
> > mandatory and rest are optional)
> >
> > @Venkatesh has been pushing to do at least items 1-3 above for eternity.
> >
> > There is an additional motivation: In delegating control to Oozie for
> > orchestration, Falcon finds it hard to manage and control pipeline
> > executions. I will try and enumerate them shortly on the same thread for us
> > to ponder over.
> >
> > Regards
> > Srikanth Sundarrajan
> >
> > > From: psychidris@gmail.com
> > > Date: Mon, 22 Dec 2014 01:52:04 +0530
> > > Subject: Re: [DISCUSS] Orchestration in Falcon
> > > To: dev@falcon.incubator.apache.org
> > >
> > > Hi,
> > >
> > > I agree that Oozie has certain limitations with respect to orchestration;
> > > recently Oozie users have raised similar concerns regarding Point 2 (which
> > > is taken care of by Falcon by extending Oozie EL, not for scheduling but at
> > > least for consuming the input set).
> > >
> > > Aren't Points 1, 2, 6 and 7 already solved by using the optional input
> > > mechanism in Falcon? I understand that users still need to specify a
> > > frequency for the process. A few use cases/examples would really help.
> > >
> > > Thanks,
> > > -Idris
> > >
> > >
> > > On Sun, Dec 21, 2014 at 7:43 PM, Srikanth Sundarrajan <sriksun@hotmail.com>
> > > wrote:
> > >
> > > > Hello Team,
> > > >
> > > > Since its inception Falcon has used Oozie for process orchestration as
> > > > well as feed life cycle phase executions. While this has worked reasonably
> > > > well and allowed us to make higher-level capabilities available through
> > > > Falcon, we are increasingly seeing scenarios where this is proving to be a
> > > > limiting factor. In its current form, Falcon relies on Oozie for both
> > > > scheduling and workflow execution, due to which scheduling is limited to
> > > > time-based/cron-based scheduling with additional gating conditions on data
> > > > availability. This also imposes the restriction that datasets be
> > > > periodic/cyclic in nature.
> > > >
> > > > From an orchestration standpoint, it would help if we can support
> > > > standard gating / scheduling primitives via Falcon:
> > > >
> > > > 1. Simple periodic scheduling with no gating conditions
> > > > 2. Cron based scheduling (day of week, day of the month, specific hours
> > > > and non-periodic) with no gating conditions
> > > > 3. Availability of new data (assuming monotonically increasing data
> > > > version, availability of new versions)
> > > > 4. Changes to existing data (reinstatement - similar to late data handling)
> > > > 5. External trigger/notifications
> > > > 6. Availability of specific instances of data declared as mandatory
> > > > dependency
> > > > 7. Availability of a minimum subset of instances of data declared as
> > > > mandatory dependency (e.g. at least 10 hourly instances of a day with 24
> > > > instances)
> > > > 8. Valid combinations of the above.
> > > >
> > > > In this context, I would like to propose that we move away from Oozie for
> > > > the orchestration requirements and have them implemented natively within
> > > > Falcon. It will no doubt make Falcon server bulkier and heavier in both
> > > > code and deployment, but it seems that without it, the orchestration
> > > > within Falcon will be limited by capabilities available within Oozie.
> > > >
> > > > Please do note that this suggestion is restricted to the scheduling and
> > > > not to the workflow execution.
> > > >
> > > > Would like to hear from fellow developers and users on what your thoughts
> > > > are. Please do chime in with your views.
> > > >
> > > > Regards
> > > > Srikanth Sundarrajan
> > > >
> >
> >
> >
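
For illustration only, here is a rough sketch of the hard-to-express cases cited in the thread (first Monday of the month, closing day of the month, a fixed day of every month, and "at least N instances in a day"). It is not Falcon or Oozie code; the function names `first_monday`, `closing_day`, `fixed_day` and `gate_open` are hypothetical and use only the Python standard library.

```python
import calendar
from datetime import date, timedelta
from typing import Iterable


def first_monday(year: int, month: int) -> date:
    """Return the first Monday of the given month (hypothetical helper)."""
    first = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so the offset is 0..6 days forward.
    offset = (calendar.MONDAY - first.weekday()) % 7
    return first + timedelta(days=offset)


def closing_day(year: int, month: int) -> date:
    """Return the last calendar day of the given month (hypothetical helper)."""
    _, days_in_month = calendar.monthrange(year, month)
    return date(year, month, days_in_month)


def fixed_day(year: int, month: int, day: int = 7) -> date:
    """Return a fixed day of the month, e.g. the 7th (hypothetical helper)."""
    return date(year, month, day)


def gate_open(available_hours: Iterable[int], minimum: int, expected: int = 24) -> bool:
    """Return True once at least `minimum` distinct hourly instances exist.

    Any `minimum` of the `expected` instances will do, which differs from
    declaring instances 1..minimum mandatory and the rest optional.
    """
    valid = {h for h in available_hours if 0 <= h < expected}
    return len(valid) >= minimum


if __name__ == "__main__":
    # Consecutive runs on the 7th are not evenly spaced.
    print(fixed_day(2015, 2) - fixed_day(2015, 1))
    print(fixed_day(2015, 3) - fixed_day(2015, 2))
    print(first_monday(2015, 1), closing_day(2015, 2))
    print(gate_open([0, 3, 5, 8, 9, 12, 15, 18, 20, 23], minimum=10))
```

The gap between consecutive runs on the 7th varies with the length of the month, which is the non-uniform spacing noted in the thread and the reason a single fixed frequency cannot describe such a schedule.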
 		 	   		  