falcon-dev mailing list archives

From Srikanth Sundarrajan <srik...@hotmail.com>
Subject RE: [DISCUSS] Orchestration in Falcon
Date Mon, 22 Dec 2014 02:40:13 GMT



@Shaik, items 1 & 6 are solved cleanly by Oozie's scheduling capabilities (and are in fact
the only ones solved). I am not too sure that 2 and 7 are solved well.

To cite a few examples for 2 & 7 that are hard to achieve:

1. First Monday of the month
2. Closing day of the month
3. Fixed day of every month (say 7th) {key thing to note here is the non-uniform spacing in
the monthly cycle}
4. At least 6 instances in a day (which is different from saying 1-6 are mandatory and rest
are optional)
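
To make the four examples above concrete, here is a rough Python sketch of how each
could be evaluated. This is only an illustration under my own assumptions: the function
names are hypothetical, and none of this is existing Falcon or Oozie code.

```python
import calendar
from datetime import date


def first_monday(year, month):
    # monthcalendar() returns weeks that start on Monday by default;
    # a 0 marks a day that belongs to the neighbouring month.
    weeks = calendar.monthcalendar(year, month)
    day = weeks[0][calendar.MONDAY] or weeks[1][calendar.MONDAY]
    return date(year, month, day)


def closing_day(year, month):
    # monthrange() returns (weekday of the 1st, number of days in the month).
    return date(year, month, calendar.monthrange(year, month)[1])


def fixed_day(year, month, day=7):
    # Same day number every month; the gap between runs is not uniform.
    return date(year, month, day)


def min_instances_available(available_hours, minimum=6):
    # Hypothetical gate: any `minimum` distinct hourly instances of the day
    # are enough, not a specific set of mandatory ones.
    return len(set(available_hours)) >= minimum


# Non-uniform spacing of a fixed monthly day: 31 days, then 28 days.
gaps = [
    (fixed_day(2015, 2) - fixed_day(2015, 1)).days,
    (fixed_day(2015, 3) - fixed_day(2015, 2)).days,
]
```

The point of the last two lines is that no single fixed frequency reproduces a monthly
cycle, and the instance-count gate is a condition on how many instances exist rather
than on which ones.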

@Venkatesh has been pushing to do at least items 1-3 above for eternity.

There is an additional motivation: by delegating control to Oozie for orchestration, Falcon
finds it hard to manage and control pipeline executions. I will try to enumerate those
difficulties shortly on this same thread for us to ponder over.

Regards
Srikanth Sundarrajan

> From: psychidris@gmail.com
> Date: Mon, 22 Dec 2014 01:52:04 +0530
> Subject: Re: [DISCUSS] Orchestration in Falcon
> To: dev@falcon.incubator.apache.org
> 
> Hi,
> 
> I agree that Oozie has certain limitations with respect to orchestration;
> recently Oozie users have raised similar concerns regarding Point 2 (which
> Falcon takes care of by extending Oozie EL, not for scheduling but at
> least for consuming the input set).
> 
> Aren't Points 1, 2, 6 and 7 already solved by the optional input
> mechanism in Falcon? I understand that users still need to specify a
> frequency for the process. A few use cases/examples would really help.
> 
> Thanks,
> -Idris
> 
> 
> On Sun, Dec 21, 2014 at 7:43 PM, Srikanth Sundarrajan <sriksun@hotmail.com>
> wrote:
> 
> > Hello Team,
> >
> > Since its inception Falcon has used Oozie for process orchestration as
> > well as feed life cycle phase executions. While this has worked reasonably
> > well and allowed us to make higher level capabilities available through
> > Falcon, we are increasingly seeing scenarios where this is proving to be a
> > limiting factor. In its current form, Falcon relies on Oozie for both
> > scheduling and workflow execution, due to which the scheduling is limited
> > to time based/cron based scheduling with additional gating conditions on
> > data availability. This also imposes the restriction that datasets be
> > periodic/cyclic in nature.
> >
> > From an orchestration stand point, it would help if we can support
> > standard gating / scheduling primitives via Falcon:
> >
> > 1. Simple periodic scheduling with no gating conditions
> > 2. Cron based scheduling (day of week, day of the month, specific hours
> > and non-periodic) with no gating conditions
> > 3. Availability of new data (assuming monotonically increasing data
> > version, availability of new versions)
> > 4. Changes to existing data (reinstatement - similar to late data handling)
> > 5. External trigger/notifications
> > 6. Availability of specific instances of data as declared as mandatory
> > dependency
> > 7. Availability of a minimum subset of instances of data declared as
> > mandatory dependency (for example, at least 10 hourly instances of a day
> > with 24 instances)
> > 8. Valid combinations of the above.
> >
> > In this context, I would like to propose that we move away from Oozie for
> > the orchestration requirements and have them implemented natively within
> > Falcon. It will no doubt make the Falcon server bulkier and heavier in both
> > code and deployment, but it seems that without it, the orchestration within
> > Falcon will be limited by the capabilities available within Oozie.
> >
> > Please do note that this suggestion is restricted to the scheduling and
> > not to the workflow execution.
> >
> > Would like to hear from fellow developers and users on what your thoughts
> > are. Please do chime in with your views.
> >
> > Regards
> > Srikanth Sundarrajan
> >

 		 	   		  