falcon-dev mailing list archives

From Seetharam Venkatesh <venkat...@innerzeal.com>
Subject Re: [DISCUSS] Orchestration in Falcon
Date Wed, 24 Dec 2014 00:44:06 GMT
Chugging along with Oozie is bad for Falcon in the long run, for users and
developers. It's horribly complex to work through the many architectural
rough edges in Oozie. Look at all the patches for security that I had
to fix around Oozie. It's unnecessarily complex, non-uniform, and is NOT
meant to be used by another tool like Falcon; it was built around the end user.

This is a good discussion to have - maybe explore Oozie for the short term but
look at alternative solutions for the long term.

On Tue, Dec 23, 2014 at 7:28 AM, Srikanth Sundarrajan <sriksun@hotmail.com>
wrote:

> @jb, There is no doubt merit in mapping them to oozie if possible and if
> the extensions are simple and straightforward enough.
>
> Also had a quick chat offline with Shwetha and she mentioned some
> work happening in Oozie in this regard. On further digging, I found
> https://issues.apache.org/jira/browse/OOZIE-1976. This is possibly what
> Shwetha was referring to. From the looks of it, this tries to address item
> #7 in the original thread. Maybe there are more jiras where additional
> work such as aperiodic datasets is being worked on. Perhaps @Shwetha can
> shed some light on what is being considered and/or how these
> gating/orchestration use cases can be managed.
>
> Regards
> Srikanth Sundarrajan
>
> > Date: Tue, 23 Dec 2014 11:06:24 +0100
> > From: jb@nanthrax.net
> > To: dev@falcon.incubator.apache.org
> > Subject: Re: [DISCUSS] Orchestration in Falcon
> >
> > Hi all,
> >
> > I second Shwetha there. I think we can achieve such features in Oozie
> > (with some adaptations).
> >
> > Regards
> > JB
> >
> > Le 2014-12-23 10:53, Shwetha G S a écrit :
> > > If we can get rid of oozie entirely, yes we can explore other
> > > possibilities. But if we are still going to use oozie for DAG
> > > execution, we are going to add another bottleneck in the whole
> > > execution (currently, falcon is not in the workflow execution path)
> > > and I don't think it's worth it.
> > >
> > > The features that are outlined above are all available in basic forms
> > > in oozie and it should be easy to enhance them/make them extension
> > > points.
> > >
> > >
> > >
> > > -Shwetha
> > >
> > > On Tue, Dec 23, 2014 at 8:12 AM, Srikanth Sundarrajan
> > > <sriksun@hotmail.com>
> > > wrote:
> > >
> > >> Here are a few more gaps that we ought to solve for while we are on
> > >> the subject:
> > >>
> > >> 1. Ability to attach to start & finish events of workflow execution.
> > >> Currently we have a post-processing hook to listen to finish events,
> > >> but we do run into scenarios where there are occasional failures with
> > >> post-processing and there is potential phase lag in learning about
> > >> the events.
> > >> 2. Strict enforcement of concurrency control, possibly spanning
> > >> process boundaries.
> > >> 3. Ability to tune how backlogs have to be caught up (old instances
> > >> to be given higher priority, newer instances to be given higher
> > >> priority, or some sort of weights to allow both to make progress at
> > >> varying rates). There have been asks from users for routing current
> > >> vs older instances to different queues as an alternative.
> > >> 4. Ability to have a notion of non-time-based feed instances and
> > >> related coordination.
> > >> 5. Currently keeping track of and managing SLAs is also a challenge,
> > >> but with #1 addressed, this might be a lesser concern.
> > >>
> > >> Regards
> > >> Srikanth Sundarrajan
> > >>
> > >> > Subject: Re: [DISCUSS] Orchestration in Falcon
> > >> > From: sriksun@hotmail.com
> > >> > Date: Tue, 23 Dec 2014 06:30:30 +0530
> > >> > To: dev@falcon.incubator.apache.org
> > >> >
> > >> > @venkatesh, the question really is how do we enable these gating
> > >> > preconditions. It seems hard enough to add them to oozie, but I am
> > >> > not familiar enough with oozie to comment on how hard or easy it
> > >> > is. Like I responded to @ajay on the same thread, if we are to do
> > >> > away with coordination through oozie, we can follow up this
> > >> > discussion with approaches and design. Though I had quartz in
> > >> > mind, I wanted to leave that out of the discussion to see if there
> > >> > is consensus for moving away from oozie coords and implementing
> > >> > them through other means.
> > >> >
> > >> > Sent from my iPhone
> > >> >
> > >> > > On 23-Dec-2014, at 1:16 am, "Seetharam Venkatesh" <venkatesh@innerzeal.com> wrote:
> > >> > >
> > >> > > What is the purpose of this decoupling? Why build this into
> > >> > > Falcon? Scheduling is so common that schedulers are a dime a
> > >> > > dozen today and they are all extensible with custom triggers.
> > >> > > Making it part of Falcon will suffer the same issues that Oozie
> > >> > > has today.
> > >> > >
> > >> > > I'm sorry but I'm a HUGE -1 to this being built into the Falcon
> > >> > > codebase.
> > >> > >
> > >> > > However, I'm +1 to reusing the Quartz scheduler that already
> > >> > > exists - stand it up outside or embed it like we do for ActiveMQ.
> > >> > >
> > >> > > Phase 2 - I'd like to see us write a simple DAG execution layer
> > >> > > in YARN as an app master, without a DB, that keeps state on HDFS,
> > >> > > as an alternative to Oozie.
> > >> > >
> > >> > > Then we will have a nimble falcon which can kick ass.
> > >> > >
> > >> > >
> > >> > > On Sun, Dec 21, 2014 at 6:13 AM, Srikanth Sundarrajan <sriksun@hotmail.com> wrote:
> > >> > >
> > >> > >> Hello Team,
> > >> > >>
> > >> > >> Since its inception Falcon has used Oozie for process
> > >> > >> orchestration as well as feed life cycle phase executions. While
> > >> > >> this has worked reasonably and allowed us to make higher level
> > >> > >> capabilities available through Falcon, we are increasingly
> > >> > >> seeing scenarios where this is proving to be a limiting factor.
> > >> > >> In its current form, Falcon relies on Oozie for both scheduling
> > >> > >> and workflow execution, due to which the scheduling is limited
> > >> > >> to time-based/cron-based scheduling with additional gating
> > >> > >> conditions on data availability. This also imposes the
> > >> > >> restriction that datasets be periodic/cyclic in nature.
> > >> > >>
> > >> > >> From an orchestration standpoint, it would help if we can
> > >> > >> support standard gating / scheduling primitives via Falcon:
> > >> > >>
> > >> > >> 1. Simple periodic scheduling with no gating conditions
> > >> > >> 2. Cron based scheduling (day of week, day of the month,
> > >> > >> specific hours and non-periodic) with no gating conditions
> > >> > >> 3. Availability of new data (assuming monotonically increasing
> > >> > >> data version, availability of new versions)
> > >> > >> 4. Changes to existing data (reinstatement - similar to late
> > >> > >> data handling)
> > >> > >> 5. External trigger/notifications
> > >> > >> 6. Availability of specific instances of data declared as a
> > >> > >> mandatory dependency
> > >> > >> 7. Availability of a minimum subset of instances of data
> > >> > >> declared as a mandatory dependency (at least 10 hourly instances
> > >> > >> of a day with 24 instances, for example)
> > >> > >> 8. Valid combinations of the above.
> > >> > >>
> > >> > >> In this context, I would like to propose that we move away from
> > >> > >> Oozie for the orchestration requirements and have them
> > >> > >> implemented natively within Falcon. It will no doubt make the
> > >> > >> Falcon server bulkier and heavier in both code and deployment,
> > >> > >> but it seems that without it, the orchestration within Falcon
> > >> > >> will be limited by the capabilities available within Oozie.
> > >> > >>
> > >> > >> Please do note that this suggestion is restricted to the
> > >> > >> scheduling and not to the workflow execution.
> > >> > >>
> > >> > >> Would like to hear from fellow developers and users on what
> > >> > >> your thoughts are. Please do chime in with your views.
> > >> > >>
> > >> > >> Regards
> > >> > >> Srikanth Sundarrajan
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Regards,
> > >> > > Venkatesh
> > >> > >
> > >> > > “Perfection (in design) is achieved not when there is nothing
> more to
> > >> add,
> > >> > > but rather when there is nothing more to take away.”
> > >> > > - Antoine de Saint-Exupéry
> > >>
> > >>
>
>



-- 
Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.”
- Antoine de Saint-Exupéry
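
For illustration only, the minimum-subset gating condition from item 7 of the original proposal (at least 10 hourly instances out of a day's 24) can be sketched as below. This is a rough sketch and not code from Falcon, Oozie, or Quartz; the names `hourly_instances` and `gate_is_open` and the idea of passing availability in as a set of timestamps are assumptions made for the example.

```python
from datetime import datetime, timedelta


def hourly_instances(day):
    """Return the 24 hourly instance times expected for the given day."""
    start = datetime(day.year, day.month, day.day)
    return [start + timedelta(hours=h) for h in range(24)]


def gate_is_open(day, available, min_instances=10):
    """Return True when enough of the day's hourly instances are available.

    `available` is a set of datetimes for instances known to exist
    (hypothetical input; a real scheduler would query the data store).
    """
    expected = hourly_instances(day)
    present = sum(1 for t in expected if t in available)
    return present >= min_instances


if __name__ == "__main__":
    day = datetime(2014, 12, 23)
    # Pretend only the first 10 hourly instances have landed so far.
    landed = set(hourly_instances(day)[:10])
    print(gate_is_open(day, landed))
```

A combination of conditions (item 8) could then be expressed by combining several such predicates with `and`/`or` before a run is triggered.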
