airflow-dev mailing list archives

From Daniel Standish <dpstand...@gmail.com>
Subject Re: DAG "Schedule Filter Callback"?
Date Fri, 30 Aug 2019 19:11:58 GMT
>
> Making "prev" and "next" variables useless.


With this approach, your "working" dag should not use prev_ or next_.  It
has two options for determining what it's supposed to do: use
execution_date if that's enough, or use dag_run.conf otherwise.
The python callable that drives the trigger dag can return a payload that
gets passed to dag_run.conf, and dag_run.conf can be referenced in your
working dag in a templated field.  So you can get arbitrary information into
your triggered dag, e.g. "from_date" and "to_date".  Or a third option, I
guess: xcom.
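
A rough sketch of what such a callable could look like, assuming the Airflow
1.10-style TriggerDagRunOperator signature where python_callable receives
(context, dag_run_obj) and sets dag_run_obj.payload.  The DagRunOrder class
here is only a stand-in so the snippet runs without Airflow, and the
"Monday covers Fri-Sun" rule and the name build_conf are made up for
illustration:

```python
from datetime import datetime, timedelta


class DagRunOrder:
    """Stand-in for the placeholder object passed to the callable (assumption)."""

    def __init__(self, run_id=None, payload=None):
        self.run_id = run_id
        self.payload = payload


def build_conf(context, dag_run_obj):
    """Attach the date range the working dag should process.

    Hypothetical rule: a Monday run covers the preceding Friday through
    Sunday; any other run covers just the previous day.
    """
    run_date = context["execution_date"].date()
    if run_date.weekday() == 0:  # Monday
        from_date = run_date - timedelta(days=3)
    else:
        from_date = run_date - timedelta(days=1)
    to_date = run_date - timedelta(days=1)

    # The payload is what ends up in dag_run.conf of the triggered dag.
    dag_run_obj.payload = {
        "from_date": from_date.isoformat(),
        "to_date": to_date.isoformat(),
    }
    return dag_run_obj


# Stand-alone check with a plain datetime in place of the real context.
order = build_conf({"execution_date": datetime(2019, 9, 2)}, DagRunOrder())
print(order.payload)
```

In the working dag, a templated field could then read something like
{{ dag_run.conf["from_date"] }} instead of relying on prev_ds / next_ds.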


depends_on_past

🤷‍♀️ Maybe this is an issue.  Maybe there should be an option for triggered
dags to respect this param.  Your "trigger" dag could respect it though.
And it sounds like in your case that could be enough -- each trigger
dag run would trigger no more than 1 working dag run.
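
To make that concrete, here is a hedged sketch of the trigger side deciding
whether to fire at all.  It again assumes the Airflow 1.10-style callable
that returns the placeholder object to create a run and returns nothing to
skip it.  HOLIDAYS, is_good_date and maybe_trigger are hypothetical names,
and the holiday set is just an example value:

```python
from datetime import date, datetime

# Hypothetical holiday calendar; in practice this would come from wherever
# your real (possibly changing) calendar lives.
HOLIDAYS = {date(2019, 9, 2)}


def is_good_date(d):
    """True on weekdays that are not listed as holidays."""
    return d.weekday() < 5 and d not in HOLIDAYS


def maybe_trigger(context, dag_run_obj):
    """Return the placeholder to trigger one working run, or None to skip."""
    if not is_good_date(context["execution_date"].date()):
        return None  # no working dag run is created for this date
    return dag_run_obj


# Stand-alone check: any object can play the placeholder here.
placeholder = object()
print(maybe_trigger({"execution_date": datetime(2019, 9, 3)}, placeholder) is placeholder)
```

The trigger dag keeps a plain daily schedule (so depends_on_past behaves as
usual there), and each of its runs creates at most one working dag run.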

Anyway just offering it up in case this approach was not on your radar.




On Fri, Aug 30, 2019 at 11:54 AM Shaw, Damian P. <
damian.shaw.2@credit-suisse.com> wrote:

> I believe TriggerDagRunOperator solves neither 1 nor 2.
>
> For 1) The "depends_on_past" logic seems tenuous when DAGs are triggered
> like this, but I could be wrong?
>
> For 2) Many of the tasks still need to know what the next or previous
> execution date is. As I understand it, the TriggerDagRunOperator creates a
> DAG Run with the "external_trigger" flag, which forces
> prev_execution_date and next_execution_date to be the same as the execution
> date, as per this line of code:
>
> https://github.com/apache/airflow/blob/7a59358ffde269701af2121246ac54f1a5cbe785/airflow/models/taskinstance.py#L1129
> .
>
> Making "prev" and "next" variables useless.
>
> Damian
>
> -----Original Message-----
> From: Daniel Standish [mailto:dpstandish@gmail.com]
> Sent: Friday, August 30, 2019 2:43 PM
> To: dev@airflow.apache.org
> Subject: Re: DAG "Schedule Filter Callback"?
>
> Have you considered using TriggerDagRunOperator?
>
> One way to deal with this kind of thing is to have two dags:
>
>    - "working dag" - This dag does the work. Its behavior is governed by
>    execution_date / dag_run.conf.
>    - "trigger dag" - This dag just triggers the "working" dag, with
>    appropriate execution_date / conf, under the appropriate circumstances.
>
> This lets you separate the convoluted scheduling logic from the actual work
> to be done.
>
> So e.g. on a Monday you could trigger 3 dag runs: one for Friday, one for
> Sat, one for Sun.  Or you could trigger 1 with a dag conf that specifies
> which time range to handle.
>
>
>
> On Fri, Aug 30, 2019 at 11:16 AM Shaw, Damian P. <
> damian.shaw.2@credit-suisse.com> wrote:
>
> > My proposal is to have it at the DAG level rather than the operator level,
> > as it means you don't have to deal with "skipped" behavior at all: the
> > DAG Run for a date you don't want it to be scheduled on simply does not
> > exist, in the same way that if you currently cron schedule for Monday to
> > Friday, the Saturday and Sunday DAG Runs do not exist.
> >
> > Therefore the fundamental behavior of the "next" and "prev" macros remains
> > the same: they give the next execution date or the prev execution date,
> > and there is no need to worry about "skipped" vs not-"skipped".
> >
> > In the financial world some schedules are simply not deterministic:
> > holiday dates get announced and changed by governments over time,
> > sometimes at very short notice. I agree this should have a
> > warning though.
> >
> > Damian
> >
> > -----Original Message-----
> > From: Kaxil Naik [mailto:kaxilnaik@gmail.com]
> > Sent: Friday, August 30, 2019 2:06 PM
> > To: dev@airflow.apache.org
> > Subject: Re: DAG "Schedule Filter Callback"?
> >
> > We can have a flag `depends_on_past_allow_skipped_state` or something
> > similar that can take care of your 1st issue.
> >
> > On Fri, Aug 30, 2019 at 6:17 PM Shaw, Damian P. <
> > damian.shaw.2@credit-suisse.com> wrote:
> >
> > > Hi all,
> > >
> > > After discussion at the NY Meetup this week I've been pondering how
> > > Airflow could support custom schedules with very little change to core
> > > Airflow logic and keeping backwards compatibility.
> > >
> > > > As I understand it, the common way to support custom schedules is
> > > > through a BranchOperator. You provide logic that on a good date
> > > > executes the "run" branch and on another date runs the "don't run"
> > > > branch, which usually is a dummy operator.
> > >
> > > > There are 2 problems associated with it which would be useful to me
> > > > (and I think the rest of the community) to solve:
> > >
> > > 1.       depends_on_past does not play well with branching, because the
> > > "run" branch tasks get marked as "skipped"
> > >
> > > 2.       Template variables like "prev_ds" and "next_ds" represent the
> > > underlying schedule and not the actual schedule you are working on
> > >
> > > > I therefore propose a "schedule_filter_callback", a function which you
> > > > provide at DAG creation time that takes in some arguments (execution
> > > > date, timezone, DAG?) and returns a Truthy or Falsy result based on
> > > > whether this is a good date to execute on. If schedule_filter_callback
> > > > is None then the current schedule logic is applied.
> > >
> > > > I appreciate this is a fairly significant proposal, but it seems like
> > > > because it would just be 1 extra argument on the DAG and make no change
> > > > to the default behavior it doesn't quite rise to the level of AIP? Sorry
> > > > if this has already been discussed before.
> > >
> > > Regards,
> > > Damian
> > >
> > >
> > >
> > > > ===============================================================================
> > > >
> > > > Please access the attached hyperlink for an important electronic
> > > > communications disclaimer:
> > > > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> > > >
> > > > ===============================================================================
> > >
> > >
> >
> >
> >
> >
> ===============================================================================
> >
> > Please access the attached hyperlink for an important electronic
> > communications disclaimer:
> > http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> >
> ===============================================================================
> >
> >
>
>
>
> ===============================================================================
>
> Please access the attached hyperlink for an important electronic
> communications disclaimer:
> http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
> ===============================================================================
>
>
