airflow-dev mailing list archives

From George Leslie-Waksman <waks...@gmail.com>
Subject Re: error handling flow in DAG
Date Wed, 10 Oct 2018 17:57:18 GMT
This is a great use case for the all_success and one_failed trigger rules.

If we let "--S-->" denote a dependency where the downstream task has
the all_success trigger rule, and "--F-->" one where the downstream
task has one_failed, you can do what you want with a DAG of the form:

task_1 --F--> task_1_failure_a --?--> task_1_failure_b
  \--S--> task_2
               \--S--> task_3 --F--> task_3_failure_a --?--> task_3_failure_b

(please pardon the mediocre ASCII diagram and I hope it made it
through the wire correctly)
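For concreteness, the semantics of the two trigger rules above can be modeled in plain Python (a toy evaluator, not Airflow itself; the task names are taken from the diagram):

```python
# Toy model of Airflow trigger rules: a task runs when its trigger rule
# is satisfied by the final states of its upstream tasks.

def all_success(upstream_states):
    # Airflow's default rule: fire only if every upstream task succeeded.
    return all(s == "success" for s in upstream_states)

def one_failed(upstream_states):
    # Fire as soon as at least one upstream task has failed.
    return any(s == "failed" for s in upstream_states)

def simulate(task_1_state, task_2_state="success", task_3_state="success"):
    """Return the set of downstream tasks that run in the diagrammed DAG."""
    ran = set()
    if one_failed([task_1_state]):
        ran.add("task_1_failure_a")          # --F--> branch after task_1
    if all_success([task_1_state]):
        ran.add("task_2")
        if all_success([task_2_state]):
            ran.add("task_3")
            if one_failed([task_3_state]):
                ran.add("task_3_failure_a")  # --F--> branch after task_3
    return ran

# If task_1 fails, only its failure branch runs:
print(sorted(simulate("failed")))                      # ['task_1_failure_a']
# If everything succeeds, the failure branches stay idle:
print(sorted(simulate("success")))                     # ['task_2', 'task_3']
# If task_3 fails, its own failure branch fires:
print(sorted(simulate("success", "success", "failed")))
# ['task_2', 'task_3', 'task_3_failure_a']
```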

--George


On Mon, Oct 8, 2018 at 12:14 PM James Meickle
<jmeickle@quantopian.com.invalid> wrote:
>
> Anthony:
>
> Could you just declare the "success" path with "all_success" (the
> default), and the "failure" side branches with "all_failed" on the
> preceding task? This gives the same branching structure you want but
> with fewer intermediary operators.
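James's shortcut works because, for a failure branch hanging off a single upstream task, all_failed and one_failed coincide; a quick plain-Python check of that claim (toy functions, not Airflow's implementation):

```python
# Toy versions of the two trigger rules, evaluated over upstream states.

def all_failed(upstream_states):
    # Fires only when every upstream task failed.
    return all(s == "failed" for s in upstream_states)

def one_failed(upstream_states):
    # Fires when at least one upstream task failed.
    return any(s == "failed" for s in upstream_states)

# With a single upstream task the two rules always agree:
for state in ("success", "failed"):
    assert all_failed([state]) == one_failed([state])

# With multiple upstream tasks they differ:
print(all_failed(["failed", "success"]))  # False
print(one_failed(["failed", "success"]))  # True
```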
>
> -James M.
>
> On Mon, Oct 1, 2018 at 1:12 PM Anthony Brown <anthony.brown@johnlewis.co.uk>
> wrote:
>
> > Hi
> >    I am coding various data flows, and one of the requirements we have is
> > to run some error tasks when certain tasks fail. These error tasks are
> > specific to the task that failed and are not generic to the whole DAG
> >
> >    So for instance if I have a DAG that runs the following tasks
> >
> > task_1 ----> task_2 ----> task_3
> >
> >    If task_1 fails, then I want to run
> >
> > task_1_failure_a ---> task_1_failure_b
> >
> >    If task_2 fails, I do not need to do anything specific, but if task_3
> > fails, I need to run
> >
> > task_3_failure_a ---> task_3_failure_b
> >
> >    I already have a generic on_failure_callback defined on all tasks that
> > handles alerting, but I am stuck on the best way to handle a failure flow
> > for individual tasks
> >
> >    The ways I have come up with to handle this are:
> >
> > Have a branch operator between each task with trigger_rule set to all_done.
> > The branch operator would then decide whether to continue to the next
> > (success) task or to go down the failure branch
> >
> > Put the failure tasks in a separate DAG with no schedule. Have a different
> > on_failure_callback for each task that triggers the failure DAG for that
> > task and then does my generic error handling
> >
> >    Does anybody have any thoughts on which of the above two approaches
> > would be best, or can anyone suggest an alternative way of doing this?
> >
> > Thanks
> >
> > --
> >
> > Anthony Brown
> > Data Engineer BI Team - John Lewis
> > Tel : 0787 215 7305
