airflow-dev mailing list archives

From Rob Froetscher <rfroetsc...@lumoslabs.com>
Subject Re: What is the best way to retry an entire DAG instead of just a single task?
Date Mon, 24 Oct 2016 18:08:08 GMT
Just following up here: we have found that SubDags can be another way
to do this.

We have had a few problems due to SubDags using the backfill execution
style (if there aren't workers available immediately to pick up these tasks
then it counts as "deadlocked").

However, for the most part, SubDags are an effective way to group tasks
into chunks of behavior that should themselves be treated as a single task,
with retries, etc.

On Tue, Aug 2, 2016 at 8:37 PM, siddharth anand <sanand@apache.org> wrote:

> Hi Arthur,
> It's not a dumb question. We don't have the ability to retry a DAG based on
> a task failure in a programmatic way, to the best of my knowledge. Also, we
> don't allow cyclic dependencies, hence the Acyclic part of DAG.
>
> TriggerDagRunOperator won't work because the execution is async. The TDRO
> will return as soon as it inserts an entry into the DagRun table.  The
> scheduler will then read the DagRun entry and start scheduling tasks in
> your DAG. The controller DAG does not know about the success or failure of
> the target DAG.
>
> The simplest solution is to wrap all of this into a PythonOperator's
> callable, so that the checking and restarting of the job is happening in
> good old python code. You would not leverage Airflow's DAG executors for
> this.
>
> The harder solution is to implement this feature in some way in Airflow.
> Essentially, a new task-level parameter called retry_dag_on_task_failure.
>
> -s
>
> On Tue, Aug 2, 2016 at 5:50 PM, Wang Yajun <kwin.wang@gmail.com> wrote:
>
> > Arthur Purvis
> >
> > You can try:
> >
> > 1. airflow backfill -s start_date -e end_date your_DAG
> > 2. airflow trigger_dag your_DAG
> >
> > You can find the details in the official documentation.
> >
> > Hope this helps.
> > On Tue, Aug 2, 2016 at 10:55 PM, Arthur Purvis <apurvis@lumoslabs.com> wrote:
> >
> > > Apologies if this is a dumb question, but I'm looking for a way to retry
> > > an entire DAG if a single task fails, rather than retry just that task.
> > >
> > > The context is that of a job starter + sensor, and if the sensor fails it
> > > means the job needs to be restarted, not just re-sensored.
> > >
> > > From reading the documentation it seems like TriggerDagRunOperator could
> > > be cajoled into doing this, but it feels a little... clunky.  Is this
> > > indeed the best way to restart an entire DAG when one of the tasks fails?
> > >
> > > Thanks.
> > >
> >
>
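
For readers of the archive, the "wrap it all in a PythonOperator's callable" suggestion quoted above might look roughly like the sketch below. It is only an illustration under stated assumptions: run_with_restarts, start_job, check_job, and the "success"/"failed" status strings are hypothetical names that do not appear in the thread, and the real job-starting and status-checking logic would have to be supplied by the user. The function itself is plain Python and does not import Airflow.

```python
import time


def run_with_restarts(start_job, check_job, max_attempts=3,
                      max_polls=10, poll_seconds=0.0):
    """Start a job and poll it; if it fails, restart the whole job.

    start_job and check_job are hypothetical, user-supplied callables:
      start_job()       -> returns an identifier for the started job
      check_job(job_id) -> returns "success", "failed", or anything
                           else to mean "still running"
    Returns the number of the attempt that succeeded (1-based).
    """
    for attempt in range(1, max_attempts + 1):
        job_id = start_job()  # (re)start the job from scratch
        for _ in range(max_polls):
            status = check_job(job_id)
            if status == "success":
                return attempt
            if status == "failed":
                break  # stop polling this job and restart it
            time.sleep(poll_seconds)
        # Reaching this point means the job failed or polling ran out,
        # so the outer loop starts a fresh job (if attempts remain).
    raise RuntimeError(
        "job did not succeed after %d attempts" % max_attempts
    )
```

A function like this could then be passed as the python_callable of a single PythonOperator task, so that the start, check, and restart cycle lives inside one task. As noted in the quoted reply, the trade-off is that Airflow's own scheduling and retry machinery is not used for the individual steps.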
