airflow-dev mailing list archives

From Rob Froetscher <rfroetsc...@lumoslabs.com>
Subject Re: What is the best way to retry an entire DAG instead of just a single task?
Date Wed, 26 Oct 2016 20:38:31 GMT
Sure, I'd be happy to.

On Wed, Oct 26, 2016 at 1:13 AM, siddharth anand <sanand@apache.org> wrote:

> Rob,
> Would you mind writing up a blog about this and sharing it with the list?
> To be honest, many of the committers themselves have little to no
> experience with subdags. It's an area few people are familiar with.
>
> -s
>
> On Mon, Oct 24, 2016 at 11:08 AM, Rob Froetscher <
> rfroetscher@lumoslabs.com>
> wrote:
>
> > Just following up here: we have found that SubDags can be another way
> > to do this.
> >
> > We have had a few problems due to SubDags using the backfill execution
> > style (if there aren't workers available immediately to pick up these
> > tasks, then the run counts as "deadlocked").
> >
> > However, for the most part, SubDags are an effective way to group tasks
> > into chunks of behavior that should themselves be treated as a single
> > task, with retries, etc.
> >
> > On Tue, Aug 2, 2016 at 8:37 PM, siddharth anand <sanand@apache.org>
> wrote:
> >
> > > Hi Arthur,
> > > It's not a dumb question. To the best of my knowledge, we don't have
> > > the ability to retry a DAG based on a task failure in a programmatic
> > > way. Also, we don't allow cyclic dependencies; hence the Acyclic part
> > > of DAG.
> > >
> > > TriggerDagRunOperator won't work because the execution is async. The
> > > TDRO will return as soon as it inserts an entry into the DagRun table.
> > > The scheduler will then read the DagRun entry and start scheduling
> > > tasks in your DAG. The controller DAG does not know about the success
> > > or failure of the target DAG.
> > >
> > > The simplest solution is to wrap all of this into a PythonOperator's
> > > callable, so that the checking and restarting of the job happens in
> > > good old Python code. You would not leverage Airflow's DAG executors
> > > for this.
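The "wrap it in a PythonOperator callable" suggestion above can be sketched in plain Python, with no Airflow specifics. The `start_job` and `job_succeeded` callables below are hypothetical stand-ins for whatever launches and senses the external job; the loop as a whole would become the `python_callable` of a PythonOperator, so that Airflow sees start + sense + retry as a single task.

```python
import time

def run_job_with_retries(start_job, job_succeeded, max_attempts=3, poll_seconds=0):
    """Start the job, check it, and restart it on failure.

    start_job     : callable that launches the external job (hypothetical)
    job_succeeded : callable returning True once the job has succeeded
    Returns the number of attempts it took; raises if all attempts fail.
    """
    for attempt in range(1, max_attempts + 1):
        start_job()                    # (re)start the job on each attempt
        if poll_seconds:
            time.sleep(poll_seconds)   # give the job time before sensing
        if job_succeeded():            # the "sensor" step
            return attempt
    raise RuntimeError("job failed after %d attempts" % max_attempts)
```

Because the retrying happens inside a single callable, a sensor failure restarts the whole job rather than just re-running the sensor, which is exactly what the original question asked for.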
> > >
> > > The harder solution is to implement this feature in Airflow itself:
> > > essentially, a new task-level parameter called
> > > retry_dag_on_task_failure.
> > >
> > > -s
> > >
> > > On Tue, Aug 2, 2016 at 5:50 PM, Wang Yajun <kwin.wang@gmail.com>
> > > wrote:
> > >
> > > > Arthur Purvis
> > > >
> > > > You can try:
> > > >
> > > > 1. airflow backfill -s start_date -e end_date your_DAG
> > > > 2. airflow trigger_dag your_DAG
> > > >
> > > > You can find the details in the official documentation.
> > > >
> > > > Hope this helps.
> > > > Arthur Purvis <apurvis@lumoslabs.com> wrote on Tue, Aug 2, 2016 at
> > > > 10:55 PM:
> > > >
> > > > > Apologies if this is a dumb question, but I'm looking for a way
> > > > > to retry an entire DAG if a single task fails, rather than retry
> > > > > just that task.
> > > > >
> > > > > The context is that of a job starter + sensor, and if the sensor
> > > > > fails it means the job needs to be restarted, not just
> > > > > re-sensored.
> > > > >
> > > > > From reading the documentation it seems like TriggerDagRunOperator
> > > > > could be cajoled into doing this, but it feels a little... clunky.
> > > > > Is this indeed the best way to restart an entire DAG when one of
> > > > > the tasks fails?
> > > > >
> > > > > Thanks.
> > > > >
> > > >
> > >
> >
>
