airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ben Tallman <...@apigee.com>
Subject Re: dagrun_timeout not honoured?
Date Tue, 21 Jun 2016 23:53:21 GMT
Maxime -

Wish I did have time... BUT, I can say that SLA timeouts will fail (and
error) on any dag with schedule=None because they depend on the Now() being
less than the next scheduled date (None)...

dttm = dag.following_schedule(dttm)
while dttm < datetime.now():

I will file a bug...


On Tue, Jun 21, 2016 at 11:26 AM Maxime Beauchemin <
maximebeauchemin@gmail.com> wrote:

> A tangent here: for people who have the knowledge (and a bit of time on
> their hand), providing a failing unit test can help the core committers
> with an easy way to jump in to help.
>
> I always wonder whether I'll be able reproduce the bug, is it version
> specific? environment specific? is it based on bad assumptions?
>
> With a failing unit test it's really clear what the expectations are and it
> makes it really easy for people can just jump in and fix it.
>
> Thanks,
>
> Max
>
> On Tue, Jun 21, 2016 at 9:15 AM, Ben Tallman <ben@apigee.com> wrote:
>
> > We have seen this too. Running 1.7.0 with Celery, neither DAG timeout nor
> > individual task sla's seem to be honored. In truth, we haven't done a lot
> > of testing, as it is more important that we get our overall ETL migrated
> > with workarounds.
> >
> > However, we will be digging in at some point for greater clarity...
> >
> > On Thu, Jun 16, 2016 at 11:21 AM harish singh <harish.singh22@gmail.com>
> > wrote:
> >
> > > Hi guys,
> > >
> > > Since we have "dag_conurrency" restriction, I tried to play with
> > >  dagrun_timeout.
> > > So that after some interval, dag runs are marked failed and pipeline
> > > progresses.
> > > But this is not happening.
> > >
> > > I have this dag (@hourly):
> > >
> > > A -> B -> C  -> D -> E
> > >
> > > C: depends_on_past=true
> > >
> > > My dagrun_timeout is 60 minutes
> > >
> > > default_args = {
> > >     'owner': 'airflow',
> > >     'depends_on_past': False,
> > >     'start_date': scheduling_start_date,
> > >     'email': ['airflow@airflow.com'],
> > >     'email_on_failure': False,
> > >     'email_on_retry': False,
> > >     'retries': 2,
> > >     'retry_delay': default_retries_delay,
> > >     'dagrun_timeout':datetime.timedelta(minutes=60)
> > > }
> > >
> > >
> > > Parallelism setting in airflow.cfg:
> > >
> > > parallelism = 8
> > > dag_concurrency = 8
> > > max_active_runs_per_dag = 8
> > >
> > >
> > > For hour 1, all the tasks got completed.
> > > Now in hour 2,  say task C failed.
> > >
> > > From hour 3 onwards, Tasks A and B keep running.
> > > Task C never triggers because it depends on past (and past hour failed)
> > >
> > > Since dag conurrency is 8, my pipeline progresses from hour 3 to hour
> 10
> > > (thats next 8 hours) for Tasks A and B. After this, pipeline stalls.
> > >
> > > "dagrun_timeout" was 60 minutes.  This should mean that after 60
> minutes,
> > > from hour 3 onwards, the DAG runs that has been up for more than 60
> > minutes
> > > should be marked FAILED  and the pipeline should progress?
> > >
> > > But this is not happening. So I am guessing my understanding here is
> not
> > > correct.
> > >
> > > What should be behavior when we use  "dagrun_timeout" ?
> > > Also, how can I make sure that the dag proceeds in this situation?
> > >
> > > In the example I gave above, Task A and B should keep running every
> hour
> > > (since it doesnt depend on past).
> > > Why it runs 8(dag_conurrency) instances  and stalls?
> > >
> > >
> > > Thanks,
> > > Harish
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message