airflow-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [airflow] jaketf edited a comment on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator
Date Sun, 29 Dec 2019 07:23:48 GMT
jaketf edited a comment on issue #6210: [AIRFLOW-5567] BaseReschedulePokeOperator
URL: https://github.com/apache/airflow/pull/6210#issuecomment-569481697
 
 
   I think the conversation here has been educational, and thanks to all for chiming in.
   Apologies for being MIA on this for so long.
   During the time away I've reflected on one question: what exactly is our problem statement?
Originally I set out to "provide a primitive for constructing operators that can retry or fail
the start of a long-running external job without blocking a worker for the entire duration of
that job." For this, we might consider a very different approach:
   Instead of thinking about this as a rescheduling-pokes problem, think of it as a special
kind of SubDag pattern we want to support better: achieve the desired behavior with a start
task and a rescheduling poke sensor (acting as the completion task) inside a SubDag. (This
might be DOA / short-sighted, as I think the SubDagOperator task itself blocks a worker while
it monitors the SubDag for completion.) But could we perhaps refocus this effort on improving
how SubDags monitor for completion?
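
   As a rough illustration of the start-task-plus-poke-sensor split described above, here is a
minimal sketch in plain Python. It deliberately does not use Airflow's API: `FakeExternalService`,
`start_task`, `poke_task`, and `run_with_reschedule` are all hypothetical names, and the loop only
stands in for a scheduler that would free the worker slot between pokes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeExternalService:
    """Hypothetical stand-in for a long-running external system."""

    polls_until_done: int = 3
    _polls: Dict[str, int] = field(default_factory=dict)

    def submit(self) -> str:
        # Start a "job" and hand back an identifier for later polling.
        job_id = f"job-{len(self._polls) + 1}"
        self._polls[job_id] = 0
        return job_id

    def is_done(self, job_id: str) -> bool:
        # Each call counts as one poke; the job finishes after a fixed number.
        self._polls[job_id] += 1
        return self._polls[job_id] >= self.polls_until_done


def start_task(service: FakeExternalService, state: Dict[str, str]) -> None:
    # Runs once: submit the job and record its id for the sensor.
    state["job_id"] = service.submit()


def poke_task(service: FakeExternalService, state: Dict[str, str]) -> bool:
    # Runs many times: a short check that returns immediately either way.
    return service.is_done(state["job_id"])


def run_with_reschedule(service: FakeExternalService, max_pokes: int = 10) -> int:
    """Return the number of pokes needed before the job completed."""
    state: Dict[str, str] = {}
    start_task(service, state)
    for attempt in range(1, max_pokes + 1):
        if poke_task(service, state):
            return attempt
        # A real scheduler would release the worker here and
        # re-queue the poke for later instead of looping in-process.
    raise TimeoutError(f"job {state['job_id']} not done after {max_pokes} pokes")
```

   The point of the split is that only `start_task` has side effects on the external system,
while `poke_task` is cheap and safe to repeat.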
   
   However, there was some discussion that adding support for stateful tasks in Airflow would
have a broader impact than just this rescheduling case.
   
   From my read through all these threads, the key open questions for a rescheduling
approach similar to this PR seem to be:
   - Per @JonnyIncognito, we need a strong consensus on the scope of idempotency. Does
each (rescheduled) task instance have to be idempotent, or does a task only need to be
idempotent up to the point where it succeeds or fails?
   - Where should the rescheduling logic live in the class structure? (It seems like it should
be moved to BaseOperator.)
   - Is a TaskState model/table preferable to the changes it would take for XCom? And what
other use cases should be considered in its schema?
   - Should we explore persisting information in the context object?
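
   To make the TaskState and idempotency questions above more concrete, here is a minimal
sketch of what a per-task-instance state table could look like, using SQLite from the Python
standard library. This is an assumption for discussion only: the `task_state` schema, the
`TaskStateStore` class, and `submit_once` are hypothetical and are not part of Airflow or of
this PR.

```python
import sqlite3
from typing import Callable, Optional


class TaskStateStore:
    """Hypothetical key/value state scoped to one task instance."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS task_state ("
            " dag_id TEXT, task_id TEXT, execution_date TEXT,"
            " key TEXT, value TEXT,"
            " PRIMARY KEY (dag_id, task_id, execution_date, key))"
        )

    def set(self, dag_id: str, task_id: str, execution_date: str,
            key: str, value: str) -> None:
        # Overwrite any earlier value stored under the same key.
        self.conn.execute(
            "INSERT OR REPLACE INTO task_state VALUES (?, ?, ?, ?, ?)",
            (dag_id, task_id, execution_date, key, value),
        )
        self.conn.commit()

    def get(self, dag_id: str, task_id: str, execution_date: str,
            key: str) -> Optional[str]:
        row = self.conn.execute(
            "SELECT value FROM task_state"
            " WHERE dag_id = ? AND task_id = ? AND execution_date = ? AND key = ?",
            (dag_id, task_id, execution_date, key),
        ).fetchone()
        return row[0] if row else None


def submit_once(store: TaskStateStore, dag_id: str, task_id: str,
                execution_date: str, submit: Callable[[], str]) -> str:
    """Submit the external job only if no job id was persisted earlier."""
    job_id = store.get(dag_id, task_id, execution_date, "job_id")
    if job_id is None:
        job_id = submit()
        store.set(dag_id, task_id, execution_date, "job_id", job_id)
    return job_id
```

   With something like this, a rescheduled run that calls `submit_once` again would pick up the
stored job id instead of starting a second external job, which is one possible reading of
"idempotent per rescheduled task instance."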
   
   I'm not that familiar with the AIP process: is it something a small group of people from
the community gets aligned on in a meeting/Slack before filing, or something an individual
just proposes?
   @Fokko @JonnyIncognito @dstandish @mik-laj you all seem to have a vested interest in this;
would you be open to scheduling a meeting or some time to discuss synchronously over Slack?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
