airflow-dev mailing list archives

From Jeremiah Lowin <jlo...@apache.org>
Subject Re: Merging #1514 AIRFLOW-128
Date Wed, 01 Jun 2016 10:13:12 GMT
Just to be clear, this is a highly unlikely event. I used to have a unit
test for it but got rid of it when we closed bugs, which made it impossible
to cause such a crash deterministically. So this situation is possible but
almost certainly won't manifest.
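
For anyone following along, here is a toy model of the corner case quoted
below. It is only a sketch, not Airflow code: the State, TaskInstance,
scheduler_pass, flaky_executor and find_orphans names are all made up for
illustration, and it assumes the scheduler only picks up TIs that have no
state yet.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NONE = "none"
    SCHEDULED = "scheduled"
    QUEUED = "queued"


@dataclass
class TaskInstance:
    # Hypothetical stand-in for a TI; not the real Airflow model.
    task_id: str
    state: State = State.NONE


def scheduler_pass(tis, executor_send):
    """Mark every stateless TI as SCHEDULED, then hand it to the executor."""
    for ti in tis:
        if ti.state is State.NONE:
            ti.state = State.SCHEDULED
            executor_send(ti)


def flaky_executor(ti):
    """Assumed failure mode: the executor cannot load one TI and drops it."""
    if ti.task_id == "bad":
        return  # TI is never queued and nobody resets its state
    ti.state = State.QUEUED


def find_orphans(tis):
    """TIs stuck in SCHEDULED: sent to the executor but never picked up."""
    return [ti.task_id for ti in tis if ti.state is State.SCHEDULED]


tis = [TaskInstance("good"), TaskInstance("bad")]
scheduler_pass(tis, flaky_executor)
scheduler_pass(tis, flaky_executor)  # second pass skips the stuck TI
print(find_orphans(tis))
```

In this model the second scheduler pass never revisits the stuck TI because
its state is no longer empty, which is the "orphaned forever" outcome. A fix
would need something that notices TIs sitting in SCHEDULED and resets them.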

On Wed, Jun 1, 2016 at 4:00 AM Bolke de Bruin <bdbruin@gmail.com> wrote:

> Hey,
>
> This is to give a heads up that I am planning to merge #1514, the refactor
> of process_dag, today. This is the second step in executing on the
> scheduler roadmap. It has been running in our production for a week now
> with no functional differences. Scheduler loop times start a bit higher,
> but have a lower max. The number of connections to the database is around
> 1/3 of that of the previous scheduler (a test DAG went from 150 connections
> to 50). Database load is slightly lower.
>
> While it fixes many issues (race conditions), a corner case mentioned by
> Jeremiah is now present: a TI is sent in SCHEDULED state to the executor.
> If the executor then fails to load the TI, the TI might be orphaned forever.
> As fixing the corner case will require further fundamental changes, we
> agreed it should be addressed in a follow-up patch.
>
> My planned next steps are 1) reduce scheduler loop time to around 1s by
> making task reporting “event driven”, 2) auto-align start date, 3) add a
> notion of “previous” to dagrun, and 4) fix the corner case mentioned above.
>
> - Bolke
