airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Experiences with 1.8.0 (updated)
Date Fri, 20 Jan 2017 09:07:05 GMT
— continued accidentally pressed send —

This is to report back on some of the (early) experiences we have with Airflow 1.8.0 (beta
1 at the moment):

1. The UI does not show faulty DAGs, leading to confusion for developers.
When a faulty DAG was placed in the dags folder, the UI used to report a parsing error. Now it
doesn’t, because parsing now happens separately and errors are not reported back.

2. The hive hook sets ‘airflow.ctx.dag_id’ in hive
We run in a secure environment which requires this variable to be whitelisted if it is modified
(needs to be added to UPDATING.md)

3. DagRuns do not exist for certain tasks, but this doesn’t get fixed
The log gets flooded without any suggestion of what to do

4. At startup all running dag_runs are checked; we seemed to have a lot of “left
over” dag_runs (a couple of thousand)
- Checking was logged at INFO -> requires an fsync for every log message, making it very
slow
- Checking would happen at every restart, but the dag_runs’ states were not being updated
- These dag_runs would never be marked anything other than running for some reason
-> Applied a workaround: updated all dag_runs before a certain date to finished in SQL
-> Need to investigate why the dag_runs did not get marked “finished/failed”
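
A minimal sketch of the kind of workaround described in #4, not the exact statement we ran. It uses an in-memory SQLite table as a stand-in for the metadata database; the reduced dag_run columns, the sample rows, the cutoff date, and the target state 'failed' are assumptions for illustration only.

```python
import sqlite3


def close_stale_dag_runs(conn, cutoff, new_state):
    """Set new_state on every dag_run that is still 'running' and older than cutoff.

    Returns the number of rows that were updated.
    """
    cur = conn.execute(
        "UPDATE dag_run SET state = ? "
        "WHERE state = 'running' AND execution_date < ?",
        (new_state, cutoff),
    )
    conn.commit()
    return cur.rowcount


# Stand-in for the metadata database (assumption: only a few columns are modelled).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag_run ("
    "id INTEGER PRIMARY KEY, dag_id TEXT, execution_date TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO dag_run (dag_id, execution_date, state) VALUES (?, ?, ?)",
    [
        ("example", "2016-11-01 00:00:00", "running"),  # stale, before cutoff
        ("example", "2016-12-01 00:00:00", "running"),  # stale, before cutoff
        ("example", "2017-01-19 00:00:00", "running"),  # recent, left alone
        ("example", "2016-11-02 00:00:00", "success"),  # already closed
    ],
)

# ISO-formatted timestamps compare correctly as strings.
updated = close_stale_dag_runs(conn, "2017-01-01 00:00:00", "failed")
still_running = conn.execute(
    "SELECT COUNT(*) FROM dag_run WHERE state = 'running'"
).fetchone()[0]
```

Restricting the update to rows that are still 'running' keeps dag_runs that already reached a final state untouched.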

5. Our umask is set to 027, but the scheduler logging directories were created with mode 777
- Cannot reproduce this locally, so we need to investigate.
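
To help narrow down #5, a small sketch of what we would expect under umask 027: directories created through os.makedirs should come out as 750. The directory names are hypothetical. The explicit chmod at the end only illustrates that chmod is not subject to the umask; whether something like that is the actual cause is not known.

```python
import os
import stat
import tempfile


def dir_mode(path):
    """Return the user/group/other permission bits of path."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o777


old_umask = os.umask(0o027)  # same umask as in the report
try:
    base = tempfile.mkdtemp()
    log_dir = os.path.join(base, "scheduler", "2017-01-20")  # hypothetical layout
    os.makedirs(log_dir)  # default mode 0o777, filtered by the umask
    mode_from_makedirs = dir_mode(log_dir)

    os.chmod(log_dir, 0o777)  # chmod sets the bits directly, ignoring the umask
    mode_after_chmod = dir_mode(log_dir)
finally:
    os.umask(old_umask)  # restore the previous umask
```

If mode_from_makedirs is 750 on the affected machine as well, the 777 directories are probably not coming from a plain makedirs call.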

6. Scanning the DAG dir only every 5 minutes by default seems very slow in more “dev/prod”
mixed environments
-> Default should be set lower (30s) with best practice for prod environments set to maybe
300s


That’s it for now. Nothing is really a show stopper I guess, but #4 is something we need to
take care of. The rest can be fixed with small updates or good documentation.

Will release Beta 2 today, which will contain the major feature of cgroups+impersonation, but
will not yet contain fixes for the above.

Bolke


