airflow-dev mailing list archives

From Bolke de Bruin <bdbr...@gmail.com>
Subject Re: Experiences with 1.8.0
Date Fri, 20 Jan 2017 22:55:02 GMT
Will do. And thanks.

Adding another issue: 

* Some of our DAGs are not getting scheduled for some unknown reason.
Need to investigate why.

Related but not root cause:
* Logging is so chatty that it gets really hard to find the real issue

Bolke.

> On 20 Jan 2017, at 23:45, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
> 
> I'd be happy to lend a hand fixing these issues and hopefully some others
> are too. Do you mind creating jiras for these since you have the full
> context? I have created a JIRA for (1) and have assigned it to myself:
> https://issues.apache.org/jira/browse/AIRFLOW-780
> 
> On Fri, Jan 20, 2017 at 1:01 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> 
>> This is to report back on some of the (early) experiences we have with
>> Airflow 1.8.0 (beta 1 at the moment):
>> 
>> 1. The UI does not show faulty DAGs, leading to confusion for developers.
>> When a faulty DAG was placed in the dags folder, the UI used to report a
>> parsing error. Now it doesn’t, because DAGs are parsed separately and
>> errors are not reported back.
>> 
>> 2. The hive hook sets ‘airflow.ctx.dag_id’ in hive
>> We run in a secure environment which requires this variable to be
>> whitelisted if it is modified (needs to be added to UPDATING.md)
>> 
>> 3. DagRuns do not exist for certain tasks, but don’t get fixed
>> The log gets flooded without any suggestion of what to do
>> 
>> 4. At startup all running dag_runs are checked, and we seem to have a
>> lot of “left over” dag_runs (a couple of thousand)
>> - Checking was logged to INFO -> requires an fsync for every log message,
>> making it very slow
>> - Checking would happen at every restart, but the dag_runs’ states were not
>> being updated
>> - These dag_runs would never be marked anything other than running for
>> some reason
>> -> Applied a workaround: updated in SQL all dag_runs before a certain date
>> to “finished”
>> -> Need to investigate why dag_runs did not get marked “finished/failed”
>> 
>> 5. Our umask is set to 027
>> 
>> 
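
The SQL workaround described in item 4 could look roughly like the sketch below. This is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than a real Airflow metadata database, the helper names (make_demo_db, close_stale_dag_runs) are hypothetical, and the table is reduced to the dag_id, execution_date and state columns. The message only says the runs were set to "finished", so the target state is left as a parameter.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table standing in for dag_run (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run ("
        "id INTEGER PRIMARY KEY, dag_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag_run (dag_id, execution_date, state) VALUES (?, ?, ?)",
        [
            ("a", "2016-11-01T00:00:00", "running"),
            ("a", "2016-12-15T00:00:00", "running"),
            ("b", "2016-12-20T00:00:00", "failed"),
            ("b", "2017-01-19T00:00:00", "running"),
        ],
    )
    conn.commit()
    return conn


def close_stale_dag_runs(conn, cutoff, new_state="success"):
    """Mark dag_runs still 'running' before `cutoff` with `new_state`.

    `cutoff` is an ISO-8601 string; such strings sort chronologically,
    so a plain text comparison is enough in this sketch.
    Returns the number of rows changed.
    """
    cur = conn.execute(
        "UPDATE dag_run SET state = ? "
        "WHERE state = 'running' AND execution_date < ?",
        (new_state, cutoff),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = make_demo_db()
    changed = close_stale_dag_runs(db, "2017-01-01T00:00:00")
    print("updated", changed, "stale dag_runs")
```

Restricting the update to rows that are still "running" leaves runs that already failed untouched. Against a real metadata database it would be prudent to take a backup first and to run the update while the scheduler is stopped.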

