airflow-commits mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (AIRFLOW-2895) Prevent scheduler from spamming heartbeats/logs
Date Sun, 02 Sep 2018 18:08:03 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned AIRFLOW-2895:
-------------------------------------

    Assignee: Dan Davydov  (was: Holden Karau's magical unicorn)

> Prevent scheduler from spamming heartbeats/logs
> -----------------------------------------------
>
>                 Key: AIRFLOW-2895
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2895
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Dan Davydov
>            Assignee: Dan Davydov
>            Priority: Major
>
> There seem to be a couple of problems with [https://github.com/apache/incubator-airflow/pull/2986]
> that cause the sleep to not trigger and the scheduler heartbeats/logs to be spammed:
>  # If all of the files are being processed in the queue, there is no sleep (this can be fixed
> by sleeping for min_sleep even if there are no files).
>  # I have heard reports that some files can return a parsing time that is monotonically
> increasing for some reason (e.g. the file actually parses in 1s each loop, but the reported duration
> seems to use the very first time the file was parsed as the start time instead of the last time).
> I haven't confirmed this, but it sounds problematic.
> To unblock the release I'm reverting this PR for now. It should be re-added with tests/mocking.
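
The two problems described above can be illustrated with a minimal sketch. This is not the Airflow scheduler code or the contents of the reverted PR. The names throttle_loop and ParseTimer are hypothetical, and only min_sleep comes from the issue text. The clock and sleep functions are injectable so the behaviour can be tested with mocks, as the issue suggests.

```python
import time


def throttle_loop(loop_start, min_sleep, now=time.monotonic, sleep=time.sleep):
    """Sleep so that one loop iteration lasts at least min_sleep seconds.

    Hypothetical helper: it is called on every iteration, including
    iterations where no files were ready to be processed.
    Returns the number of seconds slept.
    """
    elapsed = now() - loop_start
    remaining = max(0.0, min_sleep - elapsed)
    if remaining > 0:
        sleep(remaining)
    return remaining


class ParseTimer:
    """Hypothetical tracker for per-file parse durations."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._started = {}

    def start(self, path):
        # Overwrite any earlier start time, so the duration is measured
        # from the most recent parse rather than from the first one.
        self._started[path] = self._now()

    def finish(self, path):
        # Remove the entry so a stale start time cannot be reused.
        return self._now() - self._started.pop(path)
```

Because the sleep is computed from the loop start rather than from the file queue, an empty or fully busy queue still yields a pause of up to min_sleep. Resetting the start time on every parse keeps the reported duration from growing across loops.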



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
