airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1937) Dont commit newly created task instances one by one
Date Sat, 23 Dec 2017 08:42:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302278#comment-16302278 ]

ASF subversion and git services commented on AIRFLOW-1937:
----------------------------------------------------------

Commit cc62fd064a02f7d14d2a0e03466b8bd424fd6f30 in incubator-airflow's branch refs/heads/master
from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=cc62fd0 ]

[AIRFLOW-1937] Speed up scheduling by committing in batch

Newly scheduled task instances (state = None, up for retry) were
committed per task instance instead of all at once. This isn't
required, as tasks cannot be picked up by another process in the
meantime. Committing in batch significantly speeds up task scheduling
for DAGs that have a lot of tasks.

Closes #2888 from bolkedebruin/AIRFLOW-1937


> Dont commit newly created task instances one by one
> ---------------------------------------------------
>
>                 Key: AIRFLOW-1937
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1937
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Bolke de Bruin
>
> For large DAGs the number of tasks that need to be committed to the database can be quite
> large. Committing every task instance individually puts quite a load on the db, which is not
> required as we can assume these are newly scheduled task instances.
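
The difference between committing per task instance and committing in batch can be sketched roughly as follows. This is only an illustration, not the code from the commit: it uses Python's standard sqlite3 module rather than Airflow's own database session, and the table layout and the helper names (make_db, commit_one_by_one, commit_in_batch) are hypothetical.

```python
import sqlite3


def make_db():
    # Hypothetical, simplified stand-in for a task instance table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)"
    )
    conn.commit()
    return conn


def commit_one_by_one(conn, task_ids):
    """Insert each task instance and commit after every row."""
    commits = 0
    for task_id in task_ids:
        conn.execute(
            "INSERT INTO task_instance (task_id, state) VALUES (?, ?)",
            (task_id, "scheduled"),
        )
        conn.commit()  # one transaction per task instance
        commits += 1
    return commits


def commit_in_batch(conn, task_ids):
    """Insert all task instances and commit once at the end."""
    conn.executemany(
        "INSERT INTO task_instance (task_id, state) VALUES (?, ?)",
        [(task_id, "scheduled") for task_id in task_ids],
    )
    conn.commit()  # a single transaction for the whole batch
    return 1


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM task_instance").fetchone()[0]


if __name__ == "__main__":
    ids = ["task_%d" % i for i in range(1000)]
    print("per-row commits:", commit_one_by_one(make_db(), ids))
    print("batch commits:", commit_in_batch(make_db(), ids))
```

Both helpers leave the same rows in the table; only the number of transactions differs. On a real database server each commit typically costs a round trip and a flush, which is where the load described in the issue would come from.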



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
