airflow-commits mailing list archives

From "Luke Maycock (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AIRFLOW-642) Add dag_run to the task_instance table or create new taskuuid column and use this to uniquely identify a task
Date Thu, 24 Nov 2016 15:30:58 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693512#comment-15693512 ]

Luke Maycock edited comment on AIRFLOW-642 at 11/24/16 3:30 PM:
----------------------------------------------------------------

+1 

We also find this to be an issue. This issue is similar to https://issues.apache.org/jira/browse/AIRFLOW-70
but the use of a numeric unique identifier is preferable to a string such as dag_run_id. 

Also, it makes sense to address the relationship between xcoms and dag runs using similar
logic. We believe this would have to happen as part of this work; otherwise, xcoms will not
continue to work.


was (Author: lukem):
+1 

We also find this to be an issue. This issue is similar to https://issues.apache.org/jira/browse/AIRFLOW-70
but the use of a numeric unique identifier is preferable to a string such as dag_run_id. 

Also, it makes sense to address the relationship between xcoms and dag runs using similar
logic. 

>  Add dag_run to the task_instance table or create new taskuuid column and use this to
>  uniquely identify a task
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-642
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-642
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: DagRun, scheduler
>            Reporter: Arunprasad
>
> We are planning to run around 40,000 tasks a day using Airflow, and some of them are critical
> to giving quick feedback to developers.
> Currently, using the execution date to uniquely identify tasks does not work for us. Because
> we mainly trigger DAGs (instead of running them on a schedule), we collide at the 1-second
> granularity on several occasions. Having a task uuid, or associating dag_run with the
> task_instance table and using this for scheduling and updating status, would help us here.
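
To illustrate the collision described above, here is a minimal sketch in plain Python. It is not Airflow code: the key shape (dag_id, task_id, execution date truncated to whole seconds), the DAG and task names, and the numeric run id counter are all assumptions made for illustration. It shows two manual triggers landing in the same second and overwriting each other under a date-based key, while a numeric run identifier keeps them distinct.

```python
import itertools
from datetime import datetime


def second_key(dag_id, task_id, triggered_at):
    """Hypothetical key: identify a task by execution date at 1-second granularity."""
    return (dag_id, task_id, triggered_at.replace(microsecond=0))


# Two manual triggers of the same (hypothetical) DAG within the same second.
t1 = datetime(2016, 11, 24, 15, 30, 58, 100000)
t2 = datetime(2016, 11, 24, 15, 30, 58, 900000)

# Keyed by execution date: the second trigger silently replaces the first.
by_date = {}
by_date[second_key("build", "compile", t1)] = "first trigger"
by_date[second_key("build", "compile", t2)] = "second trigger"

# Keyed by an assumed numeric run id: each trigger gets its own entry.
run_ids = itertools.count(1)
by_run = {}
for label in ("first trigger", "second trigger"):
    by_run[(next(run_ids), "compile")] = label

print(len(by_date), len(by_run))
```

Under these assumptions the date-keyed mapping ends up with one entry and the run-keyed mapping with two, which is the behavior the reporter is asking for.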



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
