airflow-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] vardancse commented on a change in pull request #3994: [AIRFLOW-3136] Add retry_number to TaskInstance Key property to avoid race condition
Date Tue, 09 Oct 2018 06:57:44 GMT
vardancse commented on a change in pull request #3994: [AIRFLOW-3136] Add retry_number to TaskInstance
Key property to avoid race condition
URL: https://github.com/apache/incubator-airflow/pull/3994#discussion_r223579061
 
 

 ##########
 File path: airflow/models.py
 ##########
 @@ -1230,7 +1230,7 @@ def key(self):
         """
         Returns a tuple that identifies the task instance uniquely
         """
-        return self.dag_id, self.task_id, self.execution_date
+        return self.dag_id, self.task_id, self.execution_date, self.try_number
 
 Review comment:
   This collision arises because the scheduler periodically runs the following steps:
   
   1.  In the _execute_helper method, the heartbeat method is called, which in turn calls
execute_async. That puts the key (dag_id, task_id, execution_date) and the command on the
executor queue; from there, depending on the executor type, the task starts executing asynchronously.
   2.  The sync method runs after execute_async and pushes keys from result_queue to
event_buffer.
   3.  The _process_executor_events method is called after the heartbeat method. It reports the
task as externally killed if TI.state is queued and the status for that key in event_buffer is
success/failed.
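
   The collision in the steps above can be sketched in a few lines. This is only a toy model,
not Airflow code: make_key and looks_externally_killed are hypothetical helpers, the event
buffer is a plain dict, and the states are plain strings. It assumes a buffered event from try 1
is still present when try 2 of the same task instance is queued.

```python
from datetime import datetime


def make_key(dag_id, task_id, execution_date, try_number=None):
    """Hypothetical helper: build a key, adding try_number only when given."""
    base = (dag_id, task_id, execution_date)
    return base if try_number is None else base + (try_number,)


def looks_externally_killed(event_buffer, key, ti_state):
    """Toy version of the check in step 3: a finished event exists for a
    task instance whose state is still 'queued'."""
    return ti_state == "queued" and event_buffer.get(key) in ("success", "failed")


execution_date = datetime(2018, 10, 9)

# Old key: the buffered event from try 1 matches the key of the queued try 2.
event_buffer = {make_key("dag", "task", execution_date): "failed"}
old_collision = looks_externally_killed(
    event_buffer, make_key("dag", "task", execution_date), "queued"
)

# New key: try_number separates the two attempts, so the stale event is ignored.
event_buffer = {make_key("dag", "task", execution_date, 1): "failed"}
new_collision = looks_externally_killed(
    event_buffer, make_key("dag", "task", execution_date, 2), "queued"
)

print(old_collision, new_collision)
```

   With the three-part key the stale event matches the retried task instance; with try_number
in the key it does not.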
   
   Another option would be to take the execute_async call out of the heartbeat method and
call it after the _process_executor_events call in the _execute_helper method in jobs.py.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
