spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] itskals commented on a change in pull request #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever
Date Mon, 23 Dec 2019 06:45:47 GMT
itskals commented on a change in pull request #26975: [SPARK-26975][CORE] Stage retry and executor crash cause app hung up forever
URL: https://github.com/apache/spark/pull/26975#discussion_r360782455
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
 ##########
 @@ -100,6 +100,11 @@ private[spark] class TaskSetManager(
   // should not resubmit while executor lost.
   private val killedByOtherAttempt = new HashSet[Long]
 
+  // Add the tid of task into this HashSet when the task is killed by other stage retries.
+  // For example, if stage failed and retry, when the task in the origin stage finish, it will
+  // kill the new stage task running the same partition data
+  private val killedByOtherStageRetries = new HashSet[Long]
 
 Review comment:
   Also, could the code that you marked as problematic in `executorLost` not have been moved
to `handleFailedTask`? I feel the code would have been clearer there, and the rest of the
changes might then not have been needed.
