spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [spark] Ngone51 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle
Date Thu, 22 Jul 2021 15:14:05 GMT

Ngone51 commented on a change in pull request #33034:
URL: https://github.com/apache/spark/pull/33034#discussion_r674904070



##########
File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
##########
@@ -1323,9 +1323,9 @@ private[spark] class DAGScheduler(
     // `findMissingPartitions()` returns all partitions every time.
     stage match {
       case sms: ShuffleMapStage if stage.isIndeterminate && !sms.isAvailable =>
-        // TODO: SPARK-32923: Clean all push-based shuffle metadata like merge enabled and
-        // TODO: finalized as we are clearing all the merge results.
         mapOutputTracker.unregisterAllMapAndMergeOutput(sms.shuffleDep.shuffleId)
+        sms.shuffleDep.resetShuffleMergeState()
+        sms.shuffleDep.newShuffleSequenceId()

Review comment:
       I think we can just do them together, e.g., `newShuffleMergeState()` should be fine.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message