spark-reviews mailing list archives

From andrewor14
Subject [GitHub] spark pull request: [SPARK-3571] Spark standalone cluster mode doe...
Date Wed, 17 Sep 2014 21:25:32 GMT
Github user andrewor14 commented on a diff in the pull request:
    --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
    @@ -491,14 +491,13 @@ private[spark] class Master(
         val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
         val aliveWorkerNum = shuffledAliveWorkers.size
         var curPos = 0
    +    var stopPos = aliveWorkerNum
         for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
           // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
           // start from the last worker that was assigned a driver, and continue onwards until we have
           // explored all alive workers.
    -      curPos = (curPos + 1) % aliveWorkerNum
    -      val startPos = curPos
    --- End diff --
    I still don't understand what you mean. The behavior we want here is this: for each driver,
we start from the position where the last driver left off, and we want to loop through each
worker at most once (i.e. we want to exit the loop once we have looked at `numWorkersAlive` workers).
If the last driver's last visited index is N, then we still start from N + 1, because `curPos`
keeps track of the position across drivers. `numWorkersVisited` is a counter, not an index into
`shuffledWorkers`. Does that make sense?
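
To make the distinction between the index and the counter concrete, here is a minimal, self-contained sketch of the loop described above. It is not the actual `Master.scala` code: `Worker`, `Driver`, `RoundRobinSketch` and `schedule` are hypothetical stand-ins introduced only for illustration, and the resource check is an assumed simplification. Only `curPos`, `numWorkersVisited` and `numWorkersAlive` follow the names used in this comment.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for the real worker and driver records.
final case class Worker(id: String, var memoryFree: Int, var coresFree: Int)
final case class Driver(id: String, mem: Int, cores: Int)

object RoundRobinSketch {
  /** Returns driver id -> worker id for every driver that could be placed. */
  def schedule(aliveWorkers: IndexedSeq[Worker],
               waitingDrivers: Seq[Driver]): Map[String, String] = {
    val numWorkersAlive = aliveWorkers.size
    val assignments = mutable.LinkedHashMap.empty[String, String]
    // Index into aliveWorkers. It is NOT reset per driver, so the next
    // driver starts right after the last worker that was visited.
    var curPos = 0
    for (driver <- waitingDrivers) {
      var launched = false
      // Counter (not an index), reset for every driver. It bounds the scan
      // so that each worker is looked at at most once per driver.
      var numWorkersVisited = 0
      while (numWorkersVisited < numWorkersAlive && !launched) {
        val worker = aliveWorkers(curPos)
        numWorkersVisited += 1
        // Assumed simplification: a worker fits if it has enough free memory and cores.
        if (worker.memoryFree >= driver.mem && worker.coresFree >= driver.cores) {
          worker.memoryFree -= driver.mem
          worker.coresFree -= driver.cores
          assignments(driver.id) = worker.id
          launched = true
        }
        // Always advance, wrapping around at the end of the worker list.
        curPos = (curPos + 1) % numWorkersAlive
      }
    }
    assignments.toMap
  }
}
```

In this sketch a driver that fits nowhere visits all `numWorkersAlive` workers exactly once and leaves `curPos` where it started, so the next driver is unaffected. With no alive workers the `while` condition is false immediately, so the modulo is never evaluated.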
