spark-reviews mailing list archives

From tgravescs <...@git.apache.org>
Subject [GitHub] spark pull request #18950: [SPARK-20589][Core][Scheduler] Allow limiting tas...
Date Tue, 22 Aug 2017 20:50:22 GMT
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18950#discussion_r134597897
  
    --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
    @@ -598,13 +600,58 @@ private[spark] class ExecutorAllocationManager(
         private val executorIdToTaskIds = new mutable.HashMap[String, mutable.HashSet[Long]]
         // Number of tasks currently running on the cluster.  Should be 0 when no stages are active.
         private var numRunningTasks: Int = _
    +    private val jobGroupToMaxConTasks = new mutable.HashMap[String, Int]
    +    private val jobIdToJobGroup = new mutable.HashMap[Int, String]
    +    private val stageIdToJobId = new mutable.HashMap[Int, Int]
    +    private val stageIdToCompleteTaskCount = new mutable.HashMap[Int, Int]
     
         // stageId to tuple (the number of task with locality preferences, a map where each pair is a
         // node and the number of tasks that would like to be scheduled on that node) map,
         // maintain the executor placement hints for each stage Id used by resource framework to better
         // place the executors.
         private val stageIdToExecutorPlacementHints = new mutable.HashMap[Int, (Int, Map[String, Int])]
     
    +    override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    +      jobStart.stageInfos.foreach(stageInfo => stageIdToJobId(stageInfo.stageId) = jobStart.jobId)
    +
    +      var jobGroupId = if (jobStart.properties != null) {
    +        jobStart.properties.getProperty(SparkContext.SPARK_JOB_GROUP_ID)
    +      } else {
    +        null
    +      }
    +
    +      val maxConTasks = if (jobGroupId != null &&
    +        conf.contains(s"spark.job.$jobGroupId.maxConcurrentTasks")) {
    +        conf.get(s"spark.job.$jobGroupId.maxConcurrentTasks").toInt
    +      } else {
    +        Int.MaxValue
    +      }
    +
    +      if (maxConTasks <= 0) {
    +        throw new IllegalArgumentException(
    +          "Maximum Concurrent Tasks should be set greater than 0 for the job to progress.")
    +      }
    +
    +      if (jobGroupId == null || !conf.contains(s"spark.job.$jobGroupId.maxConcurrentTasks")) {
    +        jobGroupId = "default-group-" + jobStart.jobId.hashCode
    --- End diff --
    
    It seems unlikely that a user would specify the same name as a job group, but I wonder
    whether adding `__` in front of it would make it a bit more unique.
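
    A minimal sketch of what that could look like, not the code in this PR. The names
    `JobGroupNaming`, `DefaultGroupPrefix`, and `resolveJobGroup` are hypothetical, and a
    plain `Map[String, String]` stands in for `SparkConf` so the snippet is self-contained.
    Only the `spark.job.<group>.maxConcurrentTasks` key format and the `default-group-`
    stem come from the diff above.

```scala
object JobGroupNaming {
  // Hypothetical reserved prefix: the leading "__" makes a clash with a
  // user-chosen job group name less likely (it does not make one impossible).
  val DefaultGroupPrefix = "__default-group-"

  // Key format as used in the diff under review.
  def maxConcurrentTasksKey(jobGroupId: String): String =
    s"spark.job.$jobGroupId.maxConcurrentTasks"

  // Returns the user's job group only when a limit is configured for it;
  // otherwise falls back to a per-job default group name.
  // For an Int, hashCode is the value itself, so jobId is appended directly.
  def resolveJobGroup(
      jobId: Int,
      jobGroupId: Option[String],
      conf: Map[String, String]): String =
    jobGroupId
      .filter(group => conf.contains(maxConcurrentTasksKey(group)))
      .getOrElse(DefaultGroupPrefix + jobId)
}
```

    With this sketch, a job with id 7 and no configured limit would be tracked under
    `__default-group-7`, while a job in group `etl` with
    `spark.job.etl.maxConcurrentTasks` set would keep `etl`.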

