spark-issues mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20662) Block jobs that have greater than a configured number of tasks
Date Fri, 02 Jun 2017 22:15:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035519#comment-16035519 ]

Xuefu Zhang commented on SPARK-20662:
-------------------------------------

I can understand the counterargument here if Spark is targeted at single-user cases. For
multi-user enterprise deployments, it's good to provide admin knobs. In this case,
an admin just wanted to block bad jobs. I don't think RM meets that goal.

This is actually implemented in Hive on Spark. However, I thought this was generic and might
be desirable for others as well. In addition, blocking a job at submission is better than
killing it after it has started to run.

If Spark doesn't think this is useful, then very well.

> Block jobs that have greater than a configured number of tasks
> --------------------------------------------------------------
>
>                 Key: SPARK-20662
>                 URL: https://issues.apache.org/jira/browse/SPARK-20662
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.6.0, 2.0.0
>            Reporter: Xuefu Zhang
>
> In a shared cluster, it's desirable for an admin to block large Spark jobs. While there
> might not be a single metric defining the size of a job, the number of tasks is usually a
> good indicator. Thus, it would be useful for the Spark scheduler to block a job whose number of
> tasks reaches a configured limit. By default, the limit could be just infinite, to retain
> the existing behavior.
> MapReduce has mapreduce.job.max.map and mapreduce.job.max.reduce to be configured, which
> block an MR job at job submission time.
> The proposed configuration is spark.job.max.tasks with a default value of -1 (infinite).
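
The submission-time check proposed above could look roughly like the sketch below. It is only an illustration under stated assumptions: spark.job.max.tasks is the proposed setting from this issue, not a known Spark configuration, and check_task_limit and JobTooLargeError are hypothetical names, not part of any Spark API. The sketch follows the issue title and rejects a job only when its task count is greater than the limit; any negative value means no limit.

```python
class JobTooLargeError(Exception):
    """Hypothetical error raised when a job is rejected at submission."""


def check_task_limit(num_tasks, conf):
    """Reject a job whose task count is greater than the configured limit.

    `conf` is a plain dict standing in for the job configuration.
    The key below is the one proposed in this issue; a negative
    value (default -1) is treated as "no limit".
    """
    max_tasks = int(conf.get("spark.job.max.tasks", -1))
    if max_tasks >= 0 and num_tasks > max_tasks:
        raise JobTooLargeError(
            "job has %d tasks, limit is %d" % (num_tasks, max_tasks)
        )


# Default (-1): nothing is blocked, which retains the existing behavior.
check_task_limit(1000000, {})

# With a limit of 100, a 101-task job is rejected before it starts.
try:
    check_task_limit(101, {"spark.job.max.tasks": "100"})
except JobTooLargeError as err:
    print("blocked:", err)
```

Running the check before any task is scheduled is what distinguishes this from killing a job after it has started to run.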



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

