spark-dev mailing list archives

From Matt Cheah <mch...@palantir.com>
Subject Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods
Date Fri, 30 Mar 2018 18:10:39 GMT
The question is more about what the generally advised best practice is for setting CPU limits.
It's not immediately clear what the correct value is if one wants consistent, guaranteed
execution performance without the limit itself degrading performance. Additionally, there's a
question of whether a sane default CPU limit exists for the Spark pod creation code. Such a
default seems difficult to set because the JVM can spawn as many threads as it likes, and a
single executor can end up thrashing between its own threads as they contend for the smaller
CPU share that is available.
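
As a purely illustrative aside (this is not Spark code), the snippet below shows how loosely
the two quantities are related: it prints the number of processors the JVM thinks it may use
against the number of live threads. Inside a cpu-limited executor container the thread count
is typically far higher than the processor count.

    import java.lang.management.ManagementFactory

    object ThreadsVsCpus {
      def main(args: Array[String]): Unit = {
        // Processors the JVM believes it may use (container-aware JVMs derive this
        // from the cgroup cpu limit; otherwise it is the host's core count).
        val cpus = Runtime.getRuntime.availableProcessors()
        // Live JVM threads: task threads, shuffle/IO threads, GC, JIT compiler, etc.
        val threads = ManagementFactory.getThreadMXBean.getThreadCount
        println(s"available processors = $cpus, live threads = $threads")
      }
    }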

 

From: Yinan Li <liyinan926@gmail.com>
Date: Thursday, March 29, 2018 at 11:08 PM
To: David Vogelbacher <dvogelbacher@palantir.com>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>
Subject: Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

 

Hi David, 

 

Regarding the cpu limit: in Spark 2.3 we do have the following config properties to specify cpu
limits for the driver and executors. See http://spark.apache.org/docs/latest/running-on-kubernetes.html.

 

spark.kubernetes.driver.limit.cores

spark.kubernetes.executor.limit.cores
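
For illustration only (values are arbitrary, spark-shell style), a minimal sketch of how these
interact with the core-count settings that drive the cpu requests:

    import org.apache.spark.SparkConf

    // Sketch: the core-count settings become the pods' cpu requests,
    // while the limit.cores properties cap what the pods may actually use.
    val conf = new SparkConf()
      .set("spark.driver.cores", "1")                    // driver cpu request
      .set("spark.executor.cores", "2")                  // executor cpu request (also task slots)
      .set("spark.kubernetes.driver.limit.cores", "2")   // driver cpu limit
      .set("spark.kubernetes.executor.limit.cores", "4") // executor cpu limit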

 

On Thu, Mar 29, 2018 at 5:14 PM, David Vogelbacher <dvogelbacher@palantir.com> wrote:

Hi,

 

At the moment driver and executor pods are created using the following requests and limits:

    Request:  CPU    = [driver,executor].cores
              Memory = [driver,executor].memory
    Limit:    CPU    = Unlimited (but can be specified using spark.[driver,executor].cores)
              Memory = [driver,executor].memory + [driver,executor].memoryOverhead

 

Specifying the requests like this leads to problems if the pods only get the requested amount
of resources and none of the optional resources up to the limit, as can happen in a fully
utilized cluster.

 

For memory:

Let's say we have a node with 100 GiB of memory and 5 pods, each with 20 GiB of memory and 5 GiB of memoryOverhead.

At the beginning all 5 pods use 20 GiB of memory and all is well; the node is fully booked by
requests (5 × 20 GiB = 100 GiB). If a pod then starts using its overhead memory it will get
killed, because there is no more memory available on the node, even though we told Spark that
it can use 25 GiB of memory.

 

Instead of requesting `[driver,executor].memory`, we should just request `[driver,executor].memory
+ [driver,executor].memoryOverhead`.

I think this case is a bit clearer than the CPU case, so I went ahead and filed an issue [issues.apache.org]
with more details and made a PR [github.com].
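
To make the arithmetic concrete, a rough sketch of the proposed request computation (not the
actual patch; the numbers are taken from the example above):

    // Sketch: request the full footprint (heap + overhead) up front.
    val executorMemoryMiB   = 20 * 1024   // executor memory from the example (20g)
    val memoryOverheadMiB   = 5 * 1024    // memoryOverhead from the example (5g)
    val podMemoryRequestMiB = executorMemoryMiB + memoryOverheadMiB   // 25 GiB requested, not 20
    // With the request covering the overhead, a pod that dips into its overhead
    // is no longer using more memory than Kubernetes scheduled for it.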

 

For CPU:

As it turns out, there can be performance problems if we only have `executor.cores` available
(which means we have one core per task). This was raised here [github.com] and is the reason
that the cpu limit was set to unlimited.

This issue stems from the fact that in general there will be more than one thread per task,
resulting in degraded performance when only one core is available per task.

However, I am not sure that just setting the limit to unlimited is the best solution, because
it means that even when the Kubernetes cluster perfectly satisfies the resource requests,
performance might still be very bad.

 

I think we should ensure that an executor is able to do its work well (without performance
issues or getting killed, as could happen in the memory case) using only the resources it is
guaranteed by Kubernetes.

 

One way to solve this could be to request more than 1 core from Kubernetes per task. The exact
amount we should request is unclear to me (it largely depends on how many threads actually
get spawned for a task). 

We would need to find a way to determine this automatically, or at least come up with a better
default value than 1 core per task.
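
A rough sketch of what such a heuristic could look like; the threads-per-task factor is purely
hypothetical and is not an existing Spark setting:

    // Sketch: request more than one core per task slot, scaled by an
    // assumed threads-per-task factor that would need to be measured/tuned.
    val executorCores        = 4     // spark.executor.cores, i.e. concurrent task slots
    val threadsPerTaskFactor = 1.5   // hypothetical knob, not a real Spark property
    val cpuRequest = math.ceil(executorCores * threadsPerTaskFactor).toInt
    // e.g. 4 task slots * 1.5 => request 6 cores from Kubernetes instead of 4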

 

Does somebody have ideas or thoughts on how to solve this best?

 

Best,

David

 

