spark-reviews mailing list archives

From GitBox <>
Subject [GitHub] [spark] shanyu opened a new pull request #27781: [SPARK-31028] Add "-XX:ActiveProcessorCount" to Spark driver and executor in Yarn mode
Date Wed, 04 Mar 2020 01:46:55 GMT
shanyu opened a new pull request #27781: [SPARK-31028] Add "-XX:ActiveProcessorCount" to Spark
driver and executor in Yarn mode
   ### What changes were proposed in this pull request?
   When the Spark driver and executors start on a Yarn cluster, the JVM process can discover
all CPU cores on the system and size its thread pools or GC threads based on that value. We should
limit the number of cores the JVM sees to the number set by the user (spark.driver.cores or spark.executor.cores)
by passing "-XX:ActiveProcessorCount", which was introduced in Java 8u191.
   In particular, when running Spark on Yarn inside a Kubernetes container, the number of CPU cores
discovered is sometimes 1, which means the JVM always uses 1 thread for the default thread pool
or for GC.
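
   As a rough illustration of the idea (not the code in this PR), the flag could be derived from the configured core count and appended to the JVM options. The class `JvmCoreOptions` and its methods are hypothetical names used only for this sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; this is not Spark's actual launcher code.
class JvmCoreOptions {

    // Builds the JVM flag that caps the processor count the JVM reports.
    static String activeProcessorCountOption(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1, got " + cores);
        }
        return "-XX:ActiveProcessorCount=" + cores;
    }

    // Returns a copy of the given JVM options with the core limit appended.
    static List<String> withCoreLimit(List<String> javaOpts, int cores) {
        List<String> result = new ArrayList<>(javaOpts);
        result.add(activeProcessorCountOption(cores));
        return result;
    }

    public static void main(String[] args) {
        // Assumed example: an executor configured with 4 cores (e.g. spark.executor.cores=4).
        System.out.println(withCoreLimit(Arrays.asList("-Xmx2g"), 4));
    }
}
```

   The sketch only shows how the option string is formed; where the driver and executor start commands are actually assembled is up to the Yarn client code touched by the PR.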
   ### Why are the changes needed?
   Without the change, when running Spark on Yarn, the number of available processors discovered
by the JVM is not correct. The user has assigned the driver and executors a number of cores to use, and
we should honor that. A simple test would be to run this Java code:
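
   The snippet itself is not preserved in this message. A minimal sketch of such a check, assuming it only needs to print the processor count the JVM reports (the class name `ProcessorCountCheck` is made up for this example), might be:

```java
import java.util.concurrent.ForkJoinPool;

// Illustrative check, not the author's original snippet.
class ProcessorCountCheck {

    // The processor count the JVM believes it has; thread pools and GC sizing derive from it.
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + visibleProcessors());
        // The common fork-join pool is sized from the same value by default.
        System.out.println("commonPoolParallelism = " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```

   Running it once without the flag and once with `java -XX:ActiveProcessorCount=2 ProcessorCountCheck` on a JVM that supports the option should show the reported count change to the configured value.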
   ### Does this PR introduce any user-facing change?
   ### How was this patch tested?
   It is a simple change to the JVM start command, verified manually.

