spark-issues mailing list archives

From "Apache Spark (JIRA)" <>
Subject [jira] [Commented] (SPARK-24547) Spark on K8s improvements
Date Wed, 13 Jun 2018 11:36:00 GMT


Apache Spark commented on SPARK-24547:

User 'rayburgemeestre' has created a pull request for this issue:

> Spark on K8s improvements
> ----------------------------------------------
>                 Key: SPARK-24547
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Ray Burgemeestre
>            Priority: Minor
>              Labels: docker, kubernetes, spark
> *Context*
> PySpark support for Spark on k8s was merged with [] a few days ago.
> There is a helper script that can be used to create Docker images to run Java and now also Python jobs. It works like this:
> {{/path/to/ -r node001:5000/brightcomputing -t v2.4.0 build}}
>  {{/path/to/ -r node001:5000/brightcomputing -t v2.4.0 push}}
> *Problem*
> I ran into two issues. The first time I generated images for 2.4.0, Docker was using its cache, so when running jobs, old jars were still in the Docker image. This produces errors like this in the executors:
> {code:java}
> 2018-06-13 10:27:52 INFO NettyBlockTransferService:54 - Server created on
> 2018-06-13 10:27:52 INFO BlockManager:54 - Using for block replication policy
> 2018-06-13 10:27:52 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(1,, 44877, None)
> 2018-06-13 10:27:52 ERROR CoarseGrainedExecutorBackend:91 - Executor self-exiting due to : Unable to create executor due to Exception thrown in awaitResult:
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>     at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
>     at
>     at
>     at org.apache.spark.executor.Executor.<init>(Executor.scala:116)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:83)
>     at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
>     at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
>     at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
>     at org.apache.spark.rpc.netty.Dispatcher$
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
>     at java.util.concurrent.ThreadPoolExecutor$
>     at
> Caused by: java.lang.RuntimeException:; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051
>     at
>     at
>     at
> {code}
> To avoid that, Docker has to build without its cache, but this only matters if you have built images for an older version in the past.
> The second problem was that the spark image was pushed, but the spark-py image was not. This was just forgotten in the initial PR.
> (A third problem I also ran into, because I had an older Docker version, was [], so I have not included a fix for that in this ticket.)
> Other than that it works great!
> *Solution*
> I've added an extra flag so it's possible to call build with `-n` for `--no-cache`.
> And I've added the extra push for the spark-py container.
> *Example*
> ./bin/ -r -t v2.3.0 -n build
> Snippet from the help output:
> {code:java}
> Options:
> -f file Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
> -p file Dockerfile with Python baked in. By default builds the Dockerfile shipped with Spark.
> -r repo Repository address.
> -t tag Tag to apply to the built image, or to identify the image to be pushed.
> -m Use minikube's Docker daemon.
> -n Build docker image with --no-cache
> {code}
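
The two changes described in the issue (an `-n` flag that maps to `--no-cache`, and an extra push for the spark-py image) could be wired up roughly as in the sketch below. This is not the code from the pull request: the function names `parse_flags`, `image_ref`, `build_cmd`, and `push_cmds` are hypothetical, as is the Dockerfile path in the usage note, and the functions only print the `docker` command lines instead of running them.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual Spark helper script.
# All function names here are made up for illustration.

NOCACHEARG=""
REPO=""
TAG=""

# Parse -r repo, -t tag and the -n (no cache) switch.
parse_flags() {
  OPTIND=1
  NOCACHEARG=""
  REPO=""
  TAG=""
  while getopts "r:t:n" opt; do
    case "$opt" in
      r) REPO="$OPTARG" ;;
      t) TAG="$OPTARG" ;;
      n) NOCACHEARG="--no-cache" ;;
      *) return 1 ;;
    esac
  done
}

# Print the full image reference for an image name, e.g. repo/spark:tag.
image_ref() {
  printf '%s/%s:%s\n' "$REPO" "$1" "$TAG"
}

# Print (do not run) the build command for image $1 using Dockerfile $2.
# $NOCACHEARG is deliberately unquoted so it disappears when empty.
build_cmd() {
  echo docker build $NOCACHEARG -t "$(image_ref "$1")" -f "$2" .
}

# Print (do not run) one push command per image, including spark-py.
push_cmds() {
  for name in spark spark-py; do
    echo docker push "$(image_ref "$name")"
  done
}
```

For example, after `parse_flags -r node001:5000/brightcomputing -t v2.4.0 -n`, calling `build_cmd spark path/to/Dockerfile` prints a `docker build --no-cache ...` line, and `push_cmds` prints one push line each for spark and spark-py. Without `-n`, the `--no-cache` argument is simply omitted, so Docker's layer cache is used as before.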

This message was sent by Atlassian JIRA
