spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-24547) Spark on K8s docker-image-tool.sh improvements
Date Wed, 13 Jun 2018 11:36:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-24547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510971#comment-16510971 ]

Apache Spark commented on SPARK-24547:
--------------------------------------

User 'rayburgemeestre' has created a pull request for this issue:
https://github.com/apache/spark/pull/21555

> Spark on K8s docker-image-tool.sh improvements
> ----------------------------------------------
>
>                 Key: SPARK-24547
>                 URL: https://issues.apache.org/jira/browse/SPARK-24547
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Ray Burgemeestre
>            Priority: Minor
>              Labels: docker, kubernetes, spark
>
> *Context*
> PySpark support for Spark on k8s was merged with [https://github.com/apache/spark/pull/21092/files] a few days ago.
> There is a helper script that can be used to create Docker images to run Java and now also Python jobs. It works like this:
> {{/path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 build}}
> {{/path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 push}}
> *Problem*
> I ran into two issues. The first time I generated images for 2.4.0, Docker was using its cache, so when running jobs, old jars were still in the Docker image. This produces errors like this in the executors:
> {code:java}
> 2018-06-13 10:27:52 INFO NettyBlockTransferService:54 - Server created on 172.29.3.4:44877
> 2018-06-13 10:27:52 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
> 2018-06-13 10:27:52 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(1, 172.29.3.4, 44877, None)
> 2018-06-13 10:27:52 ERROR CoarseGrainedExecutorBackend:91 - Executor self-exiting due to : Unable to create executor due to Exception thrown in awaitResult:
> org.apache.spark.SparkException: Exception thrown in awaitResult:
>     at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
>     at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
>     at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
>     at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
>     at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:241)
>     at org.apache.spark.executor.Executor.<init>(Executor.scala:116)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:83)
>     at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
>     at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
>     at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
>     at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:687)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1880)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1746)
> {code}
> To avoid that, Docker has to build without its cache, but only if you have built for an older version in the past...
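For illustration, building without the cache comes down to passing Docker's `--no-cache` option to `docker build`. The following is only a minimal sketch, not code from docker-image-tool.sh: the helper name `build_cmd`, its arguments, and the `Dockerfile` path are hypothetical, and the function prints the command line instead of running it.

```shell
# Hypothetical helper (not taken from docker-image-tool.sh): print the
# `docker build` command line, adding --no-cache when requested.
# Usage: build_cmd <image> <dockerfile> <no_cache: 0|1>
build_cmd() {
  image="$1"
  dockerfile="$2"
  no_cache="${3:-0}"
  # Reuse the function's positional parameters as the argument list.
  set -- docker build
  if [ "$no_cache" = "1" ]; then
    # --no-cache makes Docker rebuild every layer instead of reusing cached ones.
    set -- "$@" --no-cache
  fi
  set -- "$@" -t "$image" -f "$dockerfile" .
  echo "$*"
}

# Dry run: show what would be executed for a fresh build.
build_cmd node001:5000/brightcomputing/spark:v2.4.0 Dockerfile 1
```

Printing the command first makes it easy to confirm that `--no-cache` is really present before starting a long rebuild.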
> The second problem was that the spark container was pushed, but the spark-py container was not. This was just forgotten in the initial PR.
> (A third problem, which I ran into because I had an older Docker version, is covered by [https://github.com/apache/spark/pull/21551], so I have not included a fix for it in this ticket.)
> Other than that it works great!
> *Solution*
> I've added an extra flag so it's possible to call build with `-n` for `--no-cache`.
> And I've added the extra push for the spark-py container.
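As a rough sketch of how these two changes could fit together (this is not the code from the pull request), the snippet below parses `-r`, `-t`, and `-n` with `getopts` and prints one push command per image. The names `parse_opts`, `push_cmds`, `NOCACHEARG`, and `COMMAND`, as well as the `repo/image:tag` reference format, are assumptions made for illustration.

```shell
# Hypothetical option parsing: -n switches on --no-cache for the build step.
parse_opts() {
  REPO="" TAG="" NOCACHEARG=""
  OPTIND=1  # reset so the function can be called more than once
  while getopts "r:t:n" opt; do
    case "$opt" in
      r) REPO="$OPTARG" ;;
      t) TAG="$OPTARG" ;;
      n) NOCACHEARG="--no-cache" ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  COMMAND="${1:-}"  # remaining argument, e.g. build or push
}

# Print one push command per image so that spark-py is not skipped.
# The image reference format is an assumption.
push_cmds() {
  for image in spark spark-py; do
    echo "docker push $REPO/$image:$TAG"
  done
}

parse_opts -r docker.io/myrepo -t v2.3.0 -n build
echo "command=$COMMAND no-cache-arg=$NOCACHEARG"
push_cmds
```

Keeping the image names in a single loop means any image added later only needs to be listed once to be pushed as well.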
> *Example*
> {{./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.3.0 -n build}}
> Snippet from the help output:
> {code:java}
> Options:
>   -f file   Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
>   -p file   Dockerfile with Python baked in. By default builds the Dockerfile shipped with Spark.
>   -r repo   Repository address.
>   -t tag    Tag to apply to the built image, or to identify the image to be pushed.
>   -m        Use minikube's Docker daemon.
>   -n        Build docker image with --no-cache
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

