spark-user mailing list archives

From Silvio Fiorito <silvio.fior...@granturing.com>
Subject Re: Submit Spark application in cluster mode and supervised
Date Fri, 08 May 2015 18:34:27 GMT
If you’re using multiple masters with ZooKeeper, then you should set your master URL to be

spark://host01:7077,host02:7077

And the property spark.deploy.recoveryMode=ZOOKEEPER

See here for more info: http://spark.apache.org/docs/latest/spark-standalone.html#standby-masters-with-zookeeper
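To make the failover concrete, here is a minimal sketch of the standalone-with-ZooKeeper setup the docs describe. The ZooKeeper quorum address (zk01:2181) is a placeholder; host01/host02 come from the thread below.

```shell
# On each Master host, enable ZooKeeper-based recovery before starting the
# Master, e.g. via SPARK_DAEMON_JAVA_OPTS in conf/spark-env.sh.
# zk01:2181 is a hypothetical ZooKeeper quorum address.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk01:2181"

# Submit against BOTH masters so the client (and a supervised driver)
# can fail over to whichever Master is currently active:
bin/spark-submit --class SomeApp \
  --deploy-mode cluster --supervise \
  --master spark://host01:7077,host02:7077 \
  Some.jar
```

With only a single host in the master URL, a restarted application can only find a Master on that host, which is consistent with the behavior described below.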

From: James King
Date: Friday, May 8, 2015 at 11:22 AM
To: user
Subject: Submit Spark application in cluster mode and supervised

I have two hosts (let's call them host01 and host02).

I run one Master and two Workers on host01
I also run one Master and two Workers on host02

Now I have one LIVE Master on host01 and a STANDBY Master on host02.
The LIVE Master is aware of all Workers in the cluster.

Now I submit a Spark application using

bin/spark-submit --class SomeApp --deploy-mode cluster --supervise --master spark://host01:7077 Some.jar

This is to make the driver resilient to failure.

Now the interesting part:

If I stop the cluster (all daemons on all hosts) and restart the Master and Workers only on
host01, the job resumes, as expected.

But if I stop the cluster (all daemons on all hosts) and restart the Master and Workers only
on host02, the job does not resume execution. Why?

I can see the driver listed on the host02 WebUI, but there is no job execution. Please let me know why.

Am I wrong to expect it to resume execution in this case?
