spark-dev mailing list archives

From Timothy Chen <tnac...@gmail.com>
Subject Re: [Spark on mesos] Spark framework not re-registered and lost after mesos master restarted
Date Fri, 31 Mar 2017 03:33:42 GMT
Hi Yu,

As mentioned earlier, the Spark framework currently will not
re-register because failover_timeout is not set, and there is no
configuration option for it yet.
It's only enabled in MesosClusterScheduler, since that is meant to be
an HA framework.

We should add that configuration for users who want their Spark
frameworks to be able to fail over in case of master failover,
network disconnects, etc.
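Once such an option exists, it would presumably be one more line in spark-defaults.conf. The property name below is purely illustrative (no such setting exists in Spark 2.0.2); the value mirrors FrameworkInfo.failover_timeout in Mesos, which is expressed in seconds:

```
# Hypothetical knob -- a sketch of what a failover setting could look
# like, not an existing Spark property. Value in seconds, matching
# Mesos's FrameworkInfo.failover_timeout.
spark.mesos.driver.failoverTimeout  120
```

With a non-zero timeout, the Mesos master would keep the framework's ID alive for that window after a disconnect, giving the driver a chance to re-register instead of being torn down immediately.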

Tim

On Thu, Mar 30, 2017 at 8:25 PM, Yu Wei <yu2003w@hotmail.com> wrote:
> Hi Tim,
>
> I tested the scenario again with the settings below:
>
> [dcos@agent spark-2.0.2-bin-hadoop2.7]$ cat conf/spark-defaults.conf
> spark.deploy.recoveryMode  ZOOKEEPER
> spark.deploy.zookeeper.url 192.168.111.53:2181
> spark.deploy.zookeeper.dir /spark
> spark.executor.memory 512M
> spark.mesos.principal agent-dev-1
>
>
> However, the case still failed. After the master restarted, the Spark
> framework did not re-register.
> From the Spark framework log, it seems that the method below in
> MesosClusterScheduler was not called:
> override def reregistered(driver: SchedulerDriver, masterInfo: MasterInfo):
> Unit
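For readers following along, the shape of that callback can be sketched with stand-in stub types; SchedulerDriver and MasterInfo below are simplified stand-ins for the real org.apache.mesos classes (Java interfaces and protobufs in the mesos jar), not the actual API:

```scala
// Simplified stand-ins for the org.apache.mesos API types; the real
// ones are Java interfaces/protobufs shipped in the mesos jar.
trait SchedulerDriver
final case class MasterInfo(hostname: String, port: Int)

// A minimal scheduler that records whether the master ever invoked
// the re-registration callback after a reconnect.
class SketchScheduler {
  @volatile var reregisteredCalled = false

  // Mirrors the signature quoted above. The Mesos driver only invokes
  // this when the framework re-registers with a new master, which in
  // turn requires a non-zero failover_timeout in its FrameworkInfo.
  def reregistered(driver: SchedulerDriver, masterInfo: MasterInfo): Unit = {
    reregisteredCalled = true
  }
}

object ReregisterDemo extends App {
  val sched = new SketchScheduler
  sched.reregistered(new SchedulerDriver {}, MasterInfo("192.168.111.53", 5050))
  println(s"reregistered called: ${sched.reregisteredCalled}")
}
```

If the driver never fires this callback after a master restart, the framework was most likely removed rather than allowed to fail over, which is consistent with failover_timeout being unset.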
>
> Did I miss something? Any advice?
>
>
> Thanks,
>
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux
>
>
>
> ________________________________
> From: Timothy Chen <tnachen@gmail.com>
> Sent: Friday, March 31, 2017 5:13 AM
> To: Yu Wei
> Cc: users@spark.apache.org; dev
> Subject: Re: [Spark on mesos] Spark framework not re-registered and lost
> after mesos master restarted
>
> I think failover isn't enabled on the regular Spark job framework, since we
> assume jobs are more ephemeral.
>
> It could be a good setting to add to the Spark framework to enable failover.
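To make the semantics concrete, here is a toy model of the master's decision. FrameworkInfo below is a simplified stand-in for the real Mesos proto, keeping only the field this thread is about, and assumes the Mesos default failover_timeout of 0:

```scala
object Failover {
  // Simplified stand-in for the Mesos FrameworkInfo proto; only the
  // field relevant to this thread is modeled. Mesos defaults the
  // failover timeout to 0 seconds.
  final case class FrameworkInfo(name: String, failoverTimeoutSecs: Double = 0.0)

  // The master lets a disconnected framework re-register only while its
  // failover_timeout has not elapsed; with the default of 0 it is
  // removed immediately, matching the behavior Yu observed.
  def mayReregister(fw: FrameworkInfo, downtimeSecs: Double): Boolean =
    downtimeSecs < fw.failoverTimeoutSecs
}

object FailoverDemo extends App {
  import Failover._
  val regularJob  = FrameworkInfo("spark-job")                        // default 0.0
  val haFramework = FrameworkInfo("dispatcher", failoverTimeoutSecs = 120.0)
  println(mayReregister(regularJob, downtimeSecs = 5.0))   // false
  println(mayReregister(haFramework, downtimeSecs = 5.0))  // true
}
```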
>
> Tim
>
> On Mar 30, 2017, at 10:18 AM, Yu Wei <yu2003w@hotmail.com> wrote:
>
> Hi guys,
>
> I encountered a problem with Spark on Mesos.
>
> I set up a Mesos cluster and launched the Spark framework on Mesos successfully.
>
> Then the Mesos master was killed and started again.
>
> However, the Spark framework did not re-register the way the Mesos agent
> did. I also couldn't find any error logs.
>
> And MesosClusterDispatcher is still running there.
>
>
> I suspect this is a Spark framework issue.
>
> What's your opinion?
>
>
>
> Thanks,
>
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org

