spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Single point of failure with Driver host crashing
Date Thu, 11 Aug 2016 19:58:05 GMT
Have you read
https://spark.apache.org/docs/latest/spark-standalone.html#high-availability ?

FYI
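In short, that page covers ZooKeeper-based standby Masters; pairing that
with --supervise in standalone cluster mode also covers the driver, since
the driver then runs inside the cluster and is relaunched if its host
dies. A minimal sketch, assuming a standalone cluster with two Masters
and a three-node ZooKeeper quorum (zk1/zk2/zk3, master1/master2 and the
application class/jar below are placeholders, not from your setup):

  # conf/spark-env.sh on every Master: enable ZooKeeper recovery so a
  # standby Master takes over if the active one dies
  export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
    -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
    -Dspark.deploy.zookeeper.dir=/spark"

  # Submit with --deploy-mode cluster and --supervise: the driver runs on
  # a worker and is restarted automatically if it exits non-zero or the
  # worker hosting it goes down
  spark-submit \
    --master spark://master1:7077,master2:7077 \
    --deploy-mode cluster \
    --supervise \
    --class com.example.YourApp \
    your-app.jar

Note this only applies to standalone cluster mode; in client mode the
driver still lives in the JVM that ran spark-submit, so that host remains
a single point of failure.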

On Thu, Aug 11, 2016 at 12:40 PM, Mich Talebzadeh
<mich.talebzadeh@gmail.com> wrote:

>
> Hi,
>
> Although Spark is fault tolerant when nodes go down, as shown below:
>
> FROM tmp
> [Stage 1:===========>                                           (20 + 10) / 100]
> 16/08/11 20:21:34 ERROR TaskSchedulerImpl: Lost executor 3 on
> xx.xxx.197.216: worker lost
> [Stage 1:========================>                               (44 + 8) / 100]
>
> it can carry on.
>
> However, when the host that the app was started on goes down, the job
> fails because the driver disappears as well. Is there a way to avoid
> this single point of failure, assuming what I am stating is valid?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
