spark-dev mailing list archives

From Debasish Das <debasish.da...@gmail.com>
Subject Re: Lost executor on YARN ALS iterations
Date Wed, 20 Aug 2014 07:27:39 GMT
I could reproduce the issue in both 1.0 and 1.1 using YARN, so this is
definitely a YARN-related problem.

For me, at least, the only workable deployment option right now is standalone.



On Tue, Aug 19, 2014 at 11:29 PM, Xiangrui Meng <mengxr@gmail.com> wrote:

> Hi Deb,
>
> I think this may be the same issue as described in
> https://issues.apache.org/jira/browse/SPARK-2121 . We know that the
> container got killed by YARN because it used much more memory than it
> requested. But we haven't figured out the root cause yet.
>
> +Sandy
>
> Best,
> Xiangrui
>
> On Tue, Aug 19, 2014 at 8:51 PM, Debasish Das <debasish.das83@gmail.com>
> wrote:
> > Hi,
> >
> > During the 4th ALS iteration, I am noticing that one of the executors
> > gets disconnected:
> >
> > 14/08/19 23:40:00 ERROR network.ConnectionManager: Corresponding SendingConnectionManagerId not found
> >
> > 14/08/19 23:40:00 INFO cluster.YarnClientSchedulerBackend: Executor 5 disconnected, so removing it
> >
> > 14/08/19 23:40:00 ERROR cluster.YarnClientClusterScheduler: Lost executor 5 on tblpmidn42adv-hdp.tdc.vzwcorp.com: remote Akka client disassociated
> >
> > 14/08/19 23:40:00 INFO scheduler.DAGScheduler: Executor lost: 5 (epoch 12)
> >
> > Any idea if this is a bug related to Akka on YARN?
> >
> > I am using master
> >
> > Thanks.
> > Deb
>
