spark-dev mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: Lost executor on YARN ALS iterations
Date Wed, 20 Aug 2014 06:29:31 GMT
Hi Deb,

I think this may be the same issue as described in
https://issues.apache.org/jira/browse/SPARK-2121 . We know that the
container got killed by YARN because it used much more memory than it
requested. But we haven't figured out the root cause yet.

+Sandy

Best,
Xiangrui

On Tue, Aug 19, 2014 at 8:51 PM, Debasish Das <debasish.das83@gmail.com> wrote:
> Hi,
>
> During the 4th ALS iteration, I am noticing that one of the executors gets
> disconnected:
>
> 14/08/19 23:40:00 ERROR network.ConnectionManager: Corresponding
> SendingConnectionManagerId not found
>
> 14/08/19 23:40:00 INFO cluster.YarnClientSchedulerBackend: Executor 5
> disconnected, so removing it
>
> 14/08/19 23:40:00 ERROR cluster.YarnClientClusterScheduler: Lost executor 5
> on tblpmidn42adv-hdp.tdc.vzwcorp.com: remote Akka client disassociated
>
> 14/08/19 23:40:00 INFO scheduler.DAGScheduler: Executor lost: 5 (epoch 12)
>
> Any idea if this is a bug related to Akka on YARN?
>
> I am using master
>
> Thanks.
> Deb
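
For readers following the thread, the failure mode named in the reply (YARN killing a container that uses more memory than it requested) can be illustrated with a rough sketch. This is not code from Spark or YARN: the helper names and all numbers below are hypothetical, and the thread itself states that the root cause of the excess usage is not yet known. Spark on YARN does expose a `spark.yarn.executor.memoryOverhead` setting for the off-heap allowance modeled here, but whether adjusting it addresses this particular case is an assumption the thread does not confirm.

```python
# Illustrative sketch only: helper names and numbers are hypothetical,
# not taken from the thread or from Spark/YARN source code.

def container_request_mb(executor_memory_mb: int, overhead_mb: int) -> int:
    """Memory requested from YARN: JVM heap plus an off-heap allowance."""
    return executor_memory_mb + overhead_mb


def would_be_killed(physical_usage_mb: int, request_mb: int) -> bool:
    """Model of the check: a container whose physical memory usage
    exceeds what it requested is killed."""
    return physical_usage_mb > request_mb


# Hypothetical executor: 8 GB heap plus a 384 MB off-heap allowance.
request = container_request_mb(executor_memory_mb=8192, overhead_mb=384)

# If the process grows past the request (for example through off-heap
# buffers), the container is killed and the driver sees a lost executor.
print(request)
print(would_be_killed(physical_usage_mb=9000, request_mb=request))
```

In this simplified model the driver-side symptom would match the log above: the executor simply disappears, and the reason for the kill has to be read from the YARN NodeManager logs rather than from the Spark driver output.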
