Retrying what? I want to know why it died and what I can do to prevent it.

On Wed, Oct 14, 2015 at 5:20 PM, Raghavendra Pandey <raghavendra.pandey@gmail.com> wrote:

I fixed these timeout errors by retrying...
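For illustration only, here is a minimal sketch of what "retrying" could look like around whatever launches the job. It is not taken from this thread: the helper name `retry`, its parameters, and the exponential backoff policy are all assumptions, and it only works around a transient timeout rather than explaining why the master was unresponsive.

```python
import time


def retry(action, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Hypothetical helper: the name, the defaults, and the backoff policy
    are illustrative assumptions, not part of Spark or Akka.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Here `action` would be any zero-argument function that starts the nightly job and raises on failure. Passing `sleep` as a parameter keeps the helper easy to test without real waiting.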

On Oct 15, 2015 3:41 AM, "Kartik Mathur" <kartik@bluedata.com> wrote:
Hi,

I have some jobs that run every night but sometimes die because of an unresponsive master. What could possibly cause an exception like the one below?

The Spark master log shows the following, and I am not seeing much else there:

Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]

at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)

at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)

at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

at scala.concurrent.Await$.result(package.scala:107)

at akka.remote.Remoting.start(Remoting.scala:180)

at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)

at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)

at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)

at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)

at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)

at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)

2015-10-14 05:43:04 ERROR Remoting:65 - Remoting error: [Startup timed out] [

akka.remote.RemoteTransportException: Startup timed out

at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:136)

at akka.remote.Remoting.start(Remoting.scala:198)

at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)

at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:618)

at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:615)

at akka.actor.ActorSystemImpl._start(ActorSystem.scala:615)

at akka.actor.ActorSystemImpl.start(ActorSystem.scala:632)

at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)

at akka.actor.ActorSystem$.apply(ActorSystem.scala:118)

at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:122)

at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:55)

at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)

at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1837)

at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)

at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1828)

at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:57)

at org.apache.spark.deploy.master.Master$.startSystemAndActor(Master.scala:906)

at org.apache.spark.deploy.master.Master$.main(Master.scala:869)

at org.apache.spark.deploy.master.Master.main(Master.scala)

Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]

at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)

at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)

at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

at scala.concurrent.Await$.result(package.scala:107)

at akka.remote.Remoting.start(Remoting.scala:180)

... 17 more