predictionio-user mailing list archives

From Donald Szeto <don...@apache.org>
Subject Re: Spark Executor lost error
Date Thu, 19 Jan 2017 17:44:08 GMT
Do you have more detailed logs from the Spark executors?

Regards,
Donald

On Wed, Dec 14, 2016 at 3:26 AM Bansari Shah <bansari.jan93@gmail.com>
wrote:

> Hi all,
>
> I am trying to use the Spark environment in the predict function of an ML
> engine for text analysis. The algorithm extends P2LAlgorithm. The system
> runs on a standalone cluster.
>
> The predict function for a new query is as follows:
>
> override def predict(model: NaiveBayesModel, query: Query): PredictedResult = {
>   val sc_new = SparkContext.getOrCreate()
>   val sqlContext = SQLContext.getOrCreate(sc_new)
>   val phraseDataframe = sqlContext.createDataFrame(Seq(query)).toDF("text")
>   val dpObj = new DataPreparator
>   val tf = dpObj.processPhrase(phraseDataframe)
>
>   tf.show()
>
>   val labeledpoints = tf.map(row => row.getAs[Vector]("rowFeatures"))
>   val predictedResult = model.predict(labeledpoints)
>   return predictedResult
> }
>
>
> It trains properly with pio train, and once deployed it also predicts
> results properly for a single query.
>
> But with pio eval, when I try to check the accuracy of the model, it runs
> up to tf.show() properly. When it reaches the statement that forms the
> labeled points, it gets stuck, and after a long wait it reports that the
> Spark executor was lost and no heartbeat was received. Here is the error log:
>
> WARN org.apache.spark.HeartbeatReceiver
> [sparkDriver-akka.actor.default-dispatcher-14] - Removing executor driver
> with no recent heartbeats: 686328 ms exceeds timeout 120000 ms
>
> ERROR org.apache.spark.scheduler.TaskSchedulerImpl
> [sparkDriver-akka.actor.default-dispatcher-14] - Lost executor driver on
> localhost: Executor heartbeat timed out after 686328 ms
>
> WARN org.apache.spark.scheduler.TaskSetManager
> [sparkDriver-akka.actor.default-dispatcher-14] - Lost task 3.0 in stage
> 103.0 (TID 237, localhost): ExecutorLostFailure (executor driver lost)
>
> ERROR org.apache.spark.scheduler.TaskSetManager
> [sparkDriver-akka.actor.default-dispatcher-14] - Task 3 in stage 103.0
> failed 1 times; aborting job
> ......
> org.apache.spark.SparkException: Job cancelled because SparkContext was
> shut down
>
> Please suggest how I can solve this issue.
>
> Thank you.
> Regards,
> Bansari Shah
