spark-user mailing list archives

From 林晨 <bewit...@gmail.com>
Subject Running jobs on Spark which is build by myself fails :(
Date Fri, 10 Apr 2015 13:46:06 GMT
Hi all,
I am new to Spark.
For certain reasons I had to add some methods to Breeze 0.11.1, so I built
Breeze and published it to my local .m2 repository with the sbt command "sbt
publishM2". I then changed the Breeze dependency in the Spark 1.3.0 pom file
and built Spark with the maven command "mvn package". After that, I copied the
resulting spark-assembly-1.3.0-hadoop1.0.4.jar from the lib folder to my
cluster. I also put this jar into the lib folder of my project and packaged
the project with the sbt command "sbt assembly". When I submit my project to
the cluster, it throws errors as follows:

*Lost task 126.2 in stage 0.0 (TID 449) on executor sr476: java.lang.NoClassDefFoundError (Could not initialize class breeze.linalg.DenseVector$) [duplicate 200]*

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 0.0 failed 4 times, most recent failure: Lost task 4.3 in stage 0.0 (TID 446, sr476): java.lang.NoClassDefFoundError: Could not initialize class breeze.linalg.DenseVector$
        at org.apache.spark.mllib.bigds.ann.DatasetReader$$anonfun$readTrain$3.apply(DatasetReader.scala:61)
        at org.apache.spark.mllib.bigds.ann.DatasetReader$$anonfun$readTrain$3.apply(DatasetReader.scala:59)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:249)
        at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:172)
        at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:79)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

This problem has really exhausted me. I would greatly appreciate it if anyone
can help me :)
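For reference, the relevant part of my project's build definition looks roughly like this (an illustrative sketch rather than my exact file; the rebuilt spark-assembly jar sits in lib/ as an unmanaged dependency):

```scala
// build.sbt sketch (coordinates illustrative). The Breeze version must match
// what "sbt publishM2" published and what the rebuilt Spark pom references;
// spark-assembly-1.3.0-hadoop1.0.4.jar is picked up as an unmanaged jar in lib/.
resolvers += Resolver.mavenLocal
libraryDependencies += "org.scalanlp" %% "breeze" % "0.11.1"
```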

Mark Lin
