spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dirceu Semighini Filho <dirceu.semigh...@gmail.com>
Subject Re: ClassNotFoundException in RDD.map
Date Thu, 17 Mar 2016 17:25:20 GMT
Hi Ted, thanks for answering.
The map is just that, whenever I try inside the map it throws this
ClassNotFoundException, even if I do map(f => f) it throws the exception.
What is bothering me is that when I do a take or a first it returns the
result, which make me conclude that the previous code isn't wrong.

Kind Regards,
Dirceu

2016-03-17 12:50 GMT-03:00 Ted Yu <yuzhihong@gmail.com>:

> bq. $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
>
> Do you mind showing more of your code involving the map() ?
>
> On Thu, Mar 17, 2016 at 8:32 AM, Dirceu Semighini Filho <
> dirceu.semighini@gmail.com> wrote:
>
>> Hello,
>> I found a strange behavior after executing a prediction with MLIB.
>> My code return an RDD[(Any,Double)] where Any is the id of my dataset,
>> which is BigDecimal, and Double is the prediction for that line.
>> When I run
>> myRdd.take(10) it returns ok
>> res16: Array[_ >: (Double, Double) <: (Any, Double)] =
>> Array((1921821857196754403.000000000000000000,0.1690292052496703),
>> (4545756323742222427.000000000000000000,0.16902820241892452),
>> (989198096568001939.000000000000000000,0.16903432789699502),
>> (14284129652106187990.000000000000000000,0.16903517653451386),
>> (17980228074225252497.000000000000000000,0.16903151028332508),
>> (3861345958263692781.000000000000000000,0.16903056986183976),
>> (17558198701997383205.000000000000000000,0.1690295450319745),
>> (10651576092054552310.000000000000000000,0.1690286445174418),
>> (4534494349035056215.000000000000000000,0.16903303401862327),
>> (5551671513234217935.000000000000000000,0.16902303368995966))
>> But when I try to run some map on it:
>> myRdd.map(_._1).take(10)
>> It throws a ClassCastException:
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
>> in stage 72.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>> 72.0 (TID 1774, 172.31.23.208): java.lang.ClassNotFoundException:
>> $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:278)
>> at
>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>> at
>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>> at
>> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:88)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>> at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
>> at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1302)
>> at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>> at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>> at org.apache.spark.rdd.RDD.take(RDD.scala:1276)
>> at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1316)
>> at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>> at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>> at org.apache.spark.rdd.RDD.first(RDD.scala:1315)
>> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
>> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:39)
>> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:41)
>> at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:43)
>> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
>> at $iwC$$iwC$$iwC.<init>(<console>:47)
>> at $iwC$$iwC.<init>(<console>:49)
>> at $iwC.<init>(<console>:51)
>> at <init>(<console>:53)
>> at .<init>(<console>:57)
>> at .<clinit>(<console>)
>> at .<init>(<console>:7)
>> at .<clinit>(<console>)
>> at $print(<console>)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:497)
>> at
>> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>> at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
>> at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>> at
>> org.apache.zeppelin.spark.IUberSparkInterpreter.interpretInput(IUberSparkInterpreter.java:865)
>> at
>> org.apache.zeppelin.spark.IUberSparkInterpreter.interpret(IUberSparkInterpreter.java:830)
>> at
>> org.apache.zeppelin.spark.IUberSparkInterpreter.interpret(IUberSparkInterpreter.java:818)
>> at
>> org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
>> at
>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>> at
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:331)
>> at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
>> at
>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ClassNotFoundException:
>> $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:278)
>> at
>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>> at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>> at
>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>> at
>> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:88)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> ... 1 more
>>
>> I'm using spark 1.5.2 compiled with hadoop 2.6.
>> Does anybody know how to avoid this exception?
>>
>> Kind Regards,
>> Dirceu
>>
>
>

Mime
View raw message