spark-user mailing list archives

From Andrea Esposito <and1...@gmail.com>
Subject Re: KryoSerializer Exception
Date Fri, 30 May 2014 13:50:12 GMT
Hi,

I just migrated to 1.0 and still have the same issue.

It happens both with and without the custom registrator: merely using the
KryoSerializer triggers the exception immediately.

I set the Kryo settings through system properties:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "it.unipi.thesis.andrea.esposito.onjag.test.TestKryoRegistrator")

The registrator is just a sequence of:
kryo.register(classOf[MyClass])
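
For reference, a minimal sketch of this kind of setup. It is an illustration under assumptions, not the actual code from the project: `MyClass` is a placeholder for the real application classes, `KryoSetup` and the application name are hypothetical, and the two settings are passed through `SparkConf` rather than `System.setProperty`.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Placeholder for the application classes (hypothetical).
class MyClass(val id: Int) extends Serializable

// The registrator: one kryo.register call per class to be serialized.
class TestKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[MyClass])
  }
}

// Hypothetical helper that builds a configuration with the same two
// settings as the System.setProperty calls above.
object KryoSetup {
  def conf(): SparkConf =
    new SparkConf()
      .setAppName("kryo-test") // assumed name, for illustration only
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", classOf[TestKryoRegistrator].getName)
}
```

The resulting configuration would be passed to `new SparkContext(KryoSetup.conf())`; the registrator class needs a no-argument constructor so that Spark can instantiate it by name.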

I also tried with a very small RDD (a few MB of serialized data) and the
problem still occurs.

The problem seems to be related to broadcast, but I'm completely stuck.

The complete log follows:

2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 5 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Loss was due to java.io.EOFException
java.io.EOFException
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:119)
    at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:205)
    at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:135)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 4 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 6 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 7 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 8 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 9 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 10 (task 3.0:1)
2014-05-30 15:47:36 ERROR TaskSetManager:74 - Task 3.0:1 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3.0:1 failed 4 times, most recent failure: Exception failure in TID 10 on host Andrea-Laptop.unipi.it: java.io.EOFException
        org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:119)
        org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:205)
        org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:89)
        sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:606)
        java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
        org.apache.spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:63)
        org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:135)
        java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


2014-05-27 16:25 GMT+02:00 jaranda <jordi.aranda@bsc.es>:

> I am experiencing the same issue (I tried both using Kryo as serializer and
> increasing the buffer size up to 256M, my objects are much smaller though).
> I share my registrator class just in case:
>
> https://gist.github.com/JordiAranda/5cc16cf102290c413c82
>
> Any hints would be highly appreciated.
>
> Thanks,
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/KryoSerializer-Exception-tp5435p6428.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
