spark-issues mailing list archives

From "Andrew Or (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException
Date Tue, 28 Apr 2015 04:04:05 GMT

     [ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Andrew Or updated SPARK-7180:
-----------------------------
    Description: 
Simple reproduction:
{code}
class Parent extends Serializable {
  val a = "a"
  val b = "b"
}

class Child extends Parent with Serializable {
  val c = Array(1)
  val d = Array(2)
  val e = Array(3)
  val f = Array(4)
  val g = Array(5)
  val o = new Object
}

// ArrayOutOfBoundsException
SparkEnv.get.closureSerializer.newInstance().serialize(new Child)
{code}

I dug into this a little and found that we are trying to fill the fields of `Parent` with
the values of `Child`. See the following output, which I generated by adding printlns everywhere:
{code}
* Visiting object org.apache.spark.serializer.Child@2c3299f6 of type org.apache.spark.serializer.Child
  - Found 2 class data slot descriptions
  - Looking at desc #1: org.apache.spark.serializer.Parent: static final long serialVersionUID = 3254964199136071914L;
    - Found 2 fields
      - Ljava/lang/String; a
      - Ljava/lang/String; b
    - getObjFieldValues: 
      - [I@23faa614
      - [I@1cad7d80
      - [I@420a6d35
      - [I@3a87d472
      - [I@2b8ca663
      - java.lang.Object@1effc3eb
{code}
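The two slot descriptions in that output come straight from `java.io.ObjectStreamClass`, which builds one descriptor per class in the hierarchy, each listing only the fields that class itself declares. A standalone Java sketch (class names mirror the repro above; this is illustrative, not Spark code) shows the field-count mismatch the debugger trips over:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SlotDemo {
    static class Parent implements Serializable {
        String a = "a";
        String b = "b";
    }

    static class Child extends Parent {
        int[] c = {1};
        int[] d = {2};
        int[] e = {3};
        int[] f = {4};
        int[] g = {5};
        Object o = new Object();
    }

    public static void main(String[] args) {
        // One slot description per class in the hierarchy, each covering
        // only the fields declared by that class.
        ObjectStreamClass parentDesc = ObjectStreamClass.lookup(Parent.class);
        ObjectStreamClass childDesc = ObjectStreamClass.lookup(Child.class);
        System.out.println("Parent slot fields: " + parentDesc.getFields().length); // 2
        System.out.println("Child slot fields:  " + childDesc.getFields().length);  // 6
        // Reading the child's six values through the parent's two-field
        // descriptor is what overruns the array in the debugger.
    }
}
```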
SerializationDebugger#visitSerializable found the two fields that belong to `Parent`, but it
tried to cram the child's six values into those two fields. This mismatch in the number of fields
is what throws the ArrayOutOfBoundsException. The culprit is this line: https://github.com/apache/spark/blob/4d9e560b5470029143926827b1cb9d72a0bfbeff/core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala#L150,
which runs reflection on the object `Child` even when it is considering the description for
`Parent`.
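A fix along these lines would read values slot by slot, asking each declaring class only for the fields its own descriptor lists. A minimal Java sketch of that idea (the `fieldValues` helper is hypothetical, not Spark's actual patch; `Child` is trimmed to two fields for brevity):

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

public class PerSlotRead {
    static class Parent implements Serializable { String a = "a"; String b = "b"; }
    static class Child extends Parent { int[] c = {1}; Object o = new Object(); }

    // Hypothetical helper: collect field values per slot description,
    // reflecting on the class that declares each field rather than on
    // the concrete class of the object being visited.
    static Map<String, Object> fieldValues(Object obj) throws ReflectiveOperationException {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Class<?> cl = obj.getClass();
             cl != null && Serializable.class.isAssignableFrom(cl);
             cl = cl.getSuperclass()) {
            ObjectStreamClass desc = ObjectStreamClass.lookup(cl);
            for (ObjectStreamField f : desc.getFields()) {
                Field rf = cl.getDeclaredField(f.getName());
                rf.setAccessible(true);
                out.put(cl.getSimpleName() + "." + f.getName(), rf.get(obj));
            }
        }
        return out;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // Each slot yields only the fields it declares, so the counts
        // always match and no array is overrun.
        System.out.println(fieldValues(new Child()).keySet());
    }
}
```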

I ran into this when trying to serialize a test suite that extends `FunSuite` (don't ask why).



> SerializationDebugger fails with ArrayOutOfBoundsException
> ----------------------------------------------------------
>
>                 Key: SPARK-7180
>                 URL: https://issues.apache.org/jira/browse/SPARK-7180
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Andrew Or
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

