Date: Tue, 28 Apr 2015 04:04:05 +0000 (UTC)
From: "Andrew Or (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Updated] (SPARK-7180) SerializationDebugger fails with ArrayOutOfBoundsException

     [ https://issues.apache.org/jira/browse/SPARK-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-7180:
-----------------------------
    Description:
Simple reproduction:
{code}
class Parent extends Serializable {
  val a = "a"
  val b = "b"
}

class Child extends Parent with Serializable {
  val c = Array(1)
  val d = Array(2)
  val e = Array(3)
  val f = Array(4)
  val g = Array(5)
  val o = new Object
}

// ArrayOutOfBoundsException
SparkEnv.get.closureSerializer.newInstance().serialize(new Child)
{code}

I dug into this a little and found that we are trying to fill the fields of `Parent` with the values of `Child`. See the following output, which I generated by adding printlns everywhere:

{code}
* Visiting object org.apache.spark.serializer.Child@2c3299f6 of type org.apache.spark.serializer.Child
 - Found 2 class data slot descriptions
 - Looking at desc #1: org.apache.spark.serializer.Parent: static final long serialVersionUID = 3254964199136071914L;
 - Found 2 fields
 - Ljava/lang/String; a
 - Ljava/lang/String; b
 - getObjFieldValues:
 - [I@23faa614
 - [I@1cad7d80
 - [I@420a6d35
 - [I@3a87d472
 - [I@2b8ca663
 - java.lang.Object@1effc3eb
{code}

SerializationDebugger#visitSerializable found the two fields that belong to the parent, but it tried to cram the child's six values into those two fields. This mismatch in the number of fields is what throws the ArrayOutOfBoundsException. The culprit is this line: https://github.com/apache/spark/blob/4d9e560b5470029143926827b1cb9d72a0bfbeff/core/src/main/scala/org/apache/spark/serializer/SerializationDebugger.scala#L150, which runs reflection on the object `Child` even when it is considering the description for `Parent`.

I ran into this when trying to serialize a test suite that extends `FunSuite` (don't ask why).
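To make the mismatch concrete, here is a small standalone sketch (this is not the SerializationDebugger code; the classes `P` and `C` and the object `FieldMismatchDemo` are made-up stand-ins). Each `java.io.ObjectStreamClass` descriptor lists only the serializable fields declared by its own class, so a buffer sized from the parent's descriptor cannot hold the values reflected off the child's runtime class:

{code}
import java.io.ObjectStreamClass

// Made-up stand-ins for Parent/Child: P declares 2 serializable fields, C declares 3.
class P extends Serializable {
  val a = "a"
  val b = "b"
}

class C extends P {
  val x = Array(1)
  val y = Array(2)
  val z = new Object
}

object FieldMismatchDemo {
  def main(args: Array[String]): Unit = {
    val parentDesc = ObjectStreamClass.lookup(classOf[P])

    // Each ObjectStreamClass describes only the fields declared by its own class.
    println(s"P declares ${parentDesc.getFields.length} fields")                           // 2
    println(s"C declares ${ObjectStreamClass.lookup(classOf[C]).getFields.length} fields") // 3

    // Reflecting on the instance's runtime class yields all of C's own declared
    // fields, regardless of which descriptor is currently being visited.
    val obj = new C
    val childValues: Array[AnyRef] =
      classOf[C].getDeclaredFields.map { f => f.setAccessible(true); f.get(obj) }

    // Simulate the bug: copy the child's 3 values into a buffer sized for the
    // parent's 2 fields -> throws java.lang.ArrayIndexOutOfBoundsException.
    val parentBuffer = new Array[AnyRef](parentDesc.getFields.length)
    System.arraycopy(childValues, 0, parentBuffer, 0, childValues.length)
  }
}
{code}

In other words, the field values have to be extracted per class data slot; taking them from the runtime class while visiting the parent's descriptor is exactly the mix-up described above.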
> SerializationDebugger fails with ArrayOutOfBoundsException
> -----------------------------------------------------------
>
>                 Key: SPARK-7180
>                 URL: https://issues.apache.org/jira/browse/SPARK-7180
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Andrew Or
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org