From: "Zhichao Zhang (JIRA)"
To: issues@spark.apache.org
Date: Thu, 30 Aug 2018 07:37:00 +0000 (UTC)
Subject: [jira] [Created] (SPARK-25279) Throw exception: zzcclp java.io.NotSerializableException: org.apache.spark.sql.TypedColumn in Spark-shell when run example of doc

Zhichao Zhang created SPARK-25279:
--------------------------------------

             Summary: Throw exception: zzcclp java.io.NotSerializableException: org.apache.spark.sql.TypedColumn in Spark-shell when run example of doc
                 Key: SPARK-25279
                 URL: https://issues.apache.org/jira/browse/SPARK-25279
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell, SQL
    Affects Versions: 2.2.1
            Reporter: Zhichao Zhang

Hi dev:

I am using spark-shell to run the example from the section
[http://spark.apache.org/docs/2.2.2/sql-programming-guide.html#type-safe-user-defined-aggregate-functions],
and it fails with the following error:

{code:java}
Caused by: java.io.NotSerializableException: org.apache.spark.sql.TypedColumn
Serialization stack:
        - object not serializable (class: org.apache.spark.sql.TypedColumn, value: myaverage() AS `average_salary`)
        - field (class: $iw, name: averageSalary, type: class org.apache.spark.sql.TypedColumn)
        - object (class $iw, $iw@4b2f8ae9)
        - field (class: MyAverage$, name: $outer, type: class $iw)
        - object (class MyAverage$, MyAverage$@2be41d90)
        - field (class: org.apache.spark.sql.execution.aggregate.ComplexTypedAggregateExpression, name: aggregator, type: class org.apache.spark.sql.expressions.Aggregator)
        - object (class org.apache.spark.sql.execution.aggregate.ComplexTypedAggregateExpression, MyAverage(Employee))
        - field (class: org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression, name: aggregateFunction, type: class org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction)
        - object (class org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression, partial_myaverage(MyAverage$@2be41d90, Some(newInstance(class Employee)), Some(class Employee), Some(StructType(StructField(name,StringType,true), StructField(salary,LongType,false))), assertnotnull(assertnotnull(input[0, Average, true])).sum AS sum#25L, assertnotnull(assertnotnull(input[0, Average, true])).count AS count#26L, newInstance(class Average), input[0, double, false] AS value#24, DoubleType, false, 0, 0))
        - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
        - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@5e92c46f)
        - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
        - object (class scala.collection.immutable.$colon$colon, List(partial_myaverage(MyAverage$@2be41d90, Some(newInstance(class Employee)), Some(class Employee), Some(StructType(StructField(name,StringType,true), StructField(salary,LongType,false))), assertnotnull(assertnotnull(input[0, Average, true])).sum AS sum#25L, assertnotnull(assertnotnull(input[0, Average, true])).count AS count#26L, newInstance(class Average), input[0, double, false] AS value#24, DoubleType, false, 0, 0)))
        - field (class: org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec, name: aggregateExpressions, type: interface scala.collection.Seq)
        - object (class org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec, ObjectHashAggregate(keys=[], functions=[partial_myaverage(MyAverage$@2be41d90, Some(newInstance(class Employee)), Some(class Employee), Some(StructType(StructField(name,StringType,true), StructField(salary,LongType,false))), assertnotnull(assertnotnull(input[0, Average, true])).sum AS sum#25L, assertnotnull(assertnotnull(input[0, Average, true])).count AS count#26L, newInstance(class Average), input[0, double, false] AS value#24, DoubleType, false, 0, 0)], output=[buf#37])
+- *FileScan json [name#8,salary#9L] Batched: false, Format: JSON, Location: InMemoryFileIndex[file:/opt/spark2/examples/src/main/resources/employees.json], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<name:string,salary:bigint>
)
        - field (class: org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1, name: $outer, type: class org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec)
        - object (class org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1, )
        - field (class: org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2, name: $outer, type: class org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1)
        - object (class org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2, )
        - field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1, name: f$23, type: interface scala.Function1)
        - object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1, )
        - field (class: org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25, name: $outer, type: class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1)
        - object (class org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25, )
        - field (class: org.apache.spark.rdd.MapPartitionsRDD, name: f, type: interface scala.Function3)
        - object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[9] at show at <console>:62)
        - field (class: org.apache.spark.NarrowDependency, name: _rdd, type: class org.apache.spark.rdd.RDD)
        - object (class org.apache.spark.OneToOneDependency, org.apache.spark.OneToOneDependency@5bb7895)
        - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
        - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@6e81dca3)
        - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
        - object (class scala.collection.immutable.$colon$colon, List(org.apache.spark.OneToOneDependency@5bb7895))
        - field (class: org.apache.spark.rdd.RDD, name: org$apache$spark$rdd$RDD$$dependencies_, type: interface scala.collection.Seq)
        - object (class org.apache.spark.rdd.MapPartitionsRDD, MapPartitionsRDD[10] at show at <console>:62)
        - field (class: scala.Tuple2, name: _1, type: class java.lang.Object)
        - object (class scala.Tuple2, (MapPartitionsRDD[10] at show at <console>:62,org.apache.spark.ShuffleDependency@421cd28))
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
{code}
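For reference, the code I am running in the shell is essentially the following, paraphrased from the linked guide (the employees.json path is the one used in the guide; in the trace above it resolves to /opt/spark2/examples/src/main/resources/employees.json on my machine):

{code:scala}
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Employee(name: String, salary: Long)
case class Average(var sum: Long, var count: Long)

object MyAverage extends Aggregator[Employee, Average, Double] {
  // A zero value for this aggregation; should satisfy b + zero = b
  def zero: Average = Average(0L, 0L)
  // Fold one input row into the running buffer
  def reduce(buffer: Average, employee: Employee): Average = {
    buffer.sum += employee.salary
    buffer.count += 1
    buffer
  }
  // Merge two intermediate buffers
  def merge(b1: Average, b2: Average): Average = {
    b1.sum += b2.sum
    b1.count += b2.count
    b1
  }
  // Transform the final buffer into the output value
  def finish(reduction: Average): Double = reduction.sum.toDouble / reduction.count
  def bufferEncoder: Encoder[Average] = Encoders.product
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

val ds = spark.read.json("examples/src/main/resources/employees.json").as[Employee]
// The named TypedColumn ends up as a val on the REPL wrapper object
val averageSalary = MyAverage.toColumn.name("average_salary")
ds.select(averageSalary).show()
{code}

The exception is thrown at the final show(), which matches the "at show at <console>:62" entries in the trace.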
But if I run the example directly from IDEA, it works. What is the difference between the two environments, and how can I run the example successfully in spark-shell?

Thanks.
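The $outer entries in the trace suggest what differs: spark-shell compiles each snippet inside synthetic wrapper objects (the $iw instances above), so an object MyAverage defined at the prompt carries an $outer reference to a wrapper that also holds the non-serializable averageSalary TypedColumn val, and the whole chain is dragged into task serialization. An application compiled in IDEA has MyAverage as a true top-level object with no such reference. One commonly suggested workaround, sketched below and not verified against this exact report, is to compile the aggregator outside the REPL wrappers with :paste -raw (assumes a Spark 2.2.x shell on Scala 2.11; the package name "example" is arbitrary):

{code:scala}
// Workaround sketch: compile MyAverage as real top-level code, outside the
// REPL's $iw wrappers, so MyAverage$ has no $outer field to serialize.

scala> :paste -raw
// Entering paste mode (ctrl-D to finish)

package example

import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

case class Employee(name: String, salary: Long)
case class Average(var sum: Long, var count: Long)

object MyAverage extends Aggregator[Employee, Average, Double] {
  def zero: Average = Average(0L, 0L)
  def reduce(b: Average, e: Employee): Average = { b.sum += e.salary; b.count += 1; b }
  def merge(b1: Average, b2: Average): Average = { b1.sum += b2.sum; b1.count += b2.count; b1 }
  def finish(r: Average): Double = r.sum.toDouble / r.count
  def bufferEncoder: Encoder[Average] = Encoders.product
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

// Back at the regular prompt, use the compiled object:
scala> import example._
scala> val ds = spark.read.json("examples/src/main/resources/employees.json").as[Employee]
scala> ds.select(MyAverage.toColumn.name("average_salary")).show()
{code}

Packaging the aggregator in a small jar passed via --jars should avoid the wrapper capture in the same way.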