Date: Fri, 20 Nov 2015 22:03:11 +0000 (UTC)
From: "Xuefu Zhang (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Commented] (HIVE-12045) ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)

    [ https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018886#comment-15018886 ]

Xuefu Zhang commented on HIVE-12045:
------------------------------------

+1 to latest patch.

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-12045
>                 URL: https://issues.apache.org/jira/browse/HIVE-12045
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>         Environment: Cloudera QuickStart VM - CDH5.4.2
>                      beeline
>            Reporter: Zsolt Tóth
>            Assignee: Rui Li
>         Attachments: HIVE-12045.1-spark.patch, HIVE-12045.2-spark.patch, HIVE-12045.2-spark.patch, HIVE-12045.3-spark.patch, HIVE-12045.4-spark.patch, HIVE-12045.patch, example.jar, genUDF.patch, hive.log.gz
>
>
> If I execute the following queries in beeline, I get a ClassNotFoundException for the UDF class:
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf just looks for the first argument's value among the remaining arguments and returns its index. I don't think the problem is related to the particular GenericUDF implementation.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" (without distinct) succeeds.
> If I use the non-generic UDF implementation of the same function, the select distinct query also succeeds.
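> For illustration, a minimal GenericUDF with the described behavior could look like the sketch below. This is my reconstruction, not the code in the attached example.jar; the class name follows the casing in the stack trace, and I am assuming "returns the index" means the 1-based position of the first argument's value among the remaining arguments, with -1 when it is absent.
> {code}
> package org.example;
>
> import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
> import org.apache.hadoop.hive.ql.metadata.HiveException;
> import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
> import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
> import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
> import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
> import org.apache.hadoop.io.IntWritable;
>
> public class myGenericUDF extends GenericUDF {
>   private transient ObjectInspector[] argOIs;
>   private final IntWritable result = new IntWritable();
>
>   @Override
>   public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
>     if (arguments.length < 2) {
>       throw new UDFArgumentException("myGenericUDF expects at least two arguments");
>     }
>     argOIs = arguments;
>     return PrimitiveObjectInspectorFactory.writableIntObjectInspector;
>   }
>
>   @Override
>   public Object evaluate(DeferredObject[] arguments) throws HiveException {
>     // Compare the first argument's value against each of the others.
>     for (int i = 1; i < arguments.length; i++) {
>       if (ObjectInspectorUtils.compare(arguments[0].get(), argOIs[0],
>           arguments[i].get(), argOIs[i]) == 0) {
>         result.set(i); // 1-based position among the remaining arguments
>         return result;
>       }
>     }
>     result.set(-1); // value not found among the remaining arguments
>     return result;
>   }
>
>   @Override
>   public String getDisplayString(String[] children) {
>     StringBuilder sb = new StringBuilder("myGenericUDF(");
>     for (int i = 0; i < children.length; i++) {
>       if (i > 0) sb.append(", ");
>       sb.append(children[i]);
>     }
>     return sb.append(")").toString();
>   }
> }
> {code}
> Under that reading, myGenericUDF(1,2,1) returns 2, since the first argument's value appears as the second of the remaining arguments.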
> StackTrace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1069)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:960)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:974)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:505)
> 	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
> 	at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:338)
> 	at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:355)
> 	at org.apache.spark.scheduler.DAGScheduler.registerShuffleDependencies(DAGScheduler.scala:317)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:218)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:301)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:298)
> 	at scala.collection.immutable.List.foreach(List.scala:318)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:298)
> 	at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:310)
> 	at org.apache.spark.scheduler.DAGScheduler.newStage(DAGScheduler.scala:244)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:731)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Caused by: java.lang.ClassNotFoundException: org.example.myGenericUDF
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> 	at java.lang.Class.forName0(Native Method)
> 	at java.lang.Class.forName(Class.java:270)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> 	... 72 more
> {code}
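> My reading of the trace: the failure occurs while Spark computes input splits (CombineHiveInputFormat.getSplits calls Utilities.getMapWork, which has Kryo deserialize the map.xml plan and resolve the UDF class by name via Class.forName through the current classloader). If the jar registered with CREATE FUNCTION ... USING JAR is visible only to the session's classloader and not to the thread doing the plan deserialization, the lookup fails even though query compilation resolved the class fine. A standalone sketch of that visibility problem (an illustration, not Hive's code; the local jar path is hypothetical):
> {code}
> import java.net.URL;
> import java.net.URLClassLoader;
>
> public class ClassLoaderVisibility {
>   public static void main(String[] args) throws Exception {
>     String udfClass = "org.example.myGenericUDF";
>     ClassLoader base = ClassLoaderVisibility.class.getClassLoader();
>     try {
>       // Fails: the UDF jar is not on this loader's classpath,
>       // mirroring the Caused by: ClassNotFoundException above.
>       Class.forName(udfClass, true, base);
>     } catch (ClassNotFoundException e) {
>       System.out.println("base loader cannot see it: " + e.getMessage());
>     }
>     // Succeeds once a loader that actually includes the jar is used,
>     // which is what the failing deserialization path above lacks.
>     URL[] jars = { new URL("file:/tmp/myudf.jar") }; // hypothetical local copy
>     try (URLClassLoader withJar = new URLClassLoader(jars, base)) {
>       System.out.println("loaded: " + Class.forName(udfClass, true, withJar).getName());
>     }
>   }
> }
> {code}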