hive-issues mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12045) ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
Date Tue, 17 Nov 2015 12:04:10 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15008550#comment-15008550 ]

Rui Li commented on HIVE-12045:
-------------------------------

I cherry-picked HIVE-12229 onto the latest master code and then applied the patch here. All
the Spark on YARN tests pass this way. It seems we still need to investigate this on the spark
branch. I'll do some debugging.

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> --------------------------------------------------------------------------
>
>                 Key: HIVE-12045
>                 URL: https://issues.apache.org/jira/browse/HIVE-12045
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>         Environment: Cloudera QuickStart VM - CDH5.4.2
> beeline
>            Reporter: Zsolt Tóth
>            Assignee: Rui Li
>         Attachments: HIVE-12045.1-spark.patch, HIVE-12045.2-spark.patch, example.jar, genUDF.patch, hive.log.gz
>
>
> If I execute the following query in beeline, I get a ClassNotFoundException for the UDF class.
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf just looks for the first argument's value among the other arguments and returns its index. I don't think the problem is related to what this particular GenericUDF does.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" succeeds
> If I use the non-generic implementation of the same UDF, the select distinct call succeeds.
> StackTrace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1069)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:960)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:974)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:505)
> 	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
> 	at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:338)
> 	at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:355)
> 	at org.apache.spark.scheduler.DAGScheduler.registerShuffleDependencies(DAGScheduler.scala:317)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:218)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:301)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:298)
> 	at scala.collection.immutable.List.foreach(List.scala:318)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:298)
> 	at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:310)
> 	at org.apache.spark.scheduler.DAGScheduler.newStage(DAGScheduler.scala:244)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:731)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Caused by: java.lang.ClassNotFoundException: org.example.myGenericUDF
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> 	at java.lang.Class.forName0(Native Method)
> 	at java.lang.Class.forName(Class.java:270)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> 	... 72 more
> {code}

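For readers following the trace above: the final "Caused by" frames show Kryo's DefaultClassResolver.readName calling Class.forName, which fails inside a URLClassLoader. The sketch below only illustrates that general failure mode in plain Java and is not a diagnosis of this issue. The class name ClassLookupSketch, the helper canLoad, and the empty loader are hypothetical stand-ins; the sketch assumes a class loader that was never given the UDF jar and does not use any Hive, Spark, or Kryo API.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLookupSketch {

    // Hypothetical helper: returns true if `loader` can resolve `className`,
    // false if the lookup ends in ClassNotFoundException.
    public static boolean canLoad(String className, ClassLoader loader) {
        try {
            // initialize=false: only resolve the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption for illustration: a loader with no jar URLs and no application
        // parent, standing in for a loader that does not have the UDF jar.
        try (URLClassLoader withoutUdfJar = new URLClassLoader(new URL[0], null)) {
            // JDK classes still resolve through the bootstrap loader.
            System.out.println(canLoad("java.lang.String", withoutUdfJar));
            // The UDF class from the report is not visible to this loader.
            System.out.println(canLoad("org.example.myGenericUDF", withoutUdfJar));
        }
    }
}
```

Run as written, this prints true and then false. Whether the loader used during plan deserialization in this bug actually lacks the jar is exactly what is still under investigation here.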


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
