spark-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-23209) HiveDelegationTokenProvider throws an exception if Hive jars are not on the classpath
Date Thu, 25 Jan 2018 17:27:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-23209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin updated SPARK-23209:
-----------------------------------
    Target Version/s: 2.3.0
            Priority: Blocker  (was: Major)

> HiveDelegationTokenProvider throws an exception if Hive jars are not on the classpath
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-23209
>                 URL: https://issues.apache.org/jira/browse/SPARK-23209
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>         Environment: OSX, Java(TM) SE Runtime Environment (build 1.8.0_92-b14), Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
>            Reporter: Sahil Takiar
>            Priority: Blocker
>
> While doing some Hive-on-Spark testing against the Spark 2.3.0 release candidates, we came across a bug (see HIVE-18436).
> Stack-trace:
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
>         at org.apache.spark.deploy.security.HadoopDelegationTokenManager.getDelegationTokenProviders(HadoopDelegationTokenManager.scala:68)
>         at org.apache.spark.deploy.security.HadoopDelegationTokenManager.<init>(HadoopDelegationTokenManager.scala:54)
>         at org.apache.spark.deploy.yarn.security.YARNHadoopDelegationTokenManager.<init>(YARNHadoopDelegationTokenManager.scala:44)
>         at org.apache.spark.deploy.yarn.Client.<init>(Client.scala:123)
>         at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1502)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 10 more
> {code}
> The bug appears to have been introduced by SPARK-20434, which changed {{HiveDelegationTokenProvider}} so that it constructs {{o.a.h.hive.conf.HiveConf}} directly inside {{HiveCredentialProvider#hiveConf}} rather than loading the class manually via the class loader. With the new code, the JVM tries to resolve {{HiveConf}} as soon as {{HiveDelegationTokenProvider}} is referenced. Since there is no try-catch around the construction of {{HiveDelegationTokenProvider}}, the resulting {{NoClassDefFoundError}} (caused by a {{ClassNotFoundException}}, as the stack trace shows) propagates and crashes spark-submit. Spark's {{docs/running-on-yarn.md}} says "a Hive token will be obtained if Hive is on the classpath"; this behavior contradicts that.
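The difference between the two loading styles can be sketched in plain Java. This is an illustration only, not Spark's actual code: `com.example.OptionalDep` is a hypothetical class standing in for `HiveConf`, assumed to be absent from the classpath at runtime.

```java
// Sketch of the two class-loading styles discussed above.
// "com.example.OptionalDep" is a hypothetical stand-in for HiveConf and is
// assumed to be missing from the classpath.
public class ClassLoadingSketch {

    // Old style: look the class up reflectively and treat absence as
    // "optional provider unavailable" -- the exception is catchable.
    static boolean optionalDepAvailable() {
        try {
            Class.forName("com.example.OptionalDep");
            return true;
        } catch (ClassNotFoundException e) {
            return false; // degrade gracefully, matching the documented YARN behavior
        }
    }

    public static void main(String[] args) {
        System.out.println("optional dependency available: " + optionalDepAvailable());

        // New style (shown only as a comment): a direct static reference such as
        //     OptionalDep conf = new OptionalDep();
        // compiles against the dependency, so the JVM must resolve the class when
        // the referring class is initialized. A missing jar then surfaces as an
        // unguarded NoClassDefFoundError instead of a catchable
        // ClassNotFoundException, which is the crash in the stack trace above.
    }
}
```

The reflective style lets the caller treat a missing Hive jar as "no Hive token provider"; the direct-construction style makes the missing class fatal at class-initialization time, before any try-catch in the caller can run.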



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

