spark-issues mailing list archives

From "Steve Loughran (JIRA)" <>
Subject [jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
Date Mon, 07 May 2018 14:13:00 GMT


Steve Loughran commented on SPARK-18673:

Good Q, [~Bidek]. The SPARK-23807 POM fixes up the build, but unless the mutant org.spark-project.hive
JAR is also fixed up so that it no longer throws an exception whenever the Hadoop version == 3,
you can't run the code, including the tests. I do have such a fixed-up JAR; what I'm proposing
here is cherry-picking the least amount of change needed there.

This work is part of the overall "spark on Hadoop 3.x" effort.

Oh and yes, I'm targeting 3.1+ too, though the key issue here is the "3", not the suffix.

What would supersede this is moving Spark to Hive 2.x. This is an interim artifact until
someone does that.

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> ------------------------------------------------------------------
>                 Key: SPARK-18673
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>         Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>            Reporter: Steve Loughran
>            Priority: Major
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it will need to be updated to match.
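
For readers unfamiliar with the failure mode, here is a minimal sketch of the kind of major-version check described above. It is an illustration only, not the actual Hive shim loader source: the class name ShimVersionSketch, the method shimFor, and the shim names it returns are all hypothetical. It shows why any 3.x version string is rejected regardless of its suffix: only the leading major number is inspected.

```java
// Hypothetical sketch of a major-version gate; NOT the real Hive shim loader code.
final class ShimVersionSketch {

    // Maps a Hadoop version string to a made-up shim name, looking only at the major version.
    static String shimFor(String hadoopVersion) {
        String[] parts = hadoopVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Illegal Hadoop version: " + hadoopVersion);
        }
        int major = Integer.parseInt(parts[0]);
        switch (major) {
            case 1:
                return "hadoop-1-shim"; // hypothetical shim name
            case 2:
                return "hadoop-2-shim"; // hypothetical shim name
            default:
                // Anything else, including every 3.x release, is treated as unknown.
                throw new IllegalArgumentException(
                        "Unrecognized Hadoop major version number: " + hadoopVersion);
        }
    }

    public static void main(String[] args) {
        System.out.println(shimFor("2.7.3"));
        try {
            shimFor("3.0.0-alpha2-SNAPSHOT");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this assumed structure, a minimal fix is an extra case for major version 3, which is consistent with the comment that the key issue is the "3" and not the suffix.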

This message was sent by Atlassian JIRA
