spark-commits mailing list archives

From dav...@apache.org
Subject spark git commit: [SPARK-16926] [SQL] Remove partition columns from partition metadata.
Date Thu, 01 Sep 2016 21:13:38 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 13bacd730 -> ac22ab077


[SPARK-16926] [SQL] Remove partition columns from partition metadata.

## What changes were proposed in this pull request?

This removes partition columns from the column metadata of partitions, so that partition metadata matches table metadata.

A change introduced in SPARK-14388 removed partition columns from the column metadata of tables,
but not from that of partitions. As a result, TableReader believes that the table and partition
schemas differ, and creates an unnecessary conversion object inspector.
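
The filtering idea can be illustrated in isolation. The following is a minimal sketch, not the patched code: `Column` is a hypothetical stand-in for Hive's column type, and `dataColumns` and `PartitionColumnsSketch` are names invented for this example. It only shows how the partition columns are dropped from the full schema so that the two sets are disjoint.

```scala
// Hypothetical stand-in for a Hive column descriptor; not the real metastore class.
final case class Column(name: String, dataType: String)

object PartitionColumnsSketch {
  // Keep only the columns whose names are not partition column names,
  // so the data columns and the partition columns are disjoint.
  def dataColumns(schema: Seq[Column], partitionColumnNames: Seq[String]): Seq[Column] =
    schema.filter(c => !partitionColumnNames.contains(c.name))

  def main(args: Array[String]): Unit = {
    // Assumed example table: two data columns and one partition column, "ds".
    val schema = Seq(
      Column("id", "int"),
      Column("value", "string"),
      Column("ds", "string")
    )
    val partitionColumnNames = Seq("ds")

    val cols = dataColumns(schema, partitionColumnNames)
    println(cols.map(_.name).mkString(", ")) // prints "id, value"
  }
}
```

With the same filter applied to both table and partition metadata, the two column lists agree, which is the condition under which no conversion is needed.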

## How was this patch tested?

Existing unit tests.

Author: Brian Cho <bcho@fb.com>

Closes #14515 from dafrista/partition-columns-metadata.

(cherry picked from commit 473d78649dec7583bcc4ec24b6f38303c38e81a2)
Signed-off-by: Davies Liu <davies.liu@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ac22ab07
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ac22ab07
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ac22ab07

Branch: refs/heads/branch-2.0
Commit: ac22ab0779c8672ba622b90304f05ac44ff83819
Parents: 13bacd7
Author: Brian Cho <bcho@fb.com>
Authored: Thu Sep 1 14:13:17 2016 -0700
Committer: Davies Liu <davies.liu@gmail.com>
Committed: Thu Sep 1 14:13:35 2016 -0700

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/hive/MetastoreRelation.scala  | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/ac22ab07/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala
index f8ebe08..88d8d4b 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala
@@ -163,7 +163,13 @@ private[hive] case class MetastoreRelation(
 
       val sd = new org.apache.hadoop.hive.metastore.api.StorageDescriptor()
       tPartition.setSd(sd)
-      sd.setCols(catalogTable.schema.map(toHiveColumn).asJava)
+
+      // Note: In Hive the schema and partition columns must be disjoint sets
+      val schema = catalogTable.schema.map(toHiveColumn).filter { c =>
+        !catalogTable.partitionColumnNames.contains(c.getName)
+      }
+      sd.setCols(schema.asJava)
+
       p.storage.locationUri.foreach(sd.setLocation)
       p.storage.inputFormat.foreach(sd.setInputFormat)
       p.storage.outputFormat.foreach(sd.setOutputFormat)



