spark-commits mailing list archives

From yh...@apache.org
Subject spark git commit: [SPARK-10847][SQL][PYSPARK] PySpark - DataFrame - Optional Metadata with `None` triggers cryptic failure
Date Wed, 27 Jan 2016 17:55:15 GMT
Repository: spark
Updated Branches:
  refs/heads/master 41f0c85f9 -> edd473751


[SPARK-10847][SQL][PYSPARK] PySpark - DataFrame - Optional Metadata with `None` triggers cryptic failure

The error message for unsupported metadata value types now reports the class of the
offending value, e.g. "Do not support type class org.json4s.JsonAST$JNull$", instead of
the uninformative "Do not support type class scala.Tuple2." that came from matching the
whole key/value pair. In addition, StructType metadata now handles JNull correctly, so a
Python dict such as {'a': None} no longer fails. test_metadata_null is added to tests.py
to show that the fix works.

Author: Jason Lee <cjlee@us.ibm.com>

Closes #8969 from jasoncl/SPARK-10847.
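
As a minimal sketch (not part of the commit; assumes a spark-shell session with Spark
on the classpath), the JVM-side path that PySpark exercises can be reproduced directly,
since a Python metadata value of None arrives on the Scala side as a JSON null:

    import org.apache.spark.sql.types.Metadata

    // Before this commit: RuntimeException("Do not support type class scala.Tuple2.")
    // After this commit:  parses successfully, storing a null under key "a".
    val m = Metadata.fromJson("""{"a":null}""")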


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/edd47375
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/edd47375
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/edd47375

Branch: refs/heads/master
Commit: edd473751b59b55fa3daede5ed7bc19ea8bd7170
Parents: 41f0c85
Author: Jason Lee <cjlee@us.ibm.com>
Authored: Wed Jan 27 09:55:10 2016 -0800
Committer: Yin Huai <yhuai@databricks.com>
Committed: Wed Jan 27 09:55:10 2016 -0800

----------------------------------------------------------------------
 python/pyspark/sql/tests.py                                   | 7 +++++++
 .../src/main/scala/org/apache/spark/sql/types/Metadata.scala  | 7 ++++++-
 2 files changed, 13 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/edd47375/python/pyspark/sql/tests.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 7593b99..410efba 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -747,6 +747,13 @@ class SQLTests(ReusedPySparkTestCase):
         except ValueError:
             self.assertEqual(1, 1)
 
+    def test_metadata_null(self):
+        from pyspark.sql.types import StructType, StringType, StructField
+        schema = StructType([StructField("f1", StringType(), True, None),
+                             StructField("f2", StringType(), True, {'a': None})])
+        rdd = self.sc.parallelize([["a", "b"], ["c", "d"]])
+        self.sqlCtx.createDataFrame(rdd, schema)
+
     def test_save_and_load(self):
         df = self.df
         tmpPath = tempfile.mkdtemp()
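
When the test above runs, createDataFrame ships the schema to the JVM as a JSON string,
which is parsed by the Scala code changed below. A rough sketch of that parse (assumed
spark-shell session; the field JSON is abbreviated to the relevant part):

    import org.apache.spark.sql.types.{DataType, StructType}

    // The {"a": null} metadata here is what {'a': None} from the Python test becomes.
    val json =
      """{"type":"struct","fields":[
        |  {"name":"f2","type":"string","nullable":true,"metadata":{"a":null}}
        |]}""".stripMargin
    val schema = DataType.fromJson(json).asInstanceOf[StructType]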

http://git-wip-us.apache.org/repos/asf/spark/blob/edd47375/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala
index 6ee24ee..9e0f994 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala
@@ -156,7 +156,9 @@ object Metadata {
               throw new RuntimeException(s"Do not support array of type ${other.getClass}.")
           }
         }
-      case other =>
+      case (key, JNull) =>
+        builder.putNull(key)
+      case (key, other) =>
         throw new RuntimeException(s"Do not support type ${other.getClass}.")
     }
     builder.build()
@@ -229,6 +231,9 @@ class MetadataBuilder {
     this
   }
 
+  /** Puts a null. */
+  def putNull(key: String): this.type = put(key, null)
+
   /** Puts a Long. */
   def putLong(key: String, value: Long): this.type = put(key, value)
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org

