spark-issues mailing list archives

From "Josh Rosen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-16700) StructType doesn't accept Python dicts anymore
Date Mon, 15 Aug 2016 19:42:20 GMT

     [ https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen resolved SPARK-16700.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.0

Issue resolved by pull request 14469
[https://github.com/apache/spark/pull/14469]

> StructType doesn't accept Python dicts anymore
> ----------------------------------------------
>
>                 Key: SPARK-16700
>                 URL: https://issues.apache.org/jira/browse/SPARK-16700
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.0.0
>            Reporter: Sylvain Zimmer
>            Assignee: Davies Liu
>             Fix For: 2.1.0
>
>
> Hello,
> I found this issue while testing my codebase with 2.0.0-rc5.
> StructType in Spark 1.6.2 accepts the Python <dict> type, which is very handy. 2.0.0-rc5 does not and throws an error.
> I don't know if this was intended but I'd advocate for this behaviour to remain the same. MapType is probably wasteful when your key names never change and switching to Python tuples would be cumbersome.
> Here is a minimal script to reproduce the issue: 
> {code}
> from pyspark import SparkContext
> from pyspark.sql import types as SparkTypes
> from pyspark.sql import SQLContext
> sc = SparkContext()
> sqlc = SQLContext(sc)
> struct_schema = SparkTypes.StructType([
>     SparkTypes.StructField("id", SparkTypes.LongType())
> ])
> rdd = sc.parallelize([{"id": 0}, {"id": 1}])
> df = sqlc.createDataFrame(rdd, struct_schema)
> print(df.collect())
> # 1.6.2 prints [Row(id=0), Row(id=1)]
> # 2.0.0-rc5 raises TypeError: StructType can not accept object {'id': 0} in type <type 'dict'>
> {code}
> Thanks!
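
For code that has to run on 2.0.0 before the fix ships in 2.1.0, one possible workaround is to convert each dict into a tuple ordered by the schema's field names before calling createDataFrame. The sketch below is plain Python and only illustrates the conversion step; dict_to_row_tuple is a hypothetical helper, not part of PySpark, and filling missing keys with None is an assumption of this sketch.

```python
def dict_to_row_tuple(record, field_names):
    """Return the values of `record` ordered by `field_names`.

    Hypothetical helper (not a PySpark API). Keys missing from the
    dict become None, which is an assumption made for this sketch.
    """
    return tuple(record.get(name) for name in field_names)


# Field names in schema order; with the schema from the report these
# could be taken from [f.name for f in struct_schema.fields].
field_names = ["id"]

records = [{"id": 0}, {"id": 1}]
rows = [dict_to_row_tuple(r, field_names) for r in records]
print(rows)  # [(0,), (1,)]
```

With the reproduction script above, the same helper could be applied as rdd.map(lambda d: dict_to_row_tuple(d, field_names)) before passing the result to sqlc.createDataFrame together with struct_schema.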



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


