spark-issues mailing list archives

From "Guilherme Berger (JIRA)" <>
Subject [jira] [Commented] (SPARK-22566) Better error message for `_merge_type` in Pandas to Spark DF conversion
Date Mon, 20 Nov 2017 19:08:00 GMT


Guilherme Berger commented on SPARK-22566:


> Better error message for `_merge_type` in Pandas to Spark DF conversion
> -----------------------------------------------------------------------
>                 Key: SPARK-22566
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Guilherme Berger
>            Priority: Minor
> When creating a Spark DF from a Pandas DF without specifying a schema, schema inference
> is used. This inference can fail when a column contains values of two different types;
> that behavior is expected. The problem is that the error message does not say in which
> column this happened.
> As a result, the failure is painful to debug.
> I plan on submitting a PR that fixes this by providing a better error message for such
> cases, containing the column name (and possibly the problematic values too).
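Until the error message names the column, one possible workaround is to scan the data for columns that hold more than one Python type before handing it to Spark. The sketch below is a plain-Python illustration; `find_mixed_type_columns` is a hypothetical helper, not part of PySpark or pandas, and it assumes the rows are available as a list of dicts.

```python
from collections import defaultdict


def find_mixed_type_columns(rows):
    """Return {column: sorted type names} for columns with more than one non-None type.

    Hypothetical debugging helper; `rows` is assumed to be a list of dicts.
    """
    seen = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            # None is skipped so that missing values do not count as a second type.
            if value is not None:
                seen[column].add(type(value).__name__)
    return {column: sorted(types) for column, types in seen.items() if len(types) > 1}


rows = [
    {"id": 1, "score": 10},
    {"id": 2, "score": "n/a"},  # a string in an otherwise integer column
]
print(find_mixed_type_columns(rows))  # {'score': ['int', 'str']}
```

For a Pandas DF, the rows could be obtained with `pandas_df.to_dict("records")`, although the exact type names reported would then depend on how pandas stores each column.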
> >>> spark_session.createDataFrame(pandas_df)
> File "redacted/pyspark/sql/", line 541, in createDataFrame
>       rdd, schema = self._createFromLocal(map(prepare, data), schema)
> File "redacted/pyspark/sql/", line 401, in _createFromLocal
>       struct = self._inferSchemaFromList(data)
> File "redacted/pyspark/sql/", line 333, in _inferSchemaFromList
>       schema = reduce(_merge_type, map(_infer_schema, data))
> File "redacted/pyspark/sql/", line 1124, in _merge_type
>       for f in a.fields]
> File "redacted/pyspark/sql/", line 1118, in _merge_type
>       raise TypeError("Can not merge type %s and %s" % (type(a), type(b)))
> TypeError: Can not merge type <class 'pyspark.sql.types.LongType'> and <class
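The improvement proposed in this issue could look roughly like the following. This is a toy sketch under stated assumptions: schemas are modeled as dicts mapping field names to type names, leaf types are plain strings, and `merge_type` with its `name` parameter is hypothetical. It is not the PySpark `_merge_type` implementation and not the eventual patch.

```python
def merge_type(a, b, name=None):
    """Merge two toy schemas, reporting the offending field name on conflict.

    A dict maps field names to types; a str is a leaf type; None means "unknown".
    All of this is an illustrative model, not PySpark's type system.
    """
    if a is None:
        return b
    if b is None:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for field, b_type in b.items():
            # Build a dotted path so nested fields are identifiable too.
            path = field if name is None else "%s.%s" % (name, field)
            merged[field] = merge_type(a.get(field), b_type, name=path)
        return merged
    if a != b:
        where = "" if name is None else " in column '%s'" % name
        raise TypeError("Can not merge type %s and %s%s" % (a, b, where))
    return a


try:
    merge_type({"id": "long", "score": "long"}, {"id": "long", "score": "string"})
except TypeError as exc:
    print(exc)  # Can not merge type long and string in column 'score'
```

Threading the field name through the recursion keeps the existing message intact and only appends the location, so callers matching on the old text would still work in this model.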

This message was sent by Atlassian JIRA
