spark-issues mailing list archives

From "Herman van Hovell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs
Date Tue, 06 Dec 2016 01:51:58 GMT

     [ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Herman van Hovell resolved SPARK-18634.
---------------------------------------
       Resolution: Fixed
         Assignee: Liang-Chi Hsieh
    Fix Version/s: 2.1.0
                   2.0.3

> Corruption and Correctness issues with exploding Python UDFs
> ------------------------------------------------------------
>
>                 Key: SPARK-18634
>                 URL: https://issues.apache.org/jira/browse/SPARK-18634
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.0.2, 2.1.0
>            Reporter: Burak Yavuz
>            Assignee: Liang-Chi Hsieh
>             Fix For: 2.0.3, 2.1.0
>
>
> There are some weird issues with exploding Python UDFs in Spark SQL.
> There are 2 cases where, depending on the DataType of the exploded column, the result can
> be flat-out wrong or corrupt. It seems that something goes wrong when Tungsten is told
> the schema of the rows during or after applying the UDF.
> Please check the code below for reproduction.
> Notebook: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/6186780348633019/3425836135165635/4343791953238323/latest.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


