spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-26303) Return partial results for bad JSON records
Date Fri, 07 Dec 2018 10:13:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-26303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712603#comment-16712603 ]

Apache Spark commented on SPARK-26303:
--------------------------------------

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/23253

> Return partial results for bad JSON records
> -------------------------------------------
>
>                 Key: SPARK-26303
>                 URL: https://issues.apache.org/jira/browse/SPARK-26303
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Maxim Gekk
>            Priority: Minor
>
> Currently, the JSON datasource and JSON functions return a row with all nulls for a malformed
> JSON string in the PERMISSIVE mode when the specified schema has the struct type. All nulls are
> returned even if some of the fields were parsed and converted to the desired types successfully. The
> ticket aims to solve the problem by returning the already parsed fields. The corrupt-record column
> specified via the JSON option `columnNameOfCorruptRecord` or the SQL config should contain the whole
> original JSON string.
> For example, if the input has one JSON string:
> {code:json}
> {"a":0.1,"b":{},"c":"def"}
> {code}
> and the specified schema is:
> {code:sql}
> a DOUBLE, b ARRAY<INT>, c STRING, _corrupt_record STRING
> {code}
> the expected output of `from_json` in the PERMISSIVE mode is:
> {code}
> +---+----+---+--------------------------+
> |a  |b   |c  |_corrupt_record           |
> +---+----+---+--------------------------+
> |0.1|null|def|{"a":0.1,"b":{},"c":"def"}|
> +---+----+---+--------------------------+
> {code}
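
The behaviour requested in the issue can be illustrated with a minimal plain-Python sketch. It is not Spark's implementation and does not call any Spark API: `parse_permissive`, `SCHEMA`, and the converter functions are hypothetical names introduced here, and the type checks are a simplified assumption about what counts as a conversion failure. The point is only the shape of the result: fields that convert are kept, fields that fail become null, and the corrupt-record column receives the whole original string.

```python
import json

# Hypothetical converters: each returns the converted value or raises TypeError.
def to_double(value):
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("not a number")
    return float(value)

def to_int_array(value):
    if not isinstance(value, list):
        raise TypeError("not an array")
    if any(isinstance(x, bool) or not isinstance(x, int) for x in value):
        raise TypeError("array element is not an int")
    return value

def to_string(value):
    if not isinstance(value, str):
        raise TypeError("not a string")
    return value

# Stand-in for the schema "a DOUBLE, b ARRAY<INT>, c STRING" (assumed representation).
SCHEMA = {"a": to_double, "b": to_int_array, "c": to_string}
CORRUPT_COLUMN = "_corrupt_record"

def parse_permissive(record):
    """Return a row dict that keeps every field that parsed successfully."""
    row = {name: None for name in SCHEMA}
    row[CORRUPT_COLUMN] = None
    try:
        obj = json.loads(record)
    except json.JSONDecodeError:
        # Nothing could be parsed: all nulls plus the original string.
        row[CORRUPT_COLUMN] = record
        return row
    if not isinstance(obj, dict):
        row[CORRUPT_COLUMN] = record
        return row
    had_error = False
    for name, convert in SCHEMA.items():
        if obj.get(name) is None:
            continue  # missing or null field stays null without marking the record bad
        try:
            row[name] = convert(obj[name])
        except TypeError:
            had_error = True  # leave this field null, keep the others
    if had_error:
        row[CORRUPT_COLUMN] = record
    return row

if __name__ == "__main__":
    print(parse_permissive('{"a":0.1,"b":{},"c":"def"}'))
```

For the input from the issue, the sketch keeps `a` and `c`, leaves `b` as `None` because `{}` is not an array, and stores the original string under `_corrupt_record`, mirroring the table above.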



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org
