spark-issues mailing list archives

From "Feng Zhu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-21774) The rule PromoteStrings cast string to a wrong data type
Date Fri, 18 Aug 2017 08:42:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131908#comment-16131908 ]

Feng Zhu edited comment on SPARK-21774 at 8/18/17 8:41 AM:
-----------------------------------------------------------

This behavior was introduced by PR-15880 (https://github.com/apache/spark/pull/15880), which addresses
SPARK-17913. That PR tries to follow the logic in PostgreSQL (PG):
"I think it's more reasonable to follow postgres in this case, i.e. cast string to the type
of the other side, but return null if the string is not castable to keep hive compatibility."

However, *UTF8String.toInt* still returns true in such a case instead of treating the string as
not castable. In the code below, res=true and wrapper.value=0:
  
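{code:java}
import org.apache.spark.unsafe.types.UTF8String
import org.apache.spark.unsafe.types.UTF8String.IntWrapper
val x = UTF8String.fromString("0.1")
val wrapper = new IntWrapper
val res = x.toInt(wrapper)  // as reported: res == true, wrapper.value == 0
{code}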

Should we check and change such behaviors, or go back to the logic in 2.1, which casts
the string to DoubleType?
@[~LI,Xiao]    @[~cloud_fan]
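
To make the two options concrete, here is a minimal, Spark-free Scala sketch run against the rows from the issue description. `truncatingToInt` is a hypothetical stand-in that only models the behavior reported above (a fractional part is accepted and dropped); it is not Spark's actual implementation. `strictToInt`, `toDouble`, and `CastSketch` are likewise illustrative names, not Spark APIs.

```scala
import scala.util.Try

object CastSketch {
  // Hypothetical model of the reported behavior: a trailing ".digits" part is
  // accepted and dropped, so "0.1" and "-0.1" both become 0.
  def truncatingToInt(s: String): Option[Int] = {
    val t = s.trim
    val intPart = t.takeWhile(_ != '.')
    val fracPart = t.dropWhile(_ != '.').drop(1)
    if (fracPart.forall(_.isDigit)) Try(intPart.toInt).toOption else None
  }

  // Option 1 (assumed shape): treat any non-integer string as not castable.
  def strictToInt(s: String): Option[Int] = Try(s.trim.toInt).toOption

  // Option 2: compare as doubles, as the comment says 2.1 did.
  def toDouble(s: String): Option[Double] = Try(s.trim.toDouble).toOption

  // Rows from the issue description: (a, b).
  val rows: Seq[(String, Int)] = Seq(("0", 1), ("-0.1", 2), ("1", 3))

  // Equivalent of "where a = 0" under each casting strategy.
  val truncated: Seq[(String, Int)] =
    rows.filter { case (a, _) => truncatingToInt(a).contains(0) }
  val strict: Seq[(String, Int)] =
    rows.filter { case (a, _) => strictToInt(a).contains(0) }
  val asDouble: Seq[(String, Int)] =
    rows.filter { case (a, _) => toDouble(a).contains(0.0) }

  def main(args: Array[String]): Unit = {
    println(s"truncating int cast: $truncated")
    println(s"strict int cast:     $strict")
    println(s"double comparison:   $asDouble")
  }
}
```

Under these assumptions, only the truncating cast keeps the "-0.1" row; the strict integer cast and the double comparison both return just the "0" row.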


was (Author: donnyzone):
This behavior was introduced by PR-15880 (https://github.com/apache/spark/pull/15880), which addresses
SPARK-17913. That PR tries to follow the logic in PostgreSQL (PG):
"I think it's more reasonable to follow postgres in this case, i.e. cast string to the type
of the other side, but return null if the string is not castable to keep hive compatibility."

However, *UTF8String.toInt* still returns true in such a case instead of treating the string as
not castable. In the code below, res=true and wrapper.value=0:
  
{code:java}
val x = UTF8String.fromString("0.1")
val wrapper = new IntWrapper
val res = x.toInt(wrapper)
{code}

Should we check such behaviors, or go back to the logic in 2.1, which casts the string to
DoubleType?
@[~LI,Xiao]    @[~cloud_fan]

> The rule PromoteStrings cast string to a wrong data type
> --------------------------------------------------------
>
>                 Key: SPARK-21774
>                 URL: https://issues.apache.org/jira/browse/SPARK-21774
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: StanZhai
>            Priority: Critical
>              Labels: correctness
>
> Data
> {code}
> create temporary view tb as select * from values
> ("0", 1),
> ("-0.1", 2),
> ("1", 3)
> as grouping(a, b)
> {code}
> SQL:
> {code}
> select a, b from tb where a=0
> {code}
> The result, which is wrong (the row with a = "-0.1" should not match a=0):
> {code}
> +----+---+
> |   a|  b|
> +----+---+
> |   0|  1|
> |-0.1|  2|
> +----+---+
> {code}
> Logical Plan:
> {code}
> == Parsed Logical Plan ==
> 'Project ['a]
> +- 'Filter ('a = 0)
>    +- 'UnresolvedRelation `src`
> == Analyzed Logical Plan ==
> a: string
> Project [a#8528]
> +- Filter (cast(a#8528 as int) = 0)
>    +- SubqueryAlias src
>       +- Project [_1#8525 AS a#8528, _2#8526 AS b#8529]
>          +- LocalRelation [_1#8525, _2#8526]
> {code}




