spark-issues mailing list archives

From "Xiao Li (JIRA)" <>
Subject [jira] [Commented] (SPARK-22771) SQL concat for binary
Date Thu, 14 Dec 2017 00:46:19 GMT


Xiao Li commented on SPARK-22771:

This looks reasonable. We should fix it.

> SQL concat for binary 
> ----------------------
>                 Key: SPARK-22771
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: Fernando Pereira
>            Priority: Minor
> The spark.sql {{concat}} function automatically casts its arguments to StringType and returns a String.
> This might be the behavior of traditional databases; however, Spark has Binary as a standard type, and concat'ing binary seems reasonable if it returns another binary sequence.
> Take Python 2 as an example, where both {{bytes}} ({{str}}) and {{unicode}} represent text: concat'ing two values of the same type yields that same type, and when they are intermixed (str + unicode) the most generic type is returned (unicode).
> Following the same principle, I believe that when concat'ing binary it would make sense to return a binary.
> In terms of Spark behavior, it would affect only the case when all arguments are binary. All other cases should remain unchanged.
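
For illustration, the rule proposed in the description above could be sketched in plain Python roughly as follows. This is only a model of the requested semantics, not Spark's implementation: the function name {{concat_proposed}} is hypothetical, null handling is ignored, and decoding binary as UTF-8 in the mixed case is an assumption made for the sketch.

```python
def concat_proposed(*args):
    """Hypothetical model of the proposed concat rule (not Spark code).

    - If every argument is binary, return binary.
    - Otherwise fall back to the existing behavior: cast each
      argument to a string and return a string.
    """
    is_binary = lambda a: isinstance(a, (bytes, bytearray))

    # Only the all-binary case changes: keep the result binary.
    if args and all(is_binary(a) for a in args):
        return b"".join(bytes(a) for a in args)

    # All other cases remain unchanged: everything becomes a string.
    # Assumption for this sketch: binary values are decoded as UTF-8.
    return "".join(
        bytes(a).decode("utf-8") if is_binary(a) else str(a)
        for a in args
    )


if __name__ == "__main__":
    print(type(concat_proposed(b"ab", b"cd")).__name__)  # all binary
    print(type(concat_proposed(b"ab", "cd")).__name__)   # mixed
```

With this model, only a call whose arguments are all binary changes its result type; any call that mixes in a non-binary argument still produces a string, as it does today.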

This message was sent by Atlassian JIRA
