hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexey Zotov <azo...@griddynamics.com>
Subject Re: concat_ws() UDF
Date Fri, 27 Dec 2013 08:13:13 GMT
Sorry, but in my previous email there was some misleading due to text
formatting.

So, there are tests for concat_ws() UDF:
https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out.
One of tests (78 line) contains the following query:

> SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>        concat_ws('**', dest1.c1, NULL, dest1.c3) FROM dest1

and expects the following results:

> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309


I'm confused by the result of the last concat_ws. Why it returns abc****
8675309 instead of abc8675309? It looks like NULL VALUE has not been
skipped.

Thanks.


On Thu, Dec 26, 2013 at 3:36 PM, Alexey Zotov <azotov@griddynamics.com>wrote:

> Hello Guys,
>
> historically concat_ws() UDF was added in the scope of
> https://issues.apache.org/jira/browse/HIVE-682 ticket. I have a simple
> question about its implementation. According to above ticket's description
> it should have behavior like MySQL implementation (
> http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws).
> MySQL documentation says: "CONCAT_WS()<http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws>
does
> not skip empty strings. However, it does skip any NULL values after the
> separator argument.". I have performed a simple test:
>
>> create table test (col1 string);
>> select concat_ws(',', '10', NULL, '4', '', 'a') from test limit 1;
>
>
> as a result I have got:
>
>> 10,4,,a
>
> which looks good for me.
>
> But in tests (
> https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out
:
> 78 line) I see the following lines:
>
>
>> POSTHOOK: query: SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws('**', dest1.c1, *NULL*, dest1.c3) FROM dest1
>>
>> POSTHOOK: type: QUERY
>> POSTHOOK: Input: default@dest1
>> #### A masked pattern was here ####
>> POSTHOOK: Lineage: dest1.c1 SIMPLE []
>> POSTHOOK: Lineage: dest1.c2 SIMPLE []
>> POSTHOOK: Lineage: dest1.c3 SIMPLE []
>>
>> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309
>
> which looks like NULLs are not skipped.
>
> What have I missed?
>
> Thanks, Alexey.
>



-- 

Best regards

Zotov Alexey
Grid Dynamics
Skype: azotcsit

Mime
View raw message