hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexey Zotov <azo...@griddynamics.com>
Subject Re: concat_ws() UDF
Date Sat, 28 Dec 2013 08:59:43 GMT
Ouhcc.. It was a very stupid question from my side :( Thanks a lot Ritesh!


On Sat, Dec 28, 2013 at 12:32 PM, Ritesh Agrawal <ragrawal@netflix.com>wrote:

> The first string in concat_ws is used as a separator. Since the third
> value in the last concat_ws is null, it skipped that and simply
> concactenated second and fourth value by **. Hence you got  abc****8675309
>
> Ritesh
>
>
> On Fri, Dec 27, 2013 at 12:13 AM, Alexey Zotov <azotov@griddynamics.com>wrote:
>
>> Sorry, but in my previous email there was some misleading due to text
>> formatting.
>>
>> So, there are tests for concat_ws() UDF:
>> https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out.
>> One of tests (78 line) contains the following query:
>>
>>> SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws('**', dest1.c1, NULL, dest1.c3) FROM dest1
>>
>> and expects the following results:
>>
>>> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309
>>
>>
>> I'm confused by the result of the last concat_ws. Why it returns abc****
>> 8675309 instead of abc8675309? It looks like NULL VALUE has not been
>> skipped.
>>
>> Thanks.
>>
>>
>> On Thu, Dec 26, 2013 at 3:36 PM, Alexey Zotov <azotov@griddynamics.com>wrote:
>>
>>> Hello Guys,
>>>
>>> historically concat_ws() UDF was added in the scope of
>>> https://issues.apache.org/jira/browse/HIVE-682 ticket. I have a simple
>>> question about its implementation. According to above ticket's description
>>> it should have behavior like MySQL implementation (
>>> http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws).
>>> MySQL documentation says: "CONCAT_WS()<http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws>
does
>>> not skip empty strings. However, it does skip any NULL values after the
>>> separator argument.". I have performed a simple test:
>>>
>>>> create table test (col1 string);
>>>> select concat_ws(',', '10', NULL, '4', '', 'a') from test limit 1;
>>>
>>>
>>> as a result I have got:
>>>
>>>> 10,4,,a
>>>
>>> which looks good for me.
>>>
>>> But in tests (
>>> https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out
:
>>> 78 line) I see the following lines:
>>>
>>>
>>>>
>>>>
>>>> POSTHOOK: query: SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>>>>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>>>>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>>>>        concat_ws('**', dest1.c1, *NULL*, dest1.c3) FROM dest1
>>>>
>>>>
>>>>
>>>> POSTHOOK: type: QUERY
>>>> POSTHOOK: Input: default@dest1
>>>> #### A masked pattern was here ####
>>>> POSTHOOK: Lineage: dest1.c1 SIMPLE []
>>>> POSTHOOK: Lineage: dest1.c2 SIMPLE []
>>>> POSTHOOK: Lineage: dest1.c3 SIMPLE []
>>>>
>>>>
>>>>
>>>> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309
>>>
>>> which looks like NULLs are not skipped.
>>>
>>> What have I missed?
>>>
>>> Thanks, Alexey.
>>>
>>
>>
>>
>> --
>>
>> Best regards
>>
>> Zotov Alexey
>> Grid Dynamics
>> Skype: azotcsit
>>
>
>


-- 

Best regards

Zotov Alexey
Grid Dynamics
Skype: azotcsit

Mime
View raw message