hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "N Campbell (JIRA)" <>
Subject [jira] [Reopened] (HIVE-9537) string expressions on a fixed length character do not preserve trailing spaces
Date Thu, 19 Feb 2015 01:42:11 GMT


N Campbell reopened HIVE-9537:

The Hive documentation is vague at best with respect to when padding is preserved/ignored:

"Char types are similar to Varchar but they are fixed-length meaning that values shorter than
the specified length value are padded with spaces but trailing spaces are not important during
comparisons. The maximum length is fixed at 255." 

There is no discussion on non-comparison operations such as upper, lower, concat etc.

Consider the following, the driver may return CCHAR will trailing blanks but a string operation
such as concat fails to preserve them. Should an application locally perform a scalar operation
on the returned value such as LEN, LOWER etc then it may retain the spaces. Meanwhile server
side an 'equivalent' expression is not blank preserving.

select rnum, cchar, concat( concat( concat( cchar,'...'), cchar),'...') from tchar. 

So the driver will return BB<spaces> and then BB...BB... for the 2nd and 3rd projected
item. Similarly length(cchar) returns 2 and not 5 etc.

Customers using technologies such as Hana, DB2, Netezza, ... will expect the blank padded
behaviour. To all intents and purposes most SQL persons would not consider the implementation
to be fixed length character.

i.e length(cchar) returns 32

i.e cchar || '...' ..... returns 'BB                              ...BB                  

Should this be the design intent of Hive I would ask for the documentation to be far more
comprehensive is stating the semantics. 

> string expressions on a fixed length character do not preserve trailing spaces
> ------------------------------------------------------------------------------
>                 Key: HIVE-9537
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>            Reporter: N Campbell
>            Assignee: Aihua Xu
> When a string expression such as upper or lower is applied to a fixed length column the
trailing spaces of the fixed length character are not preserved.
> {code:sql}
> CREATE TABLE  if not exists TCHAR ( 
> RNUM int, 
> CCHAR char(32)    
> )
> {code}
> {{cchar}} as a {{char(32)}}.
> {code:sql}
> select cchar, concat(cchar, cchar), concat(lower(cchar), cchar), concat(upper(cchar),
> from tchar;
> {code}
> 0|\N
> 1|
> 2| 
> 3|BB
> 4|EE
> 5|FF

This message was sent by Atlassian JIRA

View raw message