hive-dev mailing list archives

From "Edward Capriolo (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1863) Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
Date Thu, 23 Dec 2010 19:15:46 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974701#action_12974701 ]

Edward Capriolo commented on HIVE-1863:
---------------------------------------

Right. Something gets "lost in translation" here because Hive Service uses the ArrayList<String>
returned by driver.getResults(), which is designed for sending output to the final console,
not to another serde. We should be able to rewrite the string and carefully place the nulls
back in; however, a better solution might exist.
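
As a rough illustration of "placing the nulls back in", the sketch below re-parses one tab-delimited console row into typed values and maps the literal text NULL back to a Java null for non-string columns. The class NullAwareRowParser, its ColumnType enum, and the assumption that nulls are printed as the text "NULL" are all hypothetical and only for illustration; this is not Hive's code or the eventual fix.

```java
import java.util.ArrayList;
import java.util.List;

public class NullAwareRowParser {
    // Hypothetical column kinds for this sketch; not Hive's own type system.
    public enum ColumnType { INT, BOOLEAN, STRING }

    /** Splits one tab-delimited row and converts each field, keeping NULL as Java null. */
    public static List<Object> parseRow(String row, ColumnType[] types) {
        String[] fields = row.split("\t", -1); // -1 keeps trailing empty fields
        if (fields.length != types.length) {
            throw new IllegalArgumentException(
                "expected " + types.length + " fields, got " + fields.length);
        }
        List<Object> values = new ArrayList<>(types.length);
        for (int i = 0; i < types.length; i++) {
            values.add(convert(fields[i], types[i]));
        }
        return values;
    }

    static Object convert(String field, ColumnType type) {
        // Assumption: the console output spells a null value as the literal text "NULL".
        // For STRING columns the text is ambiguous (a real string "NULL" looks the same),
        // so this sketch leaves string fields untouched.
        if (type != ColumnType.STRING && "NULL".equals(field)) {
            return null;
        }
        switch (type) {
            case INT:
                return Integer.valueOf(field);
            case BOOLEAN:
                return Boolean.valueOf(field);
            default:
                return field;
        }
    }
}
```

The ambiguity for string columns is one reason a better solution than re-parsing console text might be preferable: once a row has been flattened to a string, a genuine null and the four characters "NULL" can no longer be told apart.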

> Boolean columns in Hive tables containing NULL are treated as FALSE by the Hive JDBC driver.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1863
>                 URL: https://issues.apache.org/jira/browse/HIVE-1863
>             Project: Hive
>          Issue Type: Bug
>          Components: Drivers
>    Affects Versions: 0.5.0
>            Reporter: Guy le Mar
>            Assignee: Edward Capriolo
>
> (1) Using the Hive CLI, create a table using...
> create table dt4_boolean
> (
>         dt4_id          int,
>         dt4_testbool    boolean,
>         dt4_string      string
> )
> row format delimited
>         fields  terminated by ','
>         lines  terminated by '\n';
> (2) Create a file containing the following text...
> 1,true,Value is True
> 2,null,Data says null and must be null
> 3,,No value that means null
> 4,NoIdea,Data says NoIdea that's gonna be null
> 5,false,Value is FALSE
> (3) Load the data in the file into the Hive table...
> load data local inpath '<DATA FILE PATH>' overwrite into table dt4_boolean;
> (4) Check the table works as expected using the Hive CLI...
> hive> select * from dt4_boolean;
> OK
> 1	true	Value is True
> 2	NULL	Data says null and must be null
> 3	NULL	No value that means null
> 4	NULL	Data says NoIdea that's gonna be null
> 5	false	Value is FALSE
> Time taken: 0.049 seconds
> (5) Using the Hive JDBC driver, execute the same Hive query (select * from dt4_boolean)
> (5.1) The "row_str" values obtained by the Hive JDBC driver for deserialization are correct...
> 1	true	Value is True
> 2	NULL	Data says null and must be null
> 3	NULL	No value that means null
> 4	NULL	Data says NoIdea that's gonna be null
> 5	false	Value is FALSE
> (5.2) However, when these "row_str" are deserialized by the DynamicSerDe to a java.lang.Object, the NULL boolean values are converted to FALSE - instead of being null.
> As a consequence, the application making use of the Hive JDBC driver produces this (incorrect) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
>     DT4_ID DT4_TESTBOOL
> ---------- ------------
>          1            true
>          2            false
>          3            false
>          4            false
>          5            false
> ...instead of producing this (correct) output...
> SQL> select dt4_id, dt4_testbool from dt4_boolean;
>     DT4_ID DT4_TESTBOOL
> ---------- ------------
>          1            true
>          2            NULL
>          3            NULL
>          4            NULL
>          5            false
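
One way a NULL can silently turn into FALSE in Java is a string-to-boolean conversion that treats every non-"true" text as false, which is how the standard Boolean.parseBoolean behaves. Whether DynamicSerDe does exactly this is not confirmed by the report; the sketch below only contrasts that lossy conversion with a null-preserving one. BooleanNullDemo and its two methods are hypothetical names for illustration.

```java
public class BooleanNullDemo {
    /** Lossy: any text other than "true" (case-insensitive) becomes false. */
    public static boolean lossy(String field) {
        return Boolean.parseBoolean(field);
    }

    /** Null-preserving: only "true" or "false" map to a value; anything else is null. */
    public static Boolean nullable(String field) {
        if ("true".equalsIgnoreCase(field)) {
            return Boolean.TRUE;
        }
        if ("false".equalsIgnoreCase(field)) {
            return Boolean.FALSE;
        }
        return null;
    }

    public static void main(String[] args) {
        // The three distinct boolean field values from the row_str listing in step (5.1).
        for (String field : new String[] {"true", "NULL", "false"}) {
            System.out.println(field + " -> lossy=" + lossy(field)
                + ", nullable=" + nullable(field));
        }
    }
}
```

On the client side, JDBC's ResultSet.getBoolean returns false for a SQL NULL as well, so an application has to call wasNull() afterwards (or use getObject) to tell the two apart; that only helps if the driver itself preserved the null.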

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

