hive-dev mailing list archives

From "Usein Faradzhev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9050) NULL values for empty strings when joining with ORC table
Date Tue, 09 Dec 2014 08:01:13 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239108#comment-14239108 ]

Usein Faradzhev commented on HIVE-9050:
---------------------------------------

Example

CREATE TABLE master AS SELECT id FROM default.dual LATERAL VIEW explode(split('1,2', ',')) s AS id;
CREATE TABLE detail(id int, str string) STORED AS ORC TBLPROPERTIES("orc.compress"="SNAPPY");
INSERT INTO TABLE detail SELECT 1 AS id, str FROM default.dual LATERAL VIEW explode(split(',', ',')) s AS str;

Without a join, the values are returned as empty strings:
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail d;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY

With a join, the values are returned as NULL instead of empty strings:
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail d
JOIN master m ON m.id = d.id;
d.id    d.str   value_type
1       NULL    IS_NULL
1       NULL    IS_NULL

With the TEXTFILE format, both queries return empty strings:
CREATE TABLE detail_txt(id int, str string) STORED AS TEXTFILE;
INSERT INTO TABLE detail_txt SELECT 1 AS id, str FROM default.dual LATERAL VIEW explode(split(',', ',')) s AS str;
SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail_txt d;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY

SELECT
d.*,
(CASE WHEN d.str IS NULL THEN 'IS_NULL'
WHEN d.str = '' THEN 'IS_EMPTY'
ELSE 'NOT_EMPTY'
END) value_type
FROM detail_txt d
JOIN master m ON m.id = d.id;
d.id    d.str   value_type
1               IS_EMPTY
1               IS_EMPTY
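
A more compact check, as a sketch only: the two queries below count NULL and empty-string values of detail.str without and with the join, so the difference shows up in a single row each. The column aliases (variant, null_count, empty_count) are illustrative names that do not appear in the original report. Going by the results above, one would expect 0 NULLs and 2 empty strings without the join, and on an affected build 2 NULLs and 0 empty strings with it.

```sql
-- Without the join: count NULL vs. empty values read from the ORC table.
SELECT
  'no_join' AS variant,
  SUM(CASE WHEN d.str IS NULL THEN 1 ELSE 0 END) AS null_count,
  SUM(CASE WHEN d.str = '' THEN 1 ELSE 0 END) AS empty_count
FROM detail d;

-- With the join: same counts after joining to master.
SELECT
  'with_join' AS variant,
  SUM(CASE WHEN d.str IS NULL THEN 1 ELSE 0 END) AS null_count,
  SUM(CASE WHEN d.str = '' THEN 1 ELSE 0 END) AS empty_count
FROM detail d
JOIN master m ON m.id = d.id;
```

Replacing detail with detail_txt in both queries gives the TEXTFILE comparison.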


> NULL values for empty strings when joining with ORC table
> ---------------------------------------------------------
>
>                 Key: HIVE-9050
>                 URL: https://issues.apache.org/jira/browse/HIVE-9050
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.13.0
>         Environment: CentOS release 6.4 (Final), Hortonworks 2.1, Tez
> Hive 0.13.0.2.1.3.0-563
> Subversion git://ip-10-0-0-91/grid/0/jenkins/workspace/BIGTOP-HDP_RPM_REPO-baikal-GA-centos6/bigtop/build/hive/rpm/BUILD/hive-0.13.0.2.1.3.0 -r a738a76c72d6d9dd304691faada57a94429256bc
> Compiled by jenkins on Thu Jun 26 18:28:50 EDT 2014
> From source with checksum 4dbd99dd254f0c521ad8ab072045325d
>            Reporter: Usein Faradzhev
>
> When an ORC table contains empty strings and the SQL query contains at least one join,
> Hive returns NULL instead of the empty strings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
