hive-dev mailing list archives

From "Aleksey Vovchenko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-16741) Counting number of records in hive and hbase are different for NULL fields in hive
Date Tue, 23 May 2017 16:33:04 GMT
Aleksey Vovchenko created HIVE-16741:
----------------------------------------

             Summary: Counting number of records in hive and hbase are different for NULL fields in hive
                 Key: HIVE-16741
                 URL: https://issues.apache.org/jira/browse/HIVE-16741
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.1.0, 1.2.0
            Reporter: Aleksey Vovchenko
            Assignee: Aleksey Vovchenko


Steps to reproduce:

STEP 1.  

hbase> create 'testTable',{NAME=>'cf'}

STEP 2.
put 'testTable','10','cf:Address','My Address 411002'
put 'testTable','10','cf:contactId','653638'
put 'testTable','10','cf:currentStatus','Awaiting'
put 'testTable','10','cf:createdAt','1452815193'
put 'testTable','10','cf:Id','10'


put 'testTable','15','cf:contactId','653638'
put 'testTable','15','cf:currentStatus','Awaiting'
put 'testTable','15','cf:createdAt','1452815193'
put 'testTable','15','cf:Id','15'
(Note: the Address column is not provided for row '15', so it is NULL in Hive.)

put 'testTable','20','cf:Address','My Address 411003'
put 'testTable','20','cf:contactId','653638'
put 'testTable','20','cf:currentStatus','Awaiting'
put 'testTable','20','cf:createdAt','1452815193'
put 'testTable','20','cf:Id','20'


put 'testTable','17','cf:Address','My Address 411003'
put 'testTable','17','cf:currentStatus','Awaiting'
put 'testTable','17','cf:createdAt','1452815193'
put 'testTable','17','cf:Id','17'
(Note: the contactId column is not provided for row '17', so it is NULL in Hive.)

STEP 3.

hive> CREATE EXTERNAL TABLE hh_testTable(Id string, Address string, contactId string, currentStatus string, createdAt string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping"=":key,cf:Address,cf:contactId,cf:currentStatus,cf:createdAt")
TBLPROPERTIES ("hbase.table.name"="testTable");
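
For readers less familiar with the HBase storage handler: the entries in "hbase.columns.mapping" line up positionally with the Hive columns, so Id is bound to the row key and cf:Id is never read. The Python sketch below only illustrates that pairing; the helper name pair_columns is hypothetical and is not part of Hive.

```python
# Hive columns and the mapping string, copied from the CREATE TABLE statement above.
hive_columns = ["Id", "Address", "contactId", "currentStatus", "createdAt"]
mapping = ":key,cf:Address,cf:contactId,cf:currentStatus,cf:createdAt"


def pair_columns(hive_columns, mapping):
    """Pair each Hive column with the HBase entry in the same position (hypothetical helper)."""
    hbase_columns = mapping.split(",")
    if len(hbase_columns) != len(hive_columns):
        raise ValueError("mapping needs exactly one entry per Hive column")
    return dict(zip(hive_columns, hbase_columns))


pairs = pair_columns(hive_columns, mapping)
for hive_col, hbase_col in pairs.items():
    print(hive_col, "->", hbase_col)
```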

STEP 4.

hive> select count(*),contactid from hh_testTable group by contactid;

Actual result:
OK
3	653638

Expected result:
OK
1	NULL
3	653638
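
To make the expected grouping concrete, here is a small Python model of the data from STEP 2. It is only a sketch of the semantics the report expects (a row with no contactId cell lands in its own NULL group) and does not touch Hive or HBase; the dictionary layout and the name count_by_contact_id are assumptions made for illustration.

```python
from collections import Counter

# Rows as inserted in STEP 2, keyed by row key. Only the columns relevant to
# the query are listed; a column that was never put is simply absent.
rows = {
    "10": {"Address": "My Address 411002", "contactId": "653638"},
    "15": {"contactId": "653638"},
    "20": {"Address": "My Address 411003", "contactId": "653638"},
    "17": {"Address": "My Address 411003"},
}


def count_by_contact_id(rows):
    """Count rows per contactId; a missing cell is treated as None (NULL)."""
    return Counter(cells.get("contactId") for cells in rows.values())


counts = count_by_contact_id(rows)
for contact_id, n in counts.items():
    print(n, "NULL" if contact_id is None else contact_id)
```

In this model row '17' is counted under NULL rather than being dropped, which is the behaviour described under "Expected result".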




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
