hive-issues mailing list archives

From "david (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-12844) hive-1.2.1 doesn't return correct value when run select count query
Date Tue, 12 Jan 2016 06:11:40 GMT

     [ https://issues.apache.org/jira/browse/HIVE-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

david updated HIVE-12844:
-------------------------
    Description: 
In HBase 1.0.2, I created a table 'test1' with the following rows and values:
hbase(main):027:0> scan 'test1'
ROW     COLUMN+CELL
 a1     column=df1:a2, timestamp=1452505991743, value=ddd
 a1     column=df1:a3, timestamp=1452506082723, value=eee
 a1     column=df1:c2, timestamp=1452505705391, value=bbb
 b1     column=df1:a2, timestamp=1452505838737, value=ccc
 b1     column=df1:a3, timestamp=1452506149461, value=fff
 r1     column=df1:a, timestamp=1452507261849, value=hhh
 r1     column=df1:a1, timestamp=1452507100774, value=ggg
 r1     column=df1:c1, timestamp=1451221711588, value=aaa

Then I created an external table in Hive 1.2.1:
create external table test3(
          key string,
          coll string,
          col2 string,
          col3 string,
          col4 string,
          col5 string,
          col6 string)
          STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
          WITH SERDEPROPERTIES
          ("hbase.columns.mapping" =
          ":key,df1:a,df1:1,df1:a2,df1:a3,df1:c1,df1:c2")
          TBLPROPERTIES("hbase.table.name" = "test1"); 

When I run this query in Hive:
hive> select * from test3;
OK
a1      NULL    NULL    ddd     eee     NULL    bbb
b1      NULL    NULL    ccc     fff     NULL    NULL
r1      hhh     NULL    NULL    NULL    aaa     NULL
The result is correct, but when I run:
select count(1) from test3;
Total MapReduce CPU Time Spent: 6 seconds 770 msec
OK
1
it returns "1". It seems that count(1) does not count the rows where the first non-key column (coll, mapped to df1:a) is NULL.
Could you help analyze this?
By the way, the Hadoop version is 2.6.0.
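
The observation above can be checked against the reported data with a small model. This is only a sketch in Python, not Hive's implementation: the names `cells`, `mapping` and `project` are hypothetical, and the "count only rows whose first mapped column is non-NULL" rule is the reporter's hypothesis, not a confirmed cause.

```python
# Cells copied from the `scan 'test1'` output: (row key, column) -> value.
cells = {
    ("a1", "df1:a2"): "ddd",
    ("a1", "df1:a3"): "eee",
    ("a1", "df1:c2"): "bbb",
    ("b1", "df1:a2"): "ccc",
    ("b1", "df1:a3"): "fff",
    ("r1", "df1:a"): "hhh",
    ("r1", "df1:a1"): "ggg",
    ("r1", "df1:c1"): "aaa",
}

# Non-key entries of "hbase.columns.mapping", in the order given in the DDL.
mapping = ["df1:a", "df1:1", "df1:a2", "df1:a3", "df1:c1", "df1:c2"]


def project(cells, mapping):
    """Build one tuple per row key: the key, then value-or-None per mapped column."""
    keys = sorted({key for key, _ in cells})
    return [(key, *[cells.get((key, col)) for col in mapping]) for key in keys]


rows = project(cells, mapping)

# What count(1) is expected to return: every row key counts once.
expected_count = len(rows)

# Hypothetical model of the reported behaviour: only rows whose first
# mapped column (df1:a) is non-NULL are counted.
observed_count = sum(1 for row in rows if row[1] is not None)

for row in rows:
    print(row)
print("expected:", expected_count, "modelled observed:", observed_count)
```

Under this model the projected rows match the `select *` output, the expected count is 3, and the hypothesis reproduces the reported count of 1, because only r1 has a value in df1:a.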



> hive-1.2.1 doesn't return correct value when run select count query
> -------------------------------------------------------------------
>
>                 Key: HIVE-12844
>                 URL: https://issues.apache.org/jira/browse/HIVE-12844
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>            Reporter: david
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
