hbase-issues mailing list archives

From "Martin Fiala (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3170) RegionServer confused about empty row keys
Date Tue, 23 Oct 2012 11:01:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482257#comment-13482257 ]

Martin Fiala commented on HBASE-3170:
-------------------------------------

We are hitting this too, and the behaviour is really unexpected. Why should a get for an
empty key return the data of the first row in the table? Reproduced on CDH3u4 (0.90.6):

{code}
hbase(main):005:0> create 'emptykey', {NAME=>'data', VERSION=>1}
0 row(s) in 0.2070 seconds

hbase(main):011:0> get 'emptykey', ''
COLUMN                         CELL
0 row(s) in 0.0120 seconds

hbase(main):006:0> put 'emptykey', 'a', 'data:a', '1234'
0 row(s) in 0.1980 seconds

hbase(main):007:0> put 'emptykey', 'b', 'data:b', '5678'
0 row(s) in 0.0070 seconds

hbase(main):008:0> scan 'emptykey'
ROW                            COLUMN+CELL
 a                             column=data:a, timestamp=1350989443394, value=1234
 b                             column=data:b, timestamp=1350989450499, value=5678
2 row(s) in 0.0660 seconds

hbase(main):009:0> get 'emptykey', ''
COLUMN                         CELL
 data:a                        timestamp=1350989443394, value=1234
1 row(s) in 0.0120 seconds
{code}

The behaviour is the same when going through Thrift.

We can also see that an empty key is in fact supported:
{code}
hbase(main):012:0> put 'emptykey', '', 'data:c', '90'
0 row(s) in 0.0130 seconds

hbase(main):013:0> get 'emptykey', ''
COLUMN                         CELL
 data:c                        timestamp=1350989869682, value=90
1 row(s) in 0.0120 seconds

hbase(main):018:0> scan 'emptykey'
ROW                            COLUMN+CELL
                               column=data:c, timestamp=1350989869682, value=90
 a                             column=data:a, timestamp=1350989933922, value=1234
 b                             column=data:b, timestamp=1350989937820, value=5678
3 row(s) in 0.0180 seconds
{code}
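
To make the gap between expected and observed results concrete, here is a small toy model in Python. It is not HBase code and makes no claim about the actual RegionServer implementation. The class `ToyTable` and its methods are hypothetical names, and the rule "an empty key means start of table" inside `get_via_scan` is only an assumption chosen because it reproduces the three shell results above.

```python
import bisect


class ToyTable:
    """Toy sorted row store. NOT HBase code; it only models the transcripts."""

    def __init__(self):
        self._keys = []  # row keys, kept sorted
        self._rows = {}  # row key -> {column: value}

    def put(self, row, column, value):
        if row not in self._rows:
            bisect.insort(self._keys, row)
            self._rows[row] = {}
        self._rows[row][column] = value

    def get_exact(self, row):
        # Expected semantics: return cells only if this exact row exists.
        return dict(self._rows.get(row, {}))

    def get_via_scan(self, row):
        # ASSUMPTION (hypothetical): a get is served like a one-row scan, and
        # an empty start key is read as "start of table", not as a real key.
        if row == b"":
            first = self._keys[0] if self._keys else None
        else:
            first = row if row in self._rows else None
        return dict(self._rows[first]) if first is not None else {}


table = ToyTable()
empty_table_result = table.get_via_scan(b"")  # nothing stored yet

table.put(b"a", "data:a", "1234")
table.put(b"b", "data:b", "5678")
observed = table.get_via_scan(b"")  # matches the shell: row 'a' comes back
expected = table.get_exact(b"")     # what the reporters expected: nothing

table.put(b"", "data:c", "90")
after_put = table.get_via_scan(b"")  # the real empty-key row now sorts first
```

Under this assumption the empty-key lookup only looks correct once a row with an empty key actually exists, because that row sorts before every other key.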
                
> RegionServer confused about empty row keys
> ------------------------------------------
>
>                 Key: HBASE-3170
>                 URL: https://issues.apache.org/jira/browse/HBASE-3170
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89.20100621, 0.89.20100924, 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 0.92.0, 0.92.1
>            Reporter: Benoit Sigoure
>
> I'm no longer sure about the expected behavior when using an empty row key (e.g. a 0-byte
> long byte array).  I assumed that this was a legitimate row key, just like having an empty
> column qualifier is allowed.  But it seems that the RegionServer considers the empty row key
> to be whatever the first row key is.
> {code}
> Version: 0.89.20100830, r0da2890b242584a8a5648d83532742ca7243346b, Sat Sep 18 15:30:09 PDT 2010
> hbase(main):001:0> scan 'tsdb-uid', {LIMIT => 1}
> ROW                           COLUMN+CELL
>  \x00                         column=id:metrics, timestamp=1288375187699, value=foo
>  \x00                         column=id:tagk, timestamp=1287522021046, value=bar
>  \x00                         column=id:tagv, timestamp=1288111387685, value=qux
> 1 row(s) in 0.4610 seconds
> hbase(main):002:0> get 'tsdb-uid', ''
> COLUMN                        CELL
>  id:metrics                   timestamp=1288375187699, value=foo
>  id:tagk                      timestamp=1287522021046, value=bar
>  id:tagv                      timestamp=1288111387685, value=qux
> 3 row(s) in 0.0910 seconds
> hbase(main):003:0> get 'tsdb-uid', "\000"
> COLUMN                        CELL
>  id:metrics                   timestamp=1288375187699, value=foo
>  id:tagk                      timestamp=1287522021046, value=bar
>  id:tagv                      timestamp=1288111387685, value=qux
> 3 row(s) in 0.0550 seconds
> {code}
> This isn't a parsing problem with the command-line of the shell.  I can reproduce this
> behavior both with plain Java code and with my asynchbase client.
> Since I don't actually have a row with an empty row key, I expected that the first {{get}}
> would return nothing.

