hbase-dev mailing list archives

From "Izaak Rubin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-737) Scanner: every cell in a row has the same timestamp
Date Fri, 11 Jul 2008 23:50:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613019#action_12613019 ]

Izaak Rubin commented on HBASE-737:

I've done some investigating into the timestamp discrepancies.  In HRegionServer.next(long),
HStoreScanner.next(HStoreKey, Map<byte[],byte[]>) is called once per row to retrieve
Cell data for that row.  The HStoreKey contains the name of the row and a *single* timestamp
for that row.  When HRegionServer.next() constructs the actual Cell objects for a row, it
uses the same single timestamp from the HStoreKey.  This is why the scanners return the same
timestamp for every Cell in a row.  

It looks like, in order to fix the problem, the HStoreScanner will have to keep more
information for each cell.  Does the HStoreKey even need to store a timestamp if the cells
in a row do not share a single timestamp?
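
To make the behavior described above concrete, here is a minimal Java sketch.  It is an
illustration only: the class ScannerTimestampSketch, its nested Cell class, and the methods
sampleRow, stampWithRowTimestamp, and keepCellTimestamps are hypothetical stand-ins and are
not the real HBase classes or APIs.  The sample values are the ones shown in the issue
description.  The first method models rebuilding every cell with one row-level timestamp;
the second models one possible direction for a fix, where each cell carries its own
timestamp.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model; these are NOT the real HBase classes.
class ScannerTimestampSketch {

    /** A cell value together with the timestamp it was written at. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** The three cells of 'row2' as reported by the shell's get command. */
    static Map<String, Cell> sampleRow() {
        Map<String, Cell> row = new LinkedHashMap<>();
        row.put("fam1:letters", new Cell("def", 1215707612949L));
        row.put("fam1:numbers", new Cell("123", 1215707629064L));
        row.put("fam2:letters", new Cell("abc", 1215711498969L));
        return row;
    }

    /** Models the reported behavior: every cell is rebuilt with the one row-level timestamp. */
    static Map<String, Cell> stampWithRowTimestamp(Map<String, Cell> stored, long rowTimestamp) {
        Map<String, Cell> result = new LinkedHashMap<>();
        for (Map.Entry<String, Cell> e : stored.entrySet()) {
            // The cell's own timestamp is discarded here.
            result.put(e.getKey(), new Cell(e.getValue().value, rowTimestamp));
        }
        return result;
    }

    /** Models one possible fix: each cell's own timestamp travels with its value. */
    static Map<String, Cell> keepCellTimestamps(Map<String, Cell> stored) {
        Map<String, Cell> result = new LinkedHashMap<>();
        for (Map.Entry<String, Cell> e : stored.entrySet()) {
            result.put(e.getKey(), new Cell(e.getValue().value, e.getValue().timestamp));
        }
        return result;
    }

    static void print(String label, Map<String, Cell> row) {
        System.out.println(label);
        for (Map.Entry<String, Cell> e : row.entrySet()) {
            System.out.println(" " + e.getKey() + " timestamp=" + e.getValue().timestamp
                    + ", value=" + e.getValue().value);
        }
    }

    public static void main(String[] args) {
        Map<String, Cell> stored = sampleRow();
        // 1215711498969 is the row-level timestamp seen in the scan output.
        print("row-level timestamp:", stampWithRowTimestamp(stored, 1215711498969L));
        print("per-cell timestamps:", keepCellTimestamps(stored));
    }
}
```

With the row-level variant, all three cells come back with the same timestamp, matching
the scan output in the issue; with the per-cell variant, the timestamps match what get
reports.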

> Scanner: every cell in a row has the same timestamp
> ---------------------------------------------------
>                 Key: HBASE-737
>                 URL: https://issues.apache.org/jira/browse/HBASE-737
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.2.0
>            Reporter: Izaak Rubin
>            Priority: Minor
> A row can have multiple cells, and each cell can have a different timestamp.  The get
> command in the shell demonstrates that cells are being stored with different timestamps:
> {code}
> hbase(main):008:0> get 'table1', 'row2'  
> COLUMN                       CELL 
>  fam1:letters                timestamp=1215707612949, value=def 
>  fam1:numbers                timestamp=1215707629064, value=123 
>  fam2:letters                timestamp=1215711498969, value=abc 
> 3 row(s) in 0.0100 seconds
> {code}
> However, using the scanners to retrieve these cells shows that they all have the same
> timestamp:
> {code}
> hbase(main):009:0> scan 'table1'  
> ROW                          COLUMN+CELL
>  row2                        column=fam1:letters, timestamp=1215711498969, value=def
>  row2                        column=fam1:numbers, timestamp=1215711498969, value=123
>  row2                        column=fam2:letters, timestamp=1215711498969, value=abc
> 3 row(s) in 0.0600 seconds
> {code}
> The scanners are losing timestamp information somewhere along the line.

