hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-613) Timestamp-anchored scanning fails to find all records
Date Fri, 20 Jun 2008 01:59:45 GMT

     [ https://issues.apache.org/jira/browse/HBASE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-613:
--------------------------------

    Attachment: 613.patch

HAbstractScanner
- remove HAbstractScanner.iterator() - iterator is not a method on InternalScanner

HRegion
- make getScanner more efficient by iterating only once to find the stores we need to scan
- pass only the columns relevant to a store to an HStoreScanner
- remove HScanner.iterator() - iterator is not a method on InternalScanner
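
To illustrate the getScanner change, here is a minimal sketch of grouping the requested columns by family in a single pass, so that each store's scanner could be handed only its own columns. It is not the code in 613.patch: the class and method names are hypothetical, and it assumes column names of the form "family:qualifier".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the HRegion code in 613.patch.
final class StoreColumnGrouping {
    /**
     * Groups the requested columns by family in one pass over the input.
     * Each map entry corresponds to one store that needs a scanner, and its
     * value holds only the columns relevant to that store.
     */
    static Map<String, List<String>> groupByFamily(List<String> columns) {
        Map<String, List<String>> byFamily = new LinkedHashMap<>();
        for (String column : columns) {
            int colon = column.indexOf(':');
            // Assumption: a name without ':' is treated as a bare family name.
            String family = colon < 0 ? column : column.substring(0, colon);
            byFamily.computeIfAbsent(family, k -> new ArrayList<>()).add(column);
        }
        return byFamily;
    }
}
```

With the input "info:a", "data:x", "info:b", the result has two entries, so only two stores would be scanned and neither would see the other's columns.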

MemcacheScanner
- never return HConstants.LATEST_TIMESTAMP as the timestamp value for a row. Instead, use the
largest timestamp from the cells being returned. This gives the scanner a concrete timestamp
that can be used to fetch the same data again should new versions be inserted later.
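
The idea can be sketched as follows. This is an illustration under stated assumptions, not the MemcacheScanner code in the patch: the class and method are hypothetical, cell timestamps are modelled as plain long values, and LATEST_TIMESTAMP is a local stand-in for HConstants.LATEST_TIMESTAMP, assumed here to equal Long.MAX_VALUE.

```java
import java.util.List;

// Hypothetical illustration; not the MemcacheScanner code in 613.patch.
final class RowTimestamp {
    // Local stand-in for HConstants.LATEST_TIMESTAMP (assumed to be Long.MAX_VALUE).
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /**
     * Returns the largest timestamp among the cells being returned for a row,
     * so the caller gets a concrete value instead of the LATEST_TIMESTAMP sentinel.
     */
    static long rowTimestamp(List<Long> cellTimestamps) {
        if (cellTimestamps.isEmpty()) {
            throw new IllegalArgumentException("row has no cells");
        }
        long max = Long.MIN_VALUE;
        for (long ts : cellTimestamps) {
            max = Math.max(max, ts);
        }
        return max;
    }
}
```

A later scan anchored at the returned value would see the same cells even if newer versions have been written in the meantime, which a scan anchored at the sentinel would not.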

StoreFileScanner
- getNextViableRow would find a row that matched the row key, but did not consider the requested
timestamp. Now, if the row it finds has a timestamp greater than the requested one, it keeps
advancing to see whether a row with a timestamp less than or equal to the requested one exists.
This works because timestamps are sorted in descending order.
- removed an unnecessary else
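
The advancing logic described for getNextViableRow can be sketched like this. It is a simplified illustration, not the StoreFileScanner code in the patch: the class and method are hypothetical, and the versions of one row are modelled as an array of timestamps already sorted in descending order.

```java
// Hypothetical illustration; not the StoreFileScanner code in 613.patch.
final class ViableVersion {
    /**
     * Given the timestamps of one row's versions in descending order, returns
     * the index of the first version whose timestamp is less than or equal to
     * the requested one, or -1 if every version is newer than requested.
     */
    static int firstAtOrBefore(long[] descendingTimestamps, long requested) {
        for (int i = 0; i < descendingTimestamps.length; i++) {
            // Versions newer than the requested timestamp are skipped, not
            // treated as a match just because the row key matched.
            if (descendingTimestamps[i] <= requested) {
                return i;
            }
        }
        return -1;
    }
}
```

For versions at 300, 200 and 100 and a requested timestamp of 250, the version at 300 is skipped and the one at 200 is returned. Stopping at the first row-key match, as the old behaviour is described, would have landed on 300.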

Timestamp
- The program that was used to find the problem and test the fix.

TestScanMultipleVersions
- Test program that fails on current trunk but passes when this patch is applied.

NOTE: TestHRegionServerExit failed on both Windows and Linux, but TestRegionRebalancing passed
on Linux and failed on Windows.

All other tests passed, and when I ran TestScanMultipleVersions against unpatched trunk, it
failed.

Please review.


> Timestamp-anchored scanning fails to find all records
> -----------------------------------------------------
>
>                 Key: HBASE-613
>                 URL: https://issues.apache.org/jira/browse/HBASE-613
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 613.patch, nogood.patch, TestTimestampScanning.java, Timestamp.patch
>
>
> If I add 3 versions of a cell and then scan across the first set of added cells using
> a timestamp that should only get values from the first upload, a bunch are missing (I added
> 100k on each of the three uploads).  I thought it was the fact that we set the number of cells
> found back to 1 in HStore when we move off the current row/column, but that doesn't seem to be
> it.  I also tried upping the MAX_VERSIONs on my table and that seemed to have no effect.
> Need to look closer.
> Build a unit test because replicating on cluster takes too much time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

