hbase-issues mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8809) Include deletes in the scan (setRaw) method does not respect the time range or the filter
Date Fri, 05 Jul 2013 20:39:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701107#comment-13701107 ]

Jesse Yates commented on HBASE-8809:

As a slight follow-up to this, it feels like raw scans should also ignore the column version/timestamp
filtering. In particular, I'm talking about this section in ScanQueryMatcher:
    MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength,
        timestamp, type, kv.getMemstoreTS() > maxReadPointToTrackVersions);
    /*
     * According to current implementation, colChecker can only be
     * SEEK_NEXT_COL, SEEK_NEXT_ROW, SKIP or INCLUDE. Therefore, always return
     * the MatchCode. If it is SEEK_NEXT_ROW, also set stickyNextRow.
     */

Here the ScanWildcardColumnTracker does not ignore the timestamp in the simple case: with four
puts to the same row at different timestamps, the oldest is dropped by default, even though
it's still "present" in the store, regardless of the rawness of the scan.
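
To make the four-puts case concrete, here is a minimal sketch of per-column version counting. It is a toy model, not the real ScanWildcardColumnTracker: the class name, the method, the ignoreVersionLimit flag, and the version limit of 3 are all assumptions made for illustration, and it treats the four puts as going to the same column.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of version counting for one column; NOT HBase's ScanWildcardColumnTracker. */
final class VersionCountSketch {

    /**
     * Returns the timestamps that survive version counting.
     * Input must be ordered newest first, the order in which versions of a cell are scanned.
     * ignoreVersionLimit is a hypothetical switch modelling "raw scans skip the version check".
     */
    static List<Long> visibleTimestamps(long[] newestFirst, int maxVersions,
                                        boolean ignoreVersionLimit) {
        List<Long> visible = new ArrayList<>();
        int seen = 0;
        for (long ts : newestFirst) {
            seen++;
            // Versions beyond the limit are dropped unless the limit is ignored.
            if (ignoreVersionLimit || seen <= maxVersions) {
                visible.add(ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long[] puts = {400L, 300L, 200L, 100L}; // four puts, different timestamps
        int assumedMaxVersions = 3;             // assumption, not read from any config

        // Version limit applied: the oldest put (100) is dropped.
        System.out.println(visibleTimestamps(puts, assumedMaxVersions, false)); // [400, 300, 200]
        // Limit ignored, as suggested for raw scans: all four are returned.
        System.out.println(visibleTimestamps(puts, assumedMaxVersions, true));  // [400, 300, 200, 100]
    }
}
```

The only point of the sketch is that the oldest put is removed by counting alone, independent of whether the cell still exists in the store.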

> Include deletes in the scan (setRaw) method does not respect the time range or the filter
> -----------------------------------------------------------------------------------------
>                 Key: HBASE-8809
>                 URL: https://issues.apache.org/jira/browse/HBASE-8809
>             Project: HBase
>          Issue Type: Bug
>          Components: Scanners
>            Reporter: Vasu Mariyala
>            Assignee: Lars Hofhansl
>             Fix For: 0.98.0, 0.95.2, 0.94.10
>         Attachments: 8809-0.94.txt, 8809-trunk.txt, DeleteMarkers.doc
> If a row has been deleted at time stamp 'T' and a scan with time range (0, T-1) is executed,
> it still returns the delete marker at time stamp 'T'. It is because of the code in ScanQueryMatcher.java:
> {code}
>       if (retainDeletesInOutput
>           || (!isUserScan && (EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes)
>           || kv.getMemstoreTS() > maxReadPointToTrackVersions) {
>         // always include or it is not time yet to check whether it is OK
>         // to purge deletes or not
>         return MatchCode.INCLUDE;
>       }
> {code}
> The assumption is that a scan (even with setRaw set to true) should respect the filters and
> the time range specified.
> Please let me know if you think this behavior can be changed so that I can provide a
> patch for it.
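
The difference between the reported and the requested behavior can be sketched as follows. This is a simplified model, not the patch attached to the issue: the class, the two methods, and the two-value MatchCode enum are hypothetical, the purge-time and memstore read-point conditions from the snippet above are left out, and the time range is assumed to be half-open, [minTs, maxTs).

```java
/** Toy model of the delete-marker decision; NOT the real ScanQueryMatcher. */
final class DeleteMarkerSketch {

    enum MatchCode { INCLUDE, SKIP }

    /**
     * Reported behavior: a retained delete marker is included unconditionally.
     * The time range arguments are accepted but deliberately ignored.
     */
    static MatchCode reported(boolean retainDeletesInOutput,
                              long markerTs, long minTs, long maxTs) {
        return retainDeletesInOutput ? MatchCode.INCLUDE : MatchCode.SKIP;
    }

    /**
     * Requested behavior: a retained delete marker is included only if its
     * timestamp falls inside the scan's time range (assumed [minTs, maxTs)).
     */
    static MatchCode requested(boolean retainDeletesInOutput,
                               long markerTs, long minTs, long maxTs) {
        boolean inRange = markerTs >= minTs && markerTs < maxTs;
        return (retainDeletesInOutput && inRange) ? MatchCode.INCLUDE : MatchCode.SKIP;
    }

    public static void main(String[] args) {
        long t = 1000L;                // row deleted at time stamp T
        long minTs = 0L, maxTs = t - 1; // scan time range (0, T-1)

        System.out.println(reported(true, t, minTs, maxTs));     // INCLUDE
        System.out.println(requested(true, t, minTs, maxTs));    // SKIP
        // A delete marker inside the range is still returned by a raw scan.
        System.out.println(requested(true, 500L, minTs, maxTs)); // INCLUDE
    }
}
```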

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
