hbase-issues mailing list archives

From "Alex Newman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-6912) Filters are not properly applied to scans, to the first entry in the scan.
Date Mon, 01 Oct 2012 23:47:07 GMT

     [ https://issues.apache.org/jira/browse/HBASE-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Newman updated HBASE-6912:
-------------------------------

    Description: 
Steps to reproduce:

Create a table, load data into it, and flush the table.

Do a scan with
1. a filter that should not match the first entry in the scan, and
2. a specific family and column.
The first entry is returned even though it does not match the filter.

It looks like the first KeyValue of a scan on the column, from the point of view of this
check in HRegion.java

    } else if (kv != null && !kv.isInternal() && filterRowKey(currentRow)) {

(THE FIRST ENTRY IS STILL INTERNAL AT THIS POINT)

is generated by

    public static KeyValue createLastOnRow(final byte [] row,
        final int roffset, final int rlength, final byte [] family,
        final int foffset, final int flength, final byte [] qualifier,
        final int qoffset, final int qlength) {
      return new KeyValue(row, roffset, rlength, family, foffset, flength,
          qualifier, qoffset, qlength, HConstants.OLDEST_TIMESTAMP, Type.Minimum,
          null, 0, 0);
    }

It is built with Type.Minimum, so it is always internal at that point in the code.

Only later, the second time through this loop in StoreScanner.java

    public synchronized boolean next(List<KeyValue> outResult, int limit, String metric)
        throws IOException {
      ....
      LOOP: while((kv = this.heap.peek()) != null) {

do we get the actual kv, with a proper type and timestamp. This seems to mess with filtering.
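
To make the suspected interaction concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions, not HBase code: Cell, scan, and the row names are hypothetical stand-ins, and isInternal is assumed to mean "the key has type Minimum". It compares a scan whose first peeked key is a placeholder from a createLastOnRow analogue with one whose first peeked key is a real entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: the names echo the report, but none of this is HBase code.
class FirstRowFilterSketch {

    enum Type { MINIMUM, PUT }

    // Stand-in for KeyValue, reduced to a row key and a type.
    static final class Cell {
        final String row;
        final Type type;

        Cell(String row, Type type) {
            this.row = row;
            this.type = type;
        }

        // Assumption: a key of type Minimum is a seek placeholder, hence "internal".
        boolean isInternal() {
            return type == Type.MINIMUM;
        }
    }

    // Hypothetical analogue of createLastOnRow: a placeholder key, not stored data.
    static Cell createLastOnRow(String row) {
        return new Cell(row, Type.MINIMUM);
    }

    // Filter that keeps only row3; returning true means "drop this row".
    static final Predicate<String> DROP_ALL_BUT_ROW3 = row -> !row.equals("row3");

    // Walks one peeked cell per row and applies the row-key filter.
    static List<String> scan(List<Cell> peeked, Predicate<String> filterRowKey) {
        List<String> returned = new ArrayList<>();
        for (Cell kv : peeked) {
            // Same shape as the quoted HRegion.java condition (null check omitted,
            // since this list never contains null).
            if (!kv.isInternal() && filterRowKey.test(kv.row)) {
                continue; // row filtered out
            }
            returned.add(kv.row);
        }
        return returned;
    }

    // First peeked key is the placeholder, as described in the report.
    static List<String> demoWithFakeFirstKey() {
        return scan(Arrays.asList(createLastOnRow("row1"),
                new Cell("row2", Type.PUT), new Cell("row3", Type.PUT)), DROP_ALL_BUT_ROW3);
    }

    // First peeked key is a real entry, for comparison.
    static List<String> demoWithRealFirstKey() {
        return scan(Arrays.asList(new Cell("row1", Type.PUT),
                new Cell("row2", Type.PUT), new Cell("row3", Type.PUT)), DROP_ALL_BUT_ROW3);
    }

    public static void main(String[] args) {
        System.out.println(demoWithFakeFirstKey()); // [row1, row3]
        System.out.println(demoWithRealFirstKey()); // [row3]
    }
}
```

In this model row1 is returned only when its first peeked key is the placeholder, which matches the symptom above. The model does not suggest a fix; it only shows why skipping the row-key filter for an internal key could let the first entry through.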


        Summary: Filters are not properly applied to scans, to the first entry in the scan.
  (was: Filters are not properly applied in certain cases)
    
> Filters are not properly applied to scans, to the first entry in the scan. 
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-6912
>                 URL: https://issues.apache.org/jira/browse/HBASE-6912
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.1
>            Reporter: Alex Newman
>         Attachments: minimalTest.java
>
