hbase-issues mailing list archives

From "Zheng Hu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-18368) FilterList with multiple FamilyFilters concatenated by OR does not work.
Date Tue, 17 Oct 2017 10:22:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-18368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-18368:
    Attachment: next-row-behavior-in-regionScanner-and-storeScanner.jpg

I uploaded an image describing the next-row behavior in RegionScanner and StoreScanner.


Assume that there are two column families, cf1 and cf2. For StoreScanner, the NEXT_ROW return
code skips to the next row in the current family, cf1 (as the red line shows). For RegionScanner,
the StoreScanner of cf1 also skips to the next row in family cf1, but the storeHeap in RegionScanner
then chooses the minimal store scanner, which is the one for family cf2, to read the next cell. So
RegionScanner actually does two steps: it skips to the next row in family cf1, and it switches
its current store scanner to cf2. That is why we can optimize FamilyFilter
with the NEXT_ROW return code (as the blue line shows).

Here we can define the NEXT_ROW return code more clearly: at the CF level, NEXT_ROW skips
to the next row in the current family; at the region level, NEXT_ROW skips to the next row
in the current family and switches to the next family for RegionScanner.
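
The two levels described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: `Cell`, `ToyStoreScanner`, and `minScanner` are hypothetical stand-ins written for this example and are not the HBase classes, and the data (rows "0" and "1" in cf1 and cf2) is made up. It only shows why a CF-level skip in cf1 leads the region-level heap to read from cf2 next.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NextRowSketch {

    // Toy stand-in for a cell (NOT the HBase Cell): row, family, qualifier only.
    static final class Cell {
        final String row, family, qualifier;
        Cell(String row, String family, String qualifier) {
            this.row = row; this.family = family; this.qualifier = qualifier;
        }
        @Override public String toString() { return row + "/" + family + ":" + qualifier; }
    }

    // Ordering assumed for this sketch: row, then family, then qualifier.
    static final Comparator<Cell> ORDER = Comparator
            .comparing((Cell c) -> c.row)
            .thenComparing(c -> c.family)
            .thenComparing(c -> c.qualifier);

    // Hypothetical per-family scanner over cells that are already sorted.
    static final class ToyStoreScanner {
        private final List<Cell> cells;
        private int pos = 0;
        ToyStoreScanner(Cell... cells) { this.cells = Arrays.asList(cells); }
        Cell peek() { return pos < cells.size() ? cells.get(pos) : null; }
        // CF-level NEXT_ROW: skip the rest of the current row in this family only.
        void nextRow() {
            Cell cur = peek();
            if (cur == null) return;
            while (peek() != null && peek().row.equals(cur.row)) pos++;
        }
    }

    // Region-level choice: the scanner whose current cell is smallest.
    static ToyStoreScanner minScanner(List<ToyStoreScanner> scanners) {
        ToyStoreScanner best = null;
        for (ToyStoreScanner s : scanners) {
            if (s.peek() == null) continue;
            if (best == null || ORDER.compare(s.peek(), best.peek()) < 0) best = s;
        }
        return best;
    }

    public static List<String> demo() {
        ToyStoreScanner cf1 = new ToyStoreScanner(
                new Cell("0", "cf1", "col_a"), new Cell("1", "cf1", "col_a"));
        ToyStoreScanner cf2 = new ToyStoreScanner(
                new Cell("0", "cf2", "col_b"), new Cell("1", "cf2", "col_b"));
        List<ToyStoreScanner> heap = Arrays.asList(cf1, cf2);

        List<String> trace = new ArrayList<>();
        ToyStoreScanner current = minScanner(heap);
        trace.add(current.peek().toString());          // first cell read: row 0 in cf1
        current.nextRow();                             // CF-level skip inside cf1
        trace.add(cf1.peek().toString());              // cf1 now points at row 1
        trace.add(minScanner(heap).peek().toString()); // region level switches to cf2
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [0/cf1:col_a, 1/cf1:col_a, 0/cf2:col_b]
    }
}
```

In this toy model the cf1 scanner has moved on to row 1, yet the next cell read at the region level is still row 0 in cf2, which matches the two-step behavior described for RegionScanner.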

So the patch for this issue will be easy:

1. Make the NEXT_ROW definition clearer in the JavaDoc.
2. Keep the behavior consistent with the definition of NEXT_ROW.

> FilterList with multiple FamilyFilters concatenated by OR does not work.
> ------------------------------------------------------------------------
>                 Key: HBASE-18368
>                 URL: https://issues.apache.org/jira/browse/HBASE-18368
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Filters
>    Affects Versions: 3.0.0, 2.0.0-alpha-1
>            Reporter: Peter Somogyi
>            Assignee: Zheng Hu
>            Priority: Critical
>         Attachments: HBASE-18368.branch-1.patch, HBASE-18368.branch-1.v2.patch, HBASE-18368.branch-1.v3.patch,
> HBASE-18368.patch, HBASE-18368.v2.patch, HBASE-18368.v3.patch, HBASE-18368.v3.patch, next-row-behavior-in-regionScanner-and-storeScanner.jpg
> Scan gives back incomplete list if multiple filters are combined with OR / MUST_PASS_ONE.
> Using 2 FamilyFilters in a FilterList using MUST_PASS_ONE operator will give back results
> for only the first Filter.
> {code:java|title=Test code}
>   @Test
>   public void testFiltersWithOr() throws Exception {
>     TableName tn = TableName.valueOf("MyTest");
>     Table table = utility.createTable(tn, new String[] {"cf1", "cf2"});
>     byte[] CF1 = Bytes.toBytes("cf1");
>     byte[] CF2 = Bytes.toBytes("cf2");
>     Put put1 = new Put(Bytes.toBytes("0"));
>     put1.addColumn(CF1, Bytes.toBytes("col_a"), Bytes.toBytes(0));
>     table.put(put1);
>     Put put2 = new Put(Bytes.toBytes("0"));
>     put2.addColumn(CF2, Bytes.toBytes("col_b"), Bytes.toBytes(0));
>     table.put(put2);
>     FamilyFilter filterCF1 = new FamilyFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(CF1));
>     FamilyFilter filterCF2 = new FamilyFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(CF2));
>     FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
>     filterList.addFilter(filterCF1);
>     filterList.addFilter(filterCF2);
>     Scan scan = new Scan();
>     scan.setFilter(filterList);
>     ResultScanner scanner = table.getScanner(scan);
>     System.out.println(filterList);
>     for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
>       System.out.println(rr);
>     }
>   }
> {code}
> {noformat:title=Output}
> FilterList OR (2/2): [FamilyFilter (EQUAL, cf1), FamilyFilter (EQUAL, cf2)]
> keyvalues={0/cf1:col_a/1499852754957/Put/vlen=4/seqid=0}
> {noformat}

This message was sent by Atlassian JIRA
