hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1906) FilterList of prefix and columnvalue not working properly with deletes and multiple values
Date Fri, 16 Oct 2009 05:22:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766420#action_12766420 ]

stack commented on HBASE-1906:
------------------------------

Here is some more detail on this issue.

The illustrative code created a table with 5 column families and added values.  It then set
up a scanner that used a FilterList of two Filters against one of the column families.  The
first filter was a prefix filter.  The second was a test on the cell content.  The behavior wanted
was that only rows that matched both the prefix and the supplied cell value should be returned.
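
To pin down the wanted behavior, here is a minimal sketch in plain Java.  It is a toy model
only: the class PrefixAndValueSketch, its methods, the row keys and the column names are all
made up for illustration and are not HBase API.  A row is returned only when its key starts
with the prefix AND the tested column holds the wanted value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only. PrefixAndValueSketch and its methods are hypothetical
// names for illustration; nothing here is HBase API.
class PrefixAndValueSketch {

  // Returns the keys of rows whose key starts with 'prefix' AND whose
  // 'column' holds exactly 'wanted'. Both conditions must hold.
  static List<String> matchingRows(Map<String, Map<String, String>> table,
      String prefix, String column, String wanted) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
      boolean prefixOk = row.getKey().startsWith(prefix);
      boolean valueOk = wanted.equals(row.getValue().get(column));
      if (prefixOk && valueOk) {
        out.add(row.getKey());
      }
    }
    return out;
  }

  // Three made-up rows, inserted in sorted key order.
  static Map<String, Map<String, String>> sampleTable() {
    Map<String, Map<String, String>> table = new LinkedHashMap<>();
    table.put("ab1", cells("a", "1", "c", "wanted"));
    table.put("ab2", cells("a", "2", "c", "other"));
    table.put("zz1", cells("c", "wanted"));
    return table;
  }

  // Builds one row from alternating column, value arguments.
  static Map<String, String> cells(String... kv) {
    Map<String, String> row = new LinkedHashMap<>();
    for (int i = 0; i + 1 < kv.length; i += 2) {
      row.put(kv[i], kv[i + 1]);
    }
    return row;
  }

  public static void main(String[] args) {
    // Only ab1 has both the "ab" prefix and c=wanted.
    System.out.println(matchingRows(sampleTable(), "ab", "c", "wanted")); // prints [ab1]
  }
}
```

In this model, ab2 is the kind of row the bug let through: it matches the prefix but does not
have the required cell content.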

Before the fix was applied, we would do the right thing, returning rows that matched on prefix
and cell value, but then we'd tack part of a row onto the end of the result set.  That row's id
would match the prefix filter but it would not have the required cell content.  From it we'd
return all columns that sorted before the column holding the cell the filter was testing.

The illustrating code then threw in deletes of the cell we were testing on, but we were still
returning the partial row (IIRC).

What was happening was that there was a code path whereby we could leave the internal next
loop without calling the filter's filterRow method.  This method, if given the chance,
knocks out rows that don't match on both supplied filters.  Skipping out without invoking
it let through candidate results that should have been suppressed.
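
The failure mode described above can be modelled with a small self-contained sketch.  This is
a deliberately simplified toy, not the region scanner: FilterRowSketch, scan and the
alwaysCallFilterRow flag are hypothetical names, each row carries a single tested value, and
the row-level check merely stands in for filterRow.  It assumes the early exit is taken when
the next row key is rejected by the row-key (prefix) filter while results for the previous row
are still pending.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the control-flow problem only. All names are hypothetical;
// this is not the HBase scanner and does not use HBase classes.
class FilterRowSketch {

  // rows: sorted {rowKey, testedCellValue} pairs.
  // alwaysCallFilterRow=false models the old behavior, where one exit
  // path returned the pending row without the row-level check.
  static List<String> scan(String[][] rows, String prefix, String wanted,
      boolean alwaysCallFilterRow) {
    List<String> returned = new ArrayList<>();
    for (int i = 0; i < rows.length; i++) {
      String key = rows[i][0];
      if (!key.startsWith(prefix)) {
        continue; // row-key filter rejects this row outright
      }
      // Does the row after this one fail the row-key filter?
      boolean nextKeyRejected =
          i + 1 < rows.length && !rows[i + 1][0].startsWith(prefix);
      // Stand-in for the row-level filterRow decision.
      boolean rowPasses = wanted.equals(rows[i][1]);
      if (!alwaysCallFilterRow && nextKeyRejected) {
        returned.add(key); // early exit: row-level check skipped
      } else if (rowPasses) {
        returned.add(key); // normal exit: row-level check consulted
      }
    }
    return returned;
  }

  public static void main(String[] args) {
    String[][] rows = {
        {"ab1", "wanted"}, {"ab2", "other"}, {"zz1", "wanted"}};
    System.out.println(scan(rows, "ab", "wanted", false)); // prints [ab1, ab2]
    System.out.println(scan(rows, "ab", "wanted", true));  // prints [ab1]
  }
}
```

With the early exit enabled, ab2 is returned even though its cell value does not match, which
mirrors the extra row tagged onto the result set.  With the row-level check always consulted,
only ab1 comes back.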

Here is the old code:

{code}
1745     private boolean nextInternal() throws IOException {
1746       // This method should probably be reorganized a bit... has gotten messy
1747       KeyValue kv;
1748       byte[] currentRow = null;
1749       boolean filterCurrentRow = false;
1750       while (true) {
1751         kv = this.storeHeap.peek();
1752         if (kv == null) {
1753           return false;
1754         }
1755         byte [] row = kv.getRow();
1756         if (filterCurrentRow && Bytes.equals(currentRow, row)) {
1757           // filter all columns until row changes
1758           this.storeHeap.next(results);
1759           results.clear();
1760           continue;
1761         }
1762         // see if current row should be filtered based on row key
1763         if ((filter != null && filter.filterRowKey(row, 0, row.length)) ||
1764             (oldFilter != null && oldFilter.filterRowKey(row, 0, row.length))) {
1765           if(!results.isEmpty() && !Bytes.equals(currentRow, row)) {
1766             return true;
1767           }
1768           this.storeHeap.next(results);
1769           results.clear();
1770           resetFilters();
1771           filterCurrentRow = true;
1772           currentRow = row;
1773           continue;
1774         }
1775         if(!Bytes.equals(currentRow, row)) {
1776           // Continue on the next row:
1777           currentRow = row;
1778           filterCurrentRow = false;
1779           // See if we passed stopRow
1780           if(stopRow != null &&
1781               comparator.compareRows(stopRow, 0, stopRow.length,
1782                   currentRow, 0, currentRow.length) <= 0) {
1783             return false;
1784           }
1785           // if there are _no_ results or current row should be filtered
1786           if (results.isEmpty() || filter != null && filter.filterRow()) {
1787             // make sure results is empty
1788             results.clear();
1789             resetFilters();
1790             continue;
1791           }
1792           return true;
1793         }
1794         this.storeHeap.next(results);
1795       }
1796     }
1797 
1798     public void close() {
1799       storeHeap.close();
1800     }
{code}

In the failing case we would exit at #1766, which returns without calling filter.filterRow,
rather than at #1792, which is reached only after the filterRow check at #1786.

The above method was rewritten so we don't skip out without calling filterRow.

{code}
    private boolean nextInternal() throws IOException {
      byte [] currentRow = null;
      boolean filterCurrentRow = false;
      while (true) {
        KeyValue kv = this.storeHeap.peek();
        if (kv == null) return false;
        byte [] row = kv.getRow();
        boolean samerow = Bytes.equals(currentRow, row);
        if (samerow && filterCurrentRow) {
          // Filter all columns until row changes
          readAndDumpCurrentResult();
          continue;
        }
        if (!samerow) {
          // Continue on the next row:
          currentRow = row;
          filterCurrentRow = false;
          // See if we passed stopRow
          if (this.stopRow != null &&
              comparator.compareRows(this.stopRow, 0, this.stopRow.length,
                currentRow, 0, currentRow.length) <= 0) {
            return false;
          }
          if (hasResults()) return true;
        }
        // See if current row should be filtered based on row key
        if (this.filter != null && this.filter.filterRowKey(row, 0, row.length)) {
          readAndDumpCurrentResult();
          resetFilters();
          filterCurrentRow = true;
          currentRow = row;
          continue;
        }
        this.storeHeap.next(results);
      }
    }
{code}

> FilterList of prefix and columnvalue not working properly with deletes and multiple values
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1906
>                 URL: https://issues.apache.org/jira/browse/HBASE-1906
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: 1906-v2.patch, 1906-v3.patch, 1906-v4.patch, filterlist.patch
>
>
> Attached are some unit tests from client and region that demonstrate the failing issues.


