hadoop-common-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1531) Add RowFilter to HRegion.HScanner
Date Thu, 05 Jul 2007 19:52:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510462 ]

stack commented on HADOOP-1531:

Committed with the message below.  Thanks for the contribution, James.

HADOOP-1531 Add RowFilter to HRegion.HScanner.
Adds a row/column filter interface and two implementations: A pager and a
row/column-value regex filter.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInterface.java
    (openScanner): Add override that specifies a row filter.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java
    (obtainScanner): Add override that specifies a row filter.
    (ColumnScanner): Add filter parameter to constructor.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    (getScanner): Add override with filter parameter.
    (next): Add handling of filtering.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/InvalidRowFilterException.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RegExpRowFilter.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RowFilterSet.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/PageRowFilter.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RowFilterInterface.java
    Row-filter interface, exception and implementations.
A src/contrib/hbase/src/test/org/apache/hadoop/hbase/filter/TestRegExpRowFilter.java
A src/contrib/hbase/src/test/org/apache/hadoop/hbase/filter/TestPageRowFilter.java
    Simple pager and regex filter tests.

> Add RowFilter to HRegion.HScanner
> ---------------------------------
>                 Key: HADOOP-1531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>         Attachments: code-style-formatter, eclipse.preferences, RowFilter-v2.patch, RowFilter-v3.patch, RowFilter-v4.patch, RowFilter.patch
> I've implemented a RowFilterInterface and a RowFilter implementation.  The filter is passed to the HRegion.HScanner via HClient.openScanner(), though it is an entirely optional parameter.
> HScanner applies the filter in the next() call by iterating until it encounters a row that is not filtered out by the RowFilter.  The filter applies criteria based on row keys and/or column data values.
> Null values are a little tricky, since the resultSet in that loop may represent nulls as absent columns or as DELETED_BYTES.  Nevertheless, null cases are handled by the filter, and you can, for example, retrieve all rows where column X = null.
> The initial RowFilter implementation is limited in several ways:
> * Equality tests only, against literal values. No !=, <, >, etc. No col1 == col2. This is a straight-up byte[] comparison.
> * Multiple column criteria are treated as an implicit conjunction; no disjunction is possible.
> * Row key criteria can only be a regular expression.
> * Row key criteria are independent of column criteria. No "if rowkey.matches(A) and col1==B", although the interface is designed to allow for that.
> But it should be easy to write an improved RowFilterInterface implementation to take care of most of the above without having to change code elsewhere.
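
For readers without the patch at hand, the behaviour described in the issue can be sketched roughly as follows. This is an illustration only, not the committed code: SketchRowFilter, RegexAndEqualsFilter, the DELETED marker and the next() helper are hypothetical names, the in-memory map stands in for a region's rows, and the real RowFilterInterface signatures are not shown in this thread. The sketch assumes a row passes only when the row-key regex matches and every column criterion is byte-for-byte equal, and that an absent column and a delete marker both count as null.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

public class RowFilterSketch {

    // Stand-in for DELETED_BYTES; the real marker value is not shown in this thread.
    static final byte[] DELETED = "SKETCH_DELETED".getBytes(StandardCharsets.UTF_8);

    /** Hypothetical filter contract, not the RowFilterInterface from the patch. */
    interface SketchRowFilter {
        /** Returns true when the row should be dropped from the scan results. */
        boolean filtered(String rowKey, Map<String, byte[]> columns);
    }

    /** Row-key regex plus column equality criteria, combined as a conjunction. */
    static final class RegexAndEqualsFilter implements SketchRowFilter {
        private final Pattern rowKeyPattern; // null means "any row key"
        private final Map<String, byte[]> required = new LinkedHashMap<>();

        RegexAndEqualsFilter(String rowKeyRegex) {
            this.rowKeyPattern = rowKeyRegex == null ? null : Pattern.compile(rowKeyRegex);
        }

        /** Adds a criterion; a null expected value means "column must be null". */
        RegexAndEqualsFilter require(String column, byte[] expected) {
            required.put(column, expected);
            return this;
        }

        @Override
        public boolean filtered(String rowKey, Map<String, byte[]> columns) {
            if (rowKeyPattern != null && !rowKeyPattern.matcher(rowKey).matches()) {
                return true;
            }
            for (Map.Entry<String, byte[]> criterion : required.entrySet()) {
                byte[] actual = columns.get(criterion.getKey());
                // Treat a delete marker the same as an absent column: both are null.
                if (actual != null && Arrays.equals(actual, DELETED)) {
                    actual = null;
                }
                // Arrays.equals(null, null) is true, so "column = null" works here.
                if (!Arrays.equals(actual, criterion.getValue())) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Skips rows until one passes the optional filter; returns null when exhausted. */
    static String next(Iterator<Map.Entry<String, Map<String, byte[]>>> rows,
                       SketchRowFilter filter) {
        while (rows.hasNext()) {
            Map.Entry<String, Map<String, byte[]>> row = rows.next();
            if (filter == null || !filter.filtered(row.getKey(), row.getValue())) {
                return row.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TreeMap<String, Map<String, byte[]>> rows = new TreeMap<>();
        Map<String, byte[]> live = new HashMap<>();
        live.put("x", "a".getBytes(StandardCharsets.UTF_8));
        Map<String, byte[]> deleted = new HashMap<>();
        deleted.put("x", DELETED);
        rows.put("user1", live);
        rows.put("user2", deleted);
        rows.put("user3", new HashMap<>()); // column x is absent

        // "All rows where column x = null", restricted to keys matching user.*
        SketchRowFilter xIsNull = new RegexAndEqualsFilter("user.*").require("x", null);
        Iterator<Map.Entry<String, Map<String, byte[]>>> it = rows.entrySet().iterator();
        for (String key = next(it, xIsNull); key != null; key = next(it, xIsNull)) {
            System.out.println(key);
        }
    }
}
```

With the sample rows in main(), the loop prints user2 and user3: one holds the delete marker and the other lacks the column, so both satisfy "x = null". Passing a null filter to next() returns every row, which mirrors the filter being an optional parameter.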

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
