hadoop-common-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1531) Add RowFilter to HRegion.HScanner
Date Thu, 05 Jul 2007 19:52:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510462 ]

stack commented on HADOOP-1531:
-------------------------------

Committed with the message below.  Thanks for the contribution, James.

HADOOP-1531 Add RowFilter to HRegion.HScanner.
Adds a row/column filter interface and two implementations: a pager and a
row/column-value regex filter.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInterface.java
    (openScanner): Add override that specifies a row filter.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java
    (obtainScanner): Add override that specifies a row filter.
    (ColumnScanner): Add filter parameter to constructor.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    (getScanner): Add override with filter parameter.
    (next): Add handling of filtering.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/InvalidRowFilterException.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RegExpRowFilter.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RowFilterSet.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/PageRowFilter.java
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/filter/RowFilterInterface.java
    Row-filter interface, exception and implementations.
A src/contrib/hbase/src/test/org/apache/hadoop/hbase/filter/TestRegExpRowFilter.java
A src/contrib/hbase/src/test/org/apache/hadoop/hbase/filter/TestPageRowFilter.java
    Simple pager and regex filter tests.
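
For readers unfamiliar with the pager idea, here is a minimal sketch of how a page-limiting row filter could work. It is not the committed PageRowFilter: the interface SimpleRowFilter, the class PageLimitFilter, and all method names are hypothetical, and the sketch assumes the scanner notifies the filter of each row it returns.

```java
// Hypothetical sketch: names are illustrative and are not the HBase API.
interface SimpleRowFilter {
    /** Returns true if the row should be skipped. */
    boolean filterRow(String rowKey);

    /** Called by the scanner after a row has been returned to the client. */
    void rowAccepted(String rowKey);

    /** Returns true once the scanner can stop early. */
    boolean filterAllRemaining();
}

class PageLimitFilter implements SimpleRowFilter {
    private final long pageSize;
    private long accepted = 0;

    PageLimitFilter(long pageSize) {
        if (pageSize < 0) {
            throw new IllegalArgumentException("pageSize must be >= 0");
        }
        this.pageSize = pageSize;
    }

    @Override
    public boolean filterRow(String rowKey) {
        // Once the page is full, every further row is filtered out.
        return filterAllRemaining();
    }

    @Override
    public void rowAccepted(String rowKey) {
        accepted++;
    }

    @Override
    public boolean filterAllRemaining() {
        return accepted >= pageSize;
    }
}
```

With a page size of 2, the first two rows pass and every later row is filtered, so a scanner that checks filterAllRemaining() can end the scan at that point.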

> Add RowFilter to HRegion.HScanner
> ---------------------------------
>
>                 Key: HADOOP-1531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>    Affects Versions: 0.14.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>         Attachments: code-style-formatter, eclipse.preferences, RowFilter-v2.patch, RowFilter-v3.patch, RowFilter-v4.patch, RowFilter.patch
>
>
> I've implemented a RowFilterInterface and a RowFilter implementation.  This is passed to the HRegion.HScanner via HClient.openScanner(), though it is an entirely optional parameter.
> HScanner applies the filter in the next() call by iterating until it encounters a row that is not filtered by the RowFilter.  The filter applies criteria based on row keys and/or column data values.
> Null values are a little tricky since the resultSet in that loop may represent nulls as absent columns or as DELETED_BYTES.  Nevertheless, null cases are taken care of by the filter, and you can, for example, retrieve all rows where column X = null.
> The initial RowFilter implementation is limited in several ways:
> * Equality test only with literal values. No !=, <, >, etc. No col1 == col2. This is a straight-up byte[] comparison.
> * Multiple column criteria are treated as an implicit conjunction; no disjunction is possible.
> * Row key criteria are a regular expression only.
> * Row key criteria are independent of column criteria. No "if rowkey.matches(A) and col1==B", although the interface is created to allow for that.
> But it should be easy to write an improved RowFilterInterface implementation to take care of most of the above without having to change code elsewhere.
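
The behaviour described above (a row-key regular expression, byte[] equality on column values as an implicit conjunction, nulls as absent columns or deletion markers, and a scan loop that skips filtered rows) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: RegexColumnFilter, require(), scan(), and the TOMBSTONE constant (a stand-in for DELETED_BYTES) are hypothetical names, and the sketch assumes a row passes only when both the row-key check and every column check pass.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical sketch: names are illustrative and are not the HBase API.
class RegexColumnFilter {
    // Stand-in for the deletion marker the description calls DELETED_BYTES.
    static final byte[] TOMBSTONE = "DELETED".getBytes(StandardCharsets.UTF_8);

    private final Pattern rowKeyPattern; // null means "accept any row key"
    private final Map<String, byte[]> expected = new HashMap<>();

    RegexColumnFilter(String rowKeyRegex) {
        this.rowKeyPattern = rowKeyRegex == null ? null : Pattern.compile(rowKeyRegex);
    }

    /** Requires column == value; a null value means "column is null". */
    RegexColumnFilter require(String column, byte[] value) {
        expected.put(column, value);
        return this;
    }

    /** Returns true if the row should be skipped. */
    boolean filter(String rowKey, Map<String, byte[]> columns) {
        if (rowKeyPattern != null && !rowKeyPattern.matcher(rowKey).matches()) {
            return true;
        }
        // Implicit conjunction: every column criterion must hold.
        for (Map.Entry<String, byte[]> e : expected.entrySet()) {
            byte[] actual = columns.get(e.getKey());
            // A null is either an absent column or a deletion marker.
            boolean actualIsNull = actual == null || Arrays.equals(actual, TOMBSTONE);
            if (e.getValue() == null) {
                if (!actualIsNull) {
                    return true;
                }
            } else if (actualIsNull || !Arrays.equals(actual, e.getValue())) {
                return true; // straight byte[] comparison failed
            }
        }
        return false;
    }

    /** Mimics the next() loop: keep advancing past rows the filter rejects. */
    static List<String> scan(TreeMap<String, Map<String, byte[]>> rows, RegexColumnFilter f) {
        List<String> passed = new ArrayList<>();
        for (Map.Entry<String, Map<String, byte[]>> row : rows.entrySet()) {
            if (f.filter(row.getKey(), row.getValue())) {
                continue;
            }
            passed.add(row.getKey());
        }
        return passed;
    }
}
```

For example, new RegexColumnFilter("user.*").require("x", null) keeps only rows whose key matches user.* and whose column x is either absent or marked deleted. Supporting disjunction or operators other than equality would mean replacing the loop in filter(), which is the kind of change the description expects a better implementation to make.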

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

