hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2438) Addition of a Column Pagination Filter
Date Tue, 13 Apr 2010 21:15:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856619#action_12856619 ]

Jonathan Gray commented on HBASE-2438:

Patch looks pretty good.

One problem I see is that when you deserialize, you set page and pageSize, but the offset
(which is the field the filter actually uses) is not calculated the way it is in the non-empty
constructor. So it looks like this will break once serialized (if I read the code correctly).
Is there a way we can also have a unit test that would show that?
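
To illustrate the kind of round-trip test meant here, below is a minimal sketch. It does not use
the patch's actual class: PagedColumnsSketch is a hypothetical stand-in, the write/readFields
methods only mirror the Writable style, and the formula offset = page * pageSize is an assumption,
since the constructor from the patch is not quoted in this comment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the filter in the patch; not the real HBase class.
class PagedColumnsSketch {
    private int page;
    private int pageSize;
    private int offset; // derived field, never written to the stream

    // Empty constructor, as used right before readFields() on the receiving side.
    PagedColumnsSketch() {}

    PagedColumnsSketch(int page, int pageSize) {
        this.page = page;
        this.pageSize = pageSize;
        this.offset = page * pageSize; // ASSUMED formula, for illustration only
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(page);
        out.writeInt(pageSize);
    }

    void readFields(DataInput in) throws IOException {
        page = in.readInt();
        pageSize = in.readInt();
        // The step the review says is missing: recompute the derived field.
        offset = page * pageSize;
    }

    int getOffset() {
        return offset;
    }

    // Serialize to a byte array and read it back into a fresh instance.
    static PagedColumnsSketch roundTrip(PagedColumnsSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            PagedColumnsSketch copy = new PagedColumnsSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        PagedColumnsSketch original = new PagedColumnsSketch(2, 10);
        PagedColumnsSketch copy = roundTrip(original);
        // The copy must agree with the original on the derived offset.
        if (copy.getOffset() != original.getOffset()) {
            throw new AssertionError("offset lost during serialization");
        }
        System.out.println("offset preserved across round trip");
    }
}
```

If the recomputation line in readFields is removed, the copy's offset stays at 0 and the check in
main fails, which is the behavior a unit test for the patch would need to catch.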

Also, in the main filter call:

+ public ReturnCode filterKeyValue(KeyValue v)
+ {
+ ReturnCode code = (count < offset || count >= offset + pageSize) ? ReturnCode.SKIP : ReturnCode.INCLUDE;
+ count++;
+ return code;
+ }

This code is a bit hard to read. Also, once you are past offset + pageSize, shouldn't you be
returning ReturnCode.NEXT_ROW, since you don't want to include anything more in the current
row? Count isn't used once you get past that point either, so maybe this could be rewritten
more like:

+ public ReturnCode filterKeyValue(KeyValue v)
+ {
+ if(count >= offset + pageSize) {
+ return ReturnCode.NEXT_ROW;
+ }
+ ReturnCode code = count < offset ? ReturnCode.SKIP : ReturnCode.INCLUDE;
+ count++;
+ return code;
+ }
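
To make the effect of the rewrite concrete, here is a small self-contained sketch that runs the
same logic over one row of columns. Everything in it is a stand-in: the ReturnCode enum is a
local copy rather than the HBase one, the columns are plain strings instead of KeyValues, and the
reset() between rows is an assumption about how the counter would be cleared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mimics the suggested filterKeyValue logic.
class PageFilterSketch {
    // Local stand-in for the HBase filter return codes used in the patch.
    enum ReturnCode { INCLUDE, SKIP, NEXT_ROW }

    private final int offset;
    private final int pageSize;
    private int count = 0;

    PageFilterSketch(int offset, int pageSize) {
        this.offset = offset;
        this.pageSize = pageSize;
    }

    // Same shape as the suggested rewrite, minus the KeyValue argument.
    ReturnCode filterKeyValue() {
        if (count >= offset + pageSize) {
            return ReturnCode.NEXT_ROW; // page is full, nothing else in this row is wanted
        }
        ReturnCode code = count < offset ? ReturnCode.SKIP : ReturnCode.INCLUDE;
        count++;
        return code;
    }

    // ASSUMPTION: the counter is cleared when the scan moves to a new row.
    void reset() {
        count = 0;
    }

    // Feed one row's columns through the filter and collect what is included.
    static List<String> scanRow(PageFilterSketch filter, List<String> columns) {
        List<String> kept = new ArrayList<>();
        for (String column : columns) {
            ReturnCode code = filter.filterKeyValue();
            if (code == ReturnCode.NEXT_ROW) {
                break; // stop looking at the rest of the row
            }
            if (code == ReturnCode.INCLUDE) {
                kept.add(column);
            }
        }
        filter.reset();
        return kept;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c", "d", "e", "f");
        // offset 2, pageSize 2: skip a and b, include c and d, stop at e.
        System.out.println(scanRow(new PageFilterSketch(2, 2), row)); // prints [c, d]
    }
}
```

With SKIP instead of NEXT_ROW the result would be the same, but every remaining column in the row
would still be passed through the filter, which is the cost the rewrite avoids.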

Good stuff guys.

> Addition of a Column Pagination Filter
> --------------------------------------
>                 Key: HBASE-2438
>                 URL: https://issues.apache.org/jira/browse/HBASE-2438
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: filters
>    Affects Versions: 0.20.3
>            Reporter: Paul Kist
>         Attachments: hbase-2438-0.20.3.patch
>   Original Estimate: 8h
>  Remaining Estimate: 8h
> Client applications may need to do pagination. Depending on the number of columns returned,
> it may be more efficient to perform pagination at the database level (similar to
> SQL's LIMIT and OFFSET). This will be an additional filter taking two parameters:
> - page
> - pageSize
> For every row that gets returned, only a subset of columns is returned, based on page
> and pageSize.
> If the page / pageSize combination goes over the limits, then no results are returned from
> the filter.
> A practical example for using a filter like this may be for folks doing row-based indexing
> with HBase.
