hbase-issues mailing list archives

From "Juhani Connolly (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2466) Improving filter API to allow for modification of keyvalue list by filter
Date Thu, 22 Apr 2010 03:06:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859630#action_12859630
] 

Juhani Connolly commented on HBASE-2466:
----------------------------------------

I'm changing the new API to return void as suggested.

I'm leaving the hasResults functionality as it is, and filtering KeyValues regardless of the
return value of nextInternal.

However, I noticed that the batch limit may cause problems with this filter. With a limit set,
the filter only sees an incomplete list of KeyValues for the row, so it cannot take into account
values which haven't been passed to it yet.

What would be the preferable outcome of this?
-Document the fact that using Scan.setBatch results in the filter comparing only one batch at
a time rather than the whole row?
-Log a warning or throw an exception?
-Add functionality to a filter that disables batching: Filter#isBatchable along with a check
in HRegion.HRegionScanner#nextInternal?
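
To make the batching concern concrete, here is a rough sketch of the second and third options combined. It does not use the real HBase classes: BatchAwareFilter, RequireColumnFilter and scanRow are hypothetical stand-ins, KeyValues are reduced to qualifier strings, and isBatchable is only the method name proposed above, not an existing API. It assumes a batch value of 0 or less means "no limit".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Filter API; not the real HBase interface.
interface BatchAwareFilter {
    // Proposed in the comment above: false means the filter needs whole rows.
    boolean isBatchable();

    // Row-level hook that may remove entries from the list in place.
    void filterRow(List<String> qualifiers);
}

// Example filter: drops everything it is shown unless the required column is present.
final class RequireColumnFilter implements BatchAwareFilter {
    private final String required;
    private final boolean batchable;

    RequireColumnFilter(String required, boolean batchable) {
        this.required = required;
        this.batchable = batchable;
    }

    @Override
    public boolean isBatchable() {
        return batchable;
    }

    @Override
    public void filterRow(List<String> qualifiers) {
        if (!qualifiers.contains(required)) {
            qualifiers.clear();
        }
    }
}

final class BatchCheckSketch {
    // Mimics a scanner handing out one row in chunks of at most `batch` entries.
    // Assumption: batch <= 0 means the whole row is returned at once.
    static List<List<String>> scanRow(List<String> row, int batch, BatchAwareFilter filter) {
        if (batch > 0 && !filter.isBatchable()) {
            throw new IllegalArgumentException("filter requires whole rows; batching is not supported");
        }
        int step = batch > 0 ? batch : Math.max(row.size(), 1);
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < row.size(); i += step) {
            // Copy the chunk so the filter can modify it freely.
            List<String> chunk = new ArrayList<>(row.subList(i, Math.min(i + step, row.size())));
            filter.filterRow(chunk);
            if (!chunk.isEmpty()) {
                out.add(chunk);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c", "d");
        // Whole row: the filter sees column "d" and keeps everything.
        System.out.println(scanRow(row, 0, new RequireColumnFilter("d", false))); // [[a, b, c, d]]
        // Batches of 2 with a filter that wrongly claims to be batchable:
        // the first batch lacks "d", so "a" and "b" are lost.
        System.out.println(scanRow(row, 2, new RequireColumnFilter("d", true)));  // [[c, d]]
    }
}
```

With the check in place, a scan that sets a batch size together with a non-batchable filter fails up front instead of silently returning a partial row.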

> Improving filter API to allow for modification of keyvalue list by filter
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2466
>                 URL: https://issues.apache.org/jira/browse/HBASE-2466
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters, regionserver
>            Reporter: Juhani Connolly
>            Priority: Minor
>         Attachments: HBASE-2466-2.patch, HBASE-2466.patch
>
>
> As it stands, the Filter interface allows filtering by
> Filter#filterAllRemaining() -> true indicates scan is over, false, keep going on.
> Filter#filterRowKey(byte[],int,int) -> true to drop this row, if false, we will also call
> Filter#filterKeyValue(KeyValue) -> true to drop this key/value
> Filter#filterRow() -> last chance to drop entire row based on the sequence of filterValue()
> calls. Eg: filter a row if it doesn't contain a specified column.
> It would be useful to allow for an additional API in the form of a step to prune the
> list of KeyValues to be sent by implementing an additional
> Filter#filterRow(List<KeyValue>)
> This would allow for a user to write a custom filter against the api that drops unnecessary
> KeyValues according to user-defined rules.
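
As an illustration of the proposed hook, a custom filter might look roughly like the sketch below. It is self-contained and does not use the real HBase types: SketchKeyValue, RowListFilter and DropQualifierPrefixFilter are hypothetical names, and the void return type follows the change mentioned in the comment above. The pruning rule (drop qualifiers with a given prefix) is just an example of a user-defined rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HBase's KeyValue; only the fields this sketch needs.
final class SketchKeyValue {
    final String qualifier;
    final String value;

    SketchKeyValue(String qualifier, String value) {
        this.qualifier = qualifier;
        this.value = value;
    }
}

// Hypothetical interface showing only the proposed row-level hook.
interface RowListFilter {
    // Called with the KeyValues collected for a row; the filter prunes the list in place.
    void filterRow(List<SketchKeyValue> kvs);
}

// Example user-defined rule: drop every KeyValue whose qualifier starts with a prefix.
final class DropQualifierPrefixFilter implements RowListFilter {
    private final String prefix;

    DropQualifierPrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public void filterRow(List<SketchKeyValue> kvs) {
        kvs.removeIf(kv -> kv.qualifier.startsWith(prefix));
    }
}

final class FilterRowSketch {
    public static void main(String[] args) {
        List<SketchKeyValue> row = new ArrayList<>();
        row.add(new SketchKeyValue("tmp:a", "1"));
        row.add(new SketchKeyValue("data:b", "2"));
        row.add(new SketchKeyValue("tmp:c", "3"));

        new DropQualifierPrefixFilter("tmp:").filterRow(row);

        // Only the "data:b" entry is left to be sent to the client.
        for (SketchKeyValue kv : row) {
            System.out.println(kv.qualifier + " = " + kv.value);
        }
    }
}
```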

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

