hbase-issues mailing list archives

From "Juhani Connolly (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2466) Improving filter API to allow for modification of keyvalue list by filter
Date Wed, 28 Apr 2010 08:47:32 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861728#action_12861728 ]

Juhani Connolly commented on HBASE-2466:
----------------------------------------

I intend to include an optional compare operation for DependentColumnFilter.

The constructor would be like this:

  /**
   * Build a dependent column filter with value checking.
   * Dependent column values will be compared using the supplied
   * compareOp and comparator; for their usage,
   * refer to {@link CompareFilter}.
   * 
   * @param family dependent column family
   * @param qualifier dependent column qualifier
   * @param dropDependentColumn whether the column should be discarded after
   * @param valueCompareOp comparison operation
   * @param valueComparator comparator
   */
  public DependentColumnFilter(final byte[] family, final byte[] qualifier,
      final boolean dropDependentColumn, final CompareOp valueCompareOp,
      final WritableByteArrayComparable valueComparator)

I see a couple of ways of doing this:

- have DependentColumnFilter extend CompareFilter, and add a NO_OP value to CompareFilter.CompareOp
(for when you just want all "versions of a row with dependent column"). When gathering "valid"
timestamps, doCompare will allow simple discards.
  - This would be practical for future filters that include an optional comparison, so I think
the change would make sense.

OR

- include some of CompareFilter's code within DependentColumnFilter. Specifically, add a new
CompareOp and a doCompare function.
  - I don't really like this approach as it repeats code, but it avoids having to modify
further outside code.
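
To make the first approach a bit more concrete, here is a rough, self-contained sketch. It is
only an illustration under assumptions: CompareFilterSketch, DependentColumnFilterSketch,
ValueComparator, binary() and offerDependentCell() are hypothetical stand-ins written for this
example, not the real HBase classes, and the doCompare signature is simplified. The only point
is to show how a NO_OP value lets the same doCompare path serve both the "no value check" and
the "value check" constructors.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch: every type below is a hypothetical stand-in, not an HBase class.
class DependentColumnSketch {

    /** Stand-in for CompareFilter.CompareOp, with the proposed NO_OP added. */
    enum CompareOp { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

    /** Stand-in for WritableByteArrayComparable. */
    interface ValueComparator {
        /** Negative, zero or positive as the candidate sorts before, equal to or after the reference. */
        int compareCandidate(byte[] candidate);
    }

    /** Unsigned lexicographic comparison against a fixed reference value. */
    static ValueComparator binary(final byte[] reference) {
        return candidate -> {
            int n = Math.min(candidate.length, reference.length);
            for (int i = 0; i < n; i++) {
                int diff = (candidate[i] & 0xff) - (reference[i] & 0xff);
                if (diff != 0) {
                    return diff;
                }
            }
            return candidate.length - reference.length;
        };
    }

    /** Stand-in for CompareFilter: owns the op, the comparator and the shared doCompare. */
    abstract static class CompareFilterSketch {
        protected final CompareOp compareOp;
        protected final ValueComparator comparator;

        protected CompareFilterSketch(CompareOp compareOp, ValueComparator comparator) {
            this.compareOp = compareOp;
            this.comparator = comparator;
        }

        /** Returns true when the value fails the comparison and should be discarded. */
        protected boolean doCompare(byte[] value) {
            if (compareOp == CompareOp.NO_OP) {
                return false; // no value check requested: never discard
            }
            int c = comparator.compareCandidate(value);
            switch (compareOp) {
                case LESS:             return c >= 0;
                case LESS_OR_EQUAL:    return c > 0;
                case EQUAL:            return c != 0;
                case NOT_EQUAL:        return c == 0;
                case GREATER_OR_EQUAL: return c < 0;
                case GREATER:          return c <= 0;
                default: throw new IllegalStateException("Unknown op " + compareOp);
            }
        }
    }

    /** Stand-in for DependentColumnFilter built on top of the compare filter. */
    static class DependentColumnFilterSketch extends CompareFilterSketch {
        final byte[] family;
        final byte[] qualifier;
        final boolean dropDependentColumn;
        private final List<Long> validTimestamps = new ArrayList<>();

        /** No value check: delegates with NO_OP so every dependent cell counts. */
        DependentColumnFilterSketch(byte[] family, byte[] qualifier, boolean dropDependentColumn) {
            this(family, qualifier, dropDependentColumn, CompareOp.NO_OP, null);
        }

        /** Value check: same shape as the constructor proposed above. */
        DependentColumnFilterSketch(byte[] family, byte[] qualifier, boolean dropDependentColumn,
                CompareOp valueCompareOp, ValueComparator valueComparator) {
            super(valueCompareOp, valueComparator);
            this.family = family;
            this.qualifier = qualifier;
            this.dropDependentColumn = dropDependentColumn;
        }

        /** Records the timestamp of a dependent-column cell unless doCompare discards it. */
        boolean offerDependentCell(long timestamp, byte[] value) {
            if (doCompare(value)) {
                return false;
            }
            validTimestamps.add(timestamp);
            return true;
        }

        List<Long> validTimestamps() {
            return validTimestamps;
        }
    }

    public static void main(String[] args) {
        DependentColumnFilterSketch filter = new DependentColumnFilterSketch(
                new byte[] {1}, new byte[] {2}, true,
                CompareOp.GREATER, binary(new byte[] {1, 2}));
        filter.offerDependentCell(1L, new byte[] {1, 3}); // sorts after the reference
        filter.offerDependentCell(2L, new byte[] {1});    // sorts before the reference
        System.out.println("valid timestamps: " + filter.validTimestamps());
    }
}
```

With the second approach, the enum value and doCompare would instead be copied into
DependentColumnFilter itself, which is the duplication I would rather avoid.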

Does the first approach sound reasonable?

Also, am I being too cautious for what is ultimately a very minor change (as in, should I have
just gone and done it without posting this)?

> Improving filter API to allow for modification of keyvalue list by filter
> -------------------------------------------------------------------------
>
>                 Key: HBASE-2466
>                 URL: https://issues.apache.org/jira/browse/HBASE-2466
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: filters, regionserver
>            Reporter: Juhani Connolly
>            Priority: Minor
>         Attachments: HBASE-2466-2.patch, HBASE-2466-4.patch, HBASE-2466.patch
>
>
> As it stands, the Filter interface allows filtering by
> Filter#filterAllRemaining() -> true indicates scan is over, false, keep going on.
> Filter#filterRowKey(byte[],int,int) -> true to drop this row, if false, we will also call
> Filter#filterKeyValue(KeyValue) -> true to drop this key/value
> Filter#filterRow() -> last chance to drop entire row based on the sequence of filterValue()
> calls. Eg: filter a row if it doesn't contain a specified column.
> It would be useful to allow for an additional API in the form of a step to prune the
> list of KeyValues to be sent by implementing an additional
> Filter#filterRow(List<KeyValue>)
> This would allow for a user to write a custom filter against the api that drops unnecessary
> KeyValues according to user-defined rules.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

