hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Issue with column-counting filters accepting multiple versions of a column
Date Fri, 05 Oct 2012 05:43:53 GMT
Filters are applied before the version counting is performed.
This is a frequent area of contention. If filters were applied after the version counting,
other folks would complain for other reasons (and they have complained: in the early days filters
were in fact evaluated after the version counting, which is why it was changed).
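
To make the ordering concrete, here is a toy sketch in plain Java. It is not HBase code: the Cell class, countFilter, maxVersions and the sample row are all hypothetical stand-ins, and the count filter is assumed to simply accept the first N cells it is shown, as a rough model of a column-counting filter. It only illustrates why running the filter before version counting can return fewer columns than requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterOrderSketch {
    /** Toy cell: a column qualifier plus a timestamp. Not an HBase KeyValue. */
    static final class Cell {
        final String column;
        final long timestamp;
        Cell(String column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    /** Assumed model of a column-counting filter: accept the first `limit` cells seen. */
    static List<Cell> countFilter(List<Cell> cells, int limit) {
        return new ArrayList<>(cells.subList(0, Math.min(limit, cells.size())));
    }

    /** Assumed model of version counting: keep at most `maxVersions` cells per column. */
    static List<Cell> maxVersions(List<Cell> cells, int maxVersions) {
        Map<String, Integer> seen = new HashMap<>();
        List<Cell> kept = new ArrayList<>();
        for (Cell c : cells) {
            int n = seen.merge(c.column, 1, Integer::sum);
            if (n <= maxVersions) {
                kept.add(c);
            }
        }
        return kept;
    }

    /** Sample row: column "a" has three versions, "b" and "c" one each (newest first). */
    static List<Cell> sampleRow() {
        return Arrays.asList(
            new Cell("a", 3), new Cell("a", 2), new Cell("a", 1),
            new Cell("b", 1), new Cell("c", 1));
    }

    /** Filter first, then version counting (the ordering described in this thread). */
    static List<Cell> filterFirst(List<Cell> row, int limit, int versions) {
        return maxVersions(countFilter(row, limit), versions);
    }

    /** Version counting first, then filter (the ordering the original poster expected). */
    static List<Cell> versionsFirst(List<Cell> row, int limit, int versions) {
        return countFilter(maxVersions(row, versions), limit);
    }

    public static void main(String[] args) {
        List<Cell> row = sampleRow();
        // The filter uses up its budget of 2 on a@3 and a@2; version counting then drops a@2.
        System.out.println("filter first:   " + filterFirst(row, 2, 1).size() + " column(s)");   // 1
        // Version counting leaves a@3, b@1, c@1; the filter then accepts a and b.
        System.out.println("versions first: " + versionsFirst(row, 2, 1).size() + " column(s)"); // 2
    }
}
```

In this toy model, asking for 2 columns with a single version each yields only column "a" when the filter runs first, even though the row has three columns.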

Unless we allow a filter to declare whether it needs to be run before or after the version counting,
we will always have an unhappy party :(
(I started thinking about this in HBASE-5257 but abandoned that for lack of interest)

-- Lars

 From: Andrew Olson <noslowerdna@gmail.com>
To: user@hbase.apache.org 
Sent: Thursday, October 4, 2012 1:33 PM
Subject: Issue with column-counting filters accepting multiple versions of a column
It looks like the max version limit for a table or scanner is not applied
to disregard older versions, prior to counting columns within a
ColumnPaginationFilter or ColumnCountGetFilter. As a result, a Scan or Get
can ultimately retrieve fewer than the requested number of columns when
there is a sufficient number of existing columns to satisfy the request, if
multiple versions of a column have been added to a row.

A minimal test case demonstrating this behavior can be found here:

The javadoc for Get mentions 'Only Filter.filterKeyValue(KeyValue) is
called AFTER all tests for ttl, column match, deletes and *max
versions* have been run.'; for these two filters this behavior does not
appear to be true, as flattening of multiple versions appears to occur
after the filter has been applied.

Should this be considered a bug? If so, are there any possible workarounds
besides implementing and deploying a custom Filter class?
