accumulo-user mailing list archives

From Marc Reichman
Subject Re: Filtering on column qualifier
Date Thu, 22 Aug 2013 16:33:12 GMT
I haven't considered that. Would that allow me to specify it in the
client-side code and not worry about spreading JARs around? It is a very
basic need; what I have in my scan loop right now is:

            String matchScoreString = key.getColumnQualifier().toString();
            Double score = Double.parseDouble(matchScoreString);

            if (threshold != null && threshold > score) {
                // TODO: figure out if this is possible to do via a data-local scan iterator
            }

What is the pattern for including a groovy snippet for a scan iterator?

On Thu, Aug 22, 2013 at 11:16 AM, David Medinets wrote:

> Have you thought of writing a filter class that takes a bit of Groovy
> for execution inside the accept method? Whether that is worthwhile depends
> on how efficient you need to be and how changeable your constraints are.
> On Thu, Aug 22, 2013 at 10:19 AM, Marc Reichman wrote:
>> Extending looked like a bit of a boondoggle, because all of the useful
>> fields in the class are private, not protected. I also ran into another
>> architectural question: how does one pass a value (à la constructor) into
>> one of these classes? If I'm going to use this to filter based on a
>> threshold, I'd need to pass that threshold in somehow.
>> On Wed, Aug 21, 2013 at 9:49 AM, John Vines wrote:
>>> There's no way to extend the ColumnQualifierFilter via configuration, but
>>> it sounds like you are on top of it. You just need to extend the class,
>>> possibly copy a bit of code, and change the equality check to a compareTo
>>> after converting the Strings to Doubles.
>>> On Wed, Aug 21, 2013 at 10:00 AM, Marc Reichman wrote:
>>>> I have some data stored in Accumulo with some scores stored as column
>>>> qualifiers (there was an older thread about this). I would like to find a
>>>> way to do thresholding when retrieving the data without retrieving it all
>>>> and then manually filtering out items below my threshold.
>>>> I know I can "fetch" column qualifiers by exact match.
>>>> I've seen the ColumnQualifierFilter, which I assume is what's in play
>>>> when I fetch qualifiers. Is there a reasonable pattern to extend this and
>>>> try to use it as a scan iterator so I can do things like "greater than" a
>>>> value which will be interpreted as a Double vs. the string equality going
>>>> on now?
>>>> Thanks,
>>>> Marc
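
The pattern discussed in this thread (parse the column qualifier as a Double, compare it against a threshold, and hand that threshold to the filter as configuration rather than through a constructor) can be sketched roughly as below. This is a stand-alone illustration with no Accumulo dependency: the class name ThresholdFilterSketch, the option name "threshold", and the simplified init/accept signatures are all assumptions made for the example, not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; this is not an Accumulo class.
class ThresholdFilterSketch {
    // Hypothetical option name used to carry the threshold as a string.
    static final String THRESHOLD_OPTION = "threshold";

    // null means "no threshold configured": every entry is accepted.
    private Double threshold;

    // Configuration arrives as string options instead of constructor arguments.
    void init(Map<String, String> options) {
        String raw = options.get(THRESHOLD_OPTION);
        threshold = (raw == null) ? null : Double.valueOf(raw);
    }

    // Returns true when the entry should be kept.
    boolean accept(String columnQualifier) {
        if (threshold == null) {
            return true;
        }
        double score;
        try {
            score = Double.parseDouble(columnQualifier);
        } catch (NumberFormatException e) {
            // Assumption: qualifiers that are not numeric are dropped.
            return false;
        }
        // Same test as the client-side loop: drop scores below the threshold.
        return score >= threshold;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(THRESHOLD_OPTION, "0.75");

        ThresholdFilterSketch filter = new ThresholdFilterSketch();
        filter.init(options);

        for (String qualifier : new String[] {"0.5", "0.75", "0.9", "n/a"}) {
            System.out.println(qualifier + " -> " + filter.accept(qualifier));
        }
    }
}
```

In Accumulo itself, the analogous class would presumably extend org.apache.accumulo.core.iterators.Filter, read the option in init(source, options, env), and implement accept(Key, Value); the client would set the value with IteratorSetting.addOption and attach the iterator with Scanner.addScanIterator. The class would still need to be available to the tablet servers, since scan iterators run server-side.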
