lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Commented: (LUCENE-2358) rename KeywordMarkerTokenFilter
Date Tue, 30 Mar 2010 23:34:27 GMT


Robert Muir commented on LUCENE-2358:

I needed to be able to mark cached tokens as being filler tokens (or not) - a boolean attribute.
After trying to write a new private-use attribute and failing (I didn't make both an interface
and an implementation, I think - I should figure it out and improve the docs I guess), I found
KeywordAttribute and have used it to mark whether or not a cached token is a filler token
(keyword:yes => filler-token:yes).
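
The interface-plus-implementation pairing mentioned above can be sketched in plain Java.
This is a toy model, not Lucene code. It assumes that the default attribute factory finds
the implementation class by appending "Impl" to the interface name, which would explain why
writing only one of the two fails. FillerAttribute, FillerAttributeImpl, OrphanAttribute and
create() are hypothetical names made up for this illustration.

```java
public class AttributePairSketch {
    /** Stand-in for a marker interface that all attribute interfaces extend. */
    public interface Attribute {}

    /** Hypothetical private-use attribute: the interface half. */
    public interface FillerAttribute extends Attribute {
        boolean isFiller();
        void setFiller(boolean filler);
    }

    /** The implementation half; its name is the interface name plus "Impl". */
    public static class FillerAttributeImpl implements FillerAttribute {
        private boolean filler;
        public boolean isFiller() { return filler; }
        public void setFiller(boolean filler) { this.filler = filler; }
    }

    /** An interface declared without a matching Impl class. */
    public interface OrphanAttribute extends Attribute {}

    /** Assumed lookup rule: load the class named "<interface name>Impl" and instantiate it. */
    public static <A extends Attribute> A create(Class<A> iface) {
        try {
            Class<?> impl = Class.forName(iface.getName() + "Impl");
            return iface.cast(impl.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "No implementation found for " + iface.getName(), e);
        }
    }

    public static void main(String[] args) {
        FillerAttribute att = create(FillerAttribute.class);
        att.setFiller(true);
        System.out.println(att.isFiller()); // prints "true"

        try {
            create(OrphanAttribute.class);  // no OrphanAttributeImpl exists
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the interface alone is not enough: the lookup for OrphanAttribute fails
because no class with the expected name exists.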

I'm not really sure the KeywordAttribute is the best fit here, because its purpose is to keep
the token from being changed by some later filter. I'm not sure how your filter works (I would
have to see the patch), but I think using this attribute for this purpose could introduce
some bugs.

I guess the key is that it's not really a private-use attribute: these things are visible to
all tokenstreams, so stemmers etc. will see your 'internal' attribute.

Would it make sense to have a generalized boolean attribute, specialized for keywords or (fill-in-the-blank)?
It's a small leap to say that "iskeyword" means true for whatever boolean attribute you want
to carry, so this isn't really a big deal, but I thought I'd bring it up while you're thinking
about naming this thing.

(This may be a can of worms: if there is a generic boolean attribute, should there be generic
string/int/float/etc. attributes too?)

I don't really think so. Since there can only be one of any attribute in the tokenstream,
you would have various TokenFilters clashing on how they interpret and use some generic
boolean attribute!
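
The clash described above can be illustrated with a small plain-Java model. It is a sketch
under stated assumptions, not Lucene's implementation: ToyAttributeSource and BooleanFlag
are hypothetical classes, and the only behavior modeled is "at most one instance per
attribute class in a stream".

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SharedFlagSketch {
    /** Hypothetical generic boolean attribute. */
    public static class BooleanFlag {
        public boolean value;
    }

    /** Toy stand-in for an attribute source: one instance per attribute class. */
    public static class ToyAttributeSource {
        private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();

        /** Returns the existing instance for this class, creating it on first request. */
        public <T> T addAttribute(Class<T> type) {
            Object existing = attributes.get(type);
            if (existing == null) {
                try {
                    existing = type.getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException(e);
                }
                attributes.put(type, existing);
            }
            return type.cast(existing);
        }
    }

    public static void main(String[] args) {
        ToyAttributeSource stream = new ToyAttributeSource();

        // Filter 1 interprets the flag as "is keyword".
        BooleanFlag keywordFlag = stream.addAttribute(BooleanFlag.class);
        // Filter 2 interprets the same flag as "is filler".
        BooleanFlag fillerFlag = stream.addAttribute(BooleanFlag.class);

        keywordFlag.value = true;  // filter 1 marks a protected word
        fillerFlag.value = false;  // filter 2 records "not a filler", erasing that mark

        System.out.println(keywordFlag == fillerFlag); // prints "true": same instance
        System.out.println(keywordFlag.value);         // prints "false"
    }
}
```

Both filters receive the same object, so whichever writes last wins and neither can trust
what the flag means.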

> rename KeywordMarkerTokenFilter
> -------------------------------
>                 Key: LUCENE-2358
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Analysis
>            Reporter: Robert Muir
>            Priority: Trivial
>         Attachments: LUCENE-2358.patch
> I would like to rename KeywordMarkerTokenFilter to KeywordMarkerFilter.
> We haven't released it yet, so it's a good time to keep the name brief and consistent.

