lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Can I omit ShingleFilter's filler tokens
Date Thu, 12 May 2011 21:34:15 GMT
> we already did this in 3.1 by making a base FilteringTokenFilter class?
> a regex filter is trivial if you subclass this (we could add something like this
> untested code to the .pattern package or whatever)
> 
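> // Imports the sketch needs; package names assume the Lucene 3.1 layout.
> import java.io.IOException;
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import org.apache.lucene.analysis.FilteringTokenFilter;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
>
> public class PatternRemoveFilter extends FilteringTokenFilter {
>   private final Matcher matcher;
>   private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
>
>   public PatternRemoveFilter(boolean enablePositionIncrements, TokenStream input, Pattern pattern) {
>     super(enablePositionIncrements, input);
>     // CharTermAttribute is a CharSequence, so the matcher can wrap it directly.
>     matcher = pattern.matcher(termAtt);
>   }
>
>   @Override
>   protected boolean accept() throws IOException {
>     // Re-read the current term, then drop the token if the whole term matches.
>     matcher.reset();
>     return !matcher.matches();
>   }
> }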

I *love* CharTermAttribute and this elegant code. You can use it 1:1 for this:
because CharTermAttribute implements CharSequence, the Matcher can work on the
term directly. We did good work adding it in 3.1 as the replacement for the
ancient TermAttribute! Robert, that's fabulous :-)
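
To illustrate why this works, here is a minimal sketch that uses only the JDK,
not Lucene. A StringBuilder stands in for CharTermAttribute (both are mutable
CharSequences), and one Matcher is bound to it once and reset for every token.
The class name PatternRemoveSketch, the sample shingles, and the assumption
that filler positions show up as "_" are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not part of Lucene.
class PatternRemoveSketch {

    // Returns the tokens whose full text does NOT match the pattern.
    static List<String> filter(List<String> tokens, Pattern pattern) {
        // Reused buffer, playing the role of the shared term attribute.
        StringBuilder term = new StringBuilder();
        // Bind the matcher to the buffer once, as the filter's constructor does.
        Matcher matcher = pattern.matcher(term);
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Overwrite the buffer in place with the next token.
            term.setLength(0);
            term.append(token);
            // reset() makes the matcher pick up the buffer's new contents.
            matcher.reset();
            if (!matcher.matches()) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample shingles; "_" is assumed to mark a filler position.
        List<String> shingles = Arrays.asList(
                "please divide", "divide _", "_ this", "this sentence");
        System.out.println(filter(shingles, Pattern.compile(".*_.*")));
    }
}
```

Binding the matcher once and calling reset() per token avoids allocating a new
Matcher for every term, which is the same design choice the quoted filter makes.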

Uwe

