lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: pieces missing in reusable analyzers?
Date Mon, 10 Aug 2009 22:14:58 GMT
You can only call reset(Reader) on a Tokenizer, not on an arbitrary TokenStream.

This is why there is the SavedStreams mess in the Standard/Stop core
analyzers and in every analyzer in LUCENE-1794: each one has to keep a
reference to its Tokenizer so it can reset it on reuse...
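
For illustration only, here is a rough, self-contained sketch of the shape
of that pattern. The Sketch* classes are hypothetical stand-ins invented so
that this compiles without Lucene; they are not the real Tokenizer,
TokenFilter or Analyzer APIs. The holder is named after SavedStreams, and a
plain field stands in for getPreviousTokenStream()/setPreviousTokenStream().
The point is that the analyzer saves both ends of the chain: the tokenizer
(the only piece that can take a new Reader) and the outermost filter (what
the caller consumes).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical stand-in for a token stream; next() returns null at end of input.
abstract class SketchTokenStream {
    abstract String next();
}

// Only the tokenizer owns the Reader, so only it can be pointed at new input.
class SketchWhitespaceTokenizer extends SketchTokenStream {
    private Reader input;
    SketchWhitespaceTokenizer(Reader input) { this.input = input; }
    void reset(Reader input) { this.input = input; }
    @Override
    String next() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace((char) c)) {
                    if (sb.length() > 0) break;   // whitespace ends the current token
                } else {
                    sb.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// A filter wraps another stream and never sees the Reader.
class SketchLowerCaseFilter extends SketchTokenStream {
    private final SketchTokenStream in;
    SketchLowerCaseFilter(SketchTokenStream in) { this.in = in; }
    @Override
    String next() {
        String t = in.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

class SketchReusableAnalyzer {
    // Holds both ends of the chain, in the spirit of SavedStreams.
    private static final class SavedStreams {
        SketchWhitespaceTokenizer source;  // reset target
        SketchTokenStream result;          // what callers consume
    }

    // Plain field for brevity; an assumption of this sketch, not how Lucene stores it.
    private SavedStreams saved;

    SketchTokenStream reusableTokenStream(Reader reader) {
        if (saved == null) {
            // First call: build the chain and remember both ends.
            saved = new SavedStreams();
            saved.source = new SketchWhitespaceTokenizer(reader);
            saved.result = new SketchLowerCaseFilter(saved.source);
        } else {
            // Later calls: re-point the tokenizer and reuse the same filters.
            saved.source.reset(reader);
        }
        return saved.result;
    }
}

class SketchDemo {
    public static void main(String[] args) {
        SketchReusableAnalyzer analyzer = new SketchReusableAnalyzer();
        for (String text : new String[] {"Hello World", "Reused Chain"}) {
            SketchTokenStream ts = analyzer.reusableTokenStream(new StringReader(text));
            for (String tok = ts.next(); tok != null; tok = ts.next()) {
                System.out.println(tok);
            }
        }
    }
}
```

Filters that keep state between tokens would also need their own reset
hook, which this sketch leaves out.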

On Mon, Aug 10, 2009 at 6:10 PM, Yonik Seeley<yonik@lucidimagination.com> wrote:
> I had thought that implementing reusable analyzers in solr was going
> to be cake... but either I'm missing something, or Lucene is missing
> something.
>
> Here's the way that one used to create custom analyzers:
>
> class CustomAnalyzer extends Analyzer {
>  public TokenStream tokenStream(String fieldName, Reader reader) {
>    return new LowerCaseFilter(new NGramTokenFilter(new StandardTokenizer(reader)));
>  }
> }
>
>
> Now let's try to make this reusable:
>
> class CustomAnalyzer2 extends Analyzer {
>  public TokenStream tokenStream(String fieldName, Reader reader) {
>    return new LowerCaseFilter(new NGramTokenFilter(new StandardTokenizer(reader)));
>  }
>
>  @Override
>  public TokenStream reusableTokenStream(String fieldName, Reader reader) throws IOException {
>    TokenStream ts = getPreviousTokenStream();
>    if (ts == null) {
>      ts = tokenStream(fieldName, reader);
>      setPreviousTokenStream(ts);
>      return ts;
>    } else {
>      // uh... how do I reset a token stream?
>      return ts;
>    }
>  }
> }
>
>
> See the missing piece?  Seems like TokenStream needs a reset(Reader r)
> method or something?
>
> -Yonik
> http://www.lucidimagination.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>



-- 
Robert Muir
rcmuir@gmail.com


