lucene-java-user mailing list archives

From "Hackl, Rene" <Rene.Ha...@FIZ-Karlsruhe.DE>
Subject Re: Tokenizing text custom way
Date Tue, 25 Nov 2003 11:45:34 GMT
Hi Dragan,

> and if I enter 'time' as a search word, I don't want to get "time out" in
> results. I need exact keyword matching. I would achieve this if I tokenize
> "time out" as one token while indexing.

> Maybe someone had a similar problem? If someone knows how to handle this,
> please help me.

I've had the same problem. On some fields I now use a "NonTokenizer",
which looks like the other tokenizers except for this method:

// Every character counts as part of the token, so the input is never split.
protected boolean isTokenChar(char c)
{
  return true;
}

So "time out" would be one token.
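
To illustrate why this works, here is a minimal sketch that does not use Lucene at all. It assumes the tokenizer simply collects maximal runs of characters for which isTokenChar returns true. All class names below (SketchCharTokenizer, SketchLetterTokenizer, SketchNonTokenizer, NonTokenizerDemo) are hypothetical stand-ins, not Lucene's classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a character-based tokenizer; NOT Lucene's class.
abstract class SketchCharTokenizer {
    // Subclasses decide which characters belong inside a token.
    protected abstract boolean isTokenChar(char c);

    // Collect maximal runs of token characters as tokens.
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}

// Splits on anything that is not a letter, so "time out" becomes two tokens.
class SketchLetterTokenizer extends SketchCharTokenizer {
    @Override
    protected boolean isTokenChar(char c) {
        return Character.isLetter(c);
    }
}

// Accepts every character, so the whole input stays one token.
class SketchNonTokenizer extends SketchCharTokenizer {
    @Override
    protected boolean isTokenChar(char c) {
        return true;
    }
}

class NonTokenizerDemo {
    public static void main(String[] args) {
        System.out.println(new SketchLetterTokenizer().tokenize("time out")); // [time, out]
        System.out.println(new SketchNonTokenizer().tokenize("time out"));    // [time out]
    }
}
```

With the field indexed as a single token, a search for 'time' no longer matches "time out". Note that the query side probably needs the same treatment: if the search string is run through an analyzer that splits on whitespace, a query for "time out" would not match the single indexed token.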

HTH

René
