lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Phrase search using quotes -- special Tokenizer
Date Wed, 06 Sep 2006 01:40:27 GMT

: Sorry for the confusion and thanks for taking the time to educate me.  So, if
: I am just indexing literal values, what is the best way to do that (what
: analyzer)?  Sounds like this approach, even though it works, is not the
: preferred method.

if you truly want just the literal values then KeywordAnalyzer will work
great (it indexes each field value as a single, unmodified token) -- but
you mentioned before that you want something more complicated (case
normalization, I believe?) ... for something like that (lowercasing, but
preserving whitespace and punctuation) you'll need to write a custom
Analyzer ... that's not hard though, just glue together the
KeywordTokenizer with the LowerCaseFilter, a la...

  // KeywordTokenizer emits the whole field value as one token; LowerCaseFilter lowercases it
  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new LowerCaseFilter(new KeywordTokenizer(reader));
  }

...if there are other special rules you want, then put them in other
filters and compose your Analyzer further.
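
to make the "compose further" idea concrete, here is a rough,
dependency-free sketch of what such a chain does to a field value. It is
plain Java, not Lucene API: KeywordChainSketch, with(), analyze() and
lowerCaseKeyword() are hypothetical names made up for illustration. The only
assumption it encodes is the behaviour described above: the whole value stays
one token, and each filter rewrites that token in turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical stand-in (NOT a Lucene class): models a keyword-style
// tokenizer followed by an ordered chain of token filters.
final class KeywordChainSketch {
    private final List<UnaryOperator<String>> filters = new ArrayList<>();

    // Append a filter; filters run in the order they were added.
    KeywordChainSketch with(UnaryOperator<String> filter) {
        filters.add(filter);
        return this;
    }

    // Treat the entire field value as a single token (no splitting on
    // whitespace or punctuation), then pass it through every filter.
    List<String> analyze(String fieldValue) {
        String token = fieldValue;
        for (UnaryOperator<String> filter : filters) {
            token = filter.apply(token);
        }
        List<String> tokens = new ArrayList<>();
        tokens.add(token);
        return tokens;
    }

    // Rough analogue of LowerCaseFilter wrapped around KeywordTokenizer.
    static KeywordChainSketch lowerCaseKeyword() {
        return new KeywordChainSketch().with(s -> s.toLowerCase(Locale.ROOT));
    }
}
```

calling KeywordChainSketch.lowerCaseKeyword().analyze("Hello, World!") gives
back one lowercased token with the comma, space and exclamation mark intact;
adding another rule is just one more with(...) call, e.g. with(String::trim),
which is the same shape as wrapping one more filter around the TokenStream in
a real Analyzer.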

