lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Phrase search using quotes -- special Tokenizer
Date Wed, 06 Sep 2006 01:40:27 GMT

: Sorry for the confusion and thanks for taking the time to educate me.  So, if
: I am just indexing literal values, what is the best way to do that (what
: analyzer)?  Sounds like this approach, even though it works, is not the
: preferred method.

if you truely want just the literal values then KeywordAnalyzer will work
great -- but you mentioned before that you want something more complicated
(case normalization i believe?) ... for something like that (lowercasing,
but preserving whitespace and punctuation) you'll need to write a custom
Analyzer ... that's not hard though, just glue together the
KeywordTokenizer with the LowerCaseFilter ala...

  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new LowerCaseFilter(new KeywordTokenizer(reader));
  }

...if there are other special rules you want, then put them in other
filters and compose your Analyzer further.



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message