lucene-java-user mailing list archives

From AHMET ARSLAN <iori...@yahoo.com>
Subject Re: LowerCaseFilter fails one letter (I) of Turkish alphabet
Date Mon, 30 Nov 2009 19:22:58 GMT
> just to clarify, GreekLowerCaseFilter really shouldn't exist either. The
> final sigma problem it has (where there are two lowercase forms depending
> upon position in word), this is also solved with unicode case folding or
> collation. This is a perfect example of how lowercase is the wrong operation
> for search.
>
> and RussianLowerCaseFilter is deprecated now, it does the exact same thing
> as LowerCaseFilter.

Thank you for the explanations. I have just read the Javadoc for org.apache.lucene.collation.
If I understand correctly, it is better to remove LowerCaseFilter from the analyzer chain entirely
and add a CollationKeyFilter, built with a Collator for the appropriate Locale, right after the
Tokenizer, just as CollationKeyAnalyzer does.
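
To make the idea concrete, here is a minimal sketch of the per-token step using only the JDK's java.text.Collator, so it runs without Lucene on the classpath. The class CollationKeySketch and its methods keyFor and sameTerm are hypothetical names for illustration, not part of Lucene, and PRIMARY strength is an assumption about what a search index would want. CollationKeyFilter is built on the same Collator/CollationKey mechanism, but it additionally encodes the key as an indexable term, which this sketch leaves out.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper (not a Lucene class): derives a collation key per token. */
final class CollationKeySketch {

    /** Returns the binary collation key of a token under the given locale. */
    static byte[] keyFor(String token, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // Assumption: PRIMARY strength, which typically ignores case and
        // accent differences. Other strengths are possible.
        collator.setStrength(Collator.PRIMARY);
        CollationKey key = collator.getCollationKey(token);
        return key.toByteArray();
    }

    /** True when both tokens produce the same key, i.e. would match in search. */
    static boolean sameTerm(String a, String b, Locale locale) {
        return Arrays.equals(keyFor(a, locale), keyFor(b, locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Results for the Turkish pairs depend on the collation rules shipped
        // with the JDK in use, so no expected output is stated here.
        System.out.println(sameTerm("ISPARTA", "\u0131sparta", turkish)); // I vs dotless i
        System.out.println(sameTerm("ISPARTA", "isparta", turkish));      // I vs dotted i
        System.out.println(sameTerm("TITLE", "title", Locale.ENGLISH));
    }
}
```

Because the key is computed by the locale's collation rules rather than by a character-by-character lowercase mapping, no separate lowercasing step is needed before it in the chain.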
