lucene-solr-user mailing list archives

From Chitra <chithu.r...@gmail.com>
Subject Re: Accent insensitive search for greek characters
Date Tue, 24 Oct 2017 12:05:42 GMT
Hi Alexandre,

ICUTransformFilter works as required for Greek characters in general, but
it breaks in one case: σ and ς are both lowercase forms of Σ (sigma).

*Example:*

I indexed the terms πελάτης (indexed as πελατης) and πελάτηΣ (also indexed
as πελατης).

I get the expected results when I search for πελάτηΣ, πελάτης, or any
combination of upper- and lowercase Greek characters. But when I search
for πελατησ, I get no results.

In Greek, σ and ς are both lowercase forms of Σ (sigma); ς is the form
used at the end of a word. ICUFoldingFilter handles this case.
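
To make the mismatch concrete, here is a minimal sketch that uses only the
JDK (java.text.Normalizer and String.toLowerCase), not ICUTransformFilter
itself. The class name GreekFold and the method fold are hypothetical names
chosen for illustration. It mirrors the lowercase / NFD / remove nonspacing
marks / NFC steps and adds one extra, assumed step that maps final sigma (ς)
to medial sigma (σ), so that the indexed and searched forms agree.

```java
import java.text.Normalizer;
import java.util.Locale;

public class GreekFold {

    // Hypothetical helper: lowercases, strips accents, and unifies the two
    // lowercase sigma forms. This is an illustration, not the ICU filter.
    static String fold(String term) {
        // Java's toLowerCase turns a word-final capital sigma into final sigma (U+03C2).
        String lower = term.toLowerCase(Locale.ROOT);
        // Decompose so that accents become separate combining marks.
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        // Remove nonspacing marks (Unicode general category Mn).
        String stripped = decomposed.replaceAll("\\p{Mn}", "");
        // Assumed extra step: map final sigma (U+03C2) to medial sigma (U+03C3).
        String unified = stripped.replace('\u03C2', '\u03C3');
        // Recompose whatever is left.
        return Normalizer.normalize(unified, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String[] terms = {"πελάτης", "πελάτηΣ", "πελατησ"};
        for (String term : terms) {
            System.out.println(term + " -> " + fold(term));
        }
    }
}
```

Without the sigma-mapping step, the first two terms end in ς while the third
ends in σ, which is the mismatch described above. With it, all three terms
fold to the same string.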


Is my ICU Transliterator rule formed correctly? Please look at the code below:


TokenStream tok = new ICUTransformFilter(tok,
    Transliterator.getInstance(
        "Greek; Lower; NFD; [:Nonspacing Mark:] Remove; NFC;"));



Kindly help me to resolve this.


Regards,
Chitra
