lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?
Date Mon, 15 Jul 2013 13:13:56 GMT
Either do a custom highlighter or preprocess the query and generate an "OR" 
of the accented and unaccented terms. Solr has no magic feature to do both. 
Sure, you could do a token filter that duplicated each term and included 
both the accented and unaccented versions, but... it gets messy and is a 
pain with phrases.
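
As a rough illustration of the query-preprocessing option, the sketch below folds each query term to its unaccented form and emits an "OR" of both variants. The function names (`fold_to_ascii`, `expand_query`) and the field name `text` are hypothetical, and it assumes the indexed field holds both spellings without accent folding applied at index time. It does not escape Solr query syntax characters and, as noted above, does not handle phrases.

```python
import unicodedata


def fold_to_ascii(term: str) -> str:
    """Strip combining marks so that e.g. 'çiğli' becomes 'cigli'.

    Only handles characters that decompose under NFD; letters such as
    the Turkish dotless 'ı' are left unchanged by this sketch.
    """
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_query(user_query: str, field: str = "text") -> str:
    """Build a query string that ORs each term with its folded variant."""
    clauses = []
    for term in user_query.split():
        folded = fold_to_ascii(term)
        if folded == term:
            # Nothing to fold: a single clause is enough.
            clauses.append(f"{field}:{term}")
        else:
            clauses.append(f"({field}:{term} OR {field}:{folded})")
    # Require every (expanded) term to match.
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(expand_query("çiğli"))
```

The resulting string would be passed as the `q` parameter; because both spellings are explicit query terms, a standard highlighter has a chance to mark either one in the stored text.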

It is worth a Jira though.

-- Jack Krupansky

-----Original Message----- 
From: Furkan KAMACI
Sent: Monday, July 15, 2013 9:06 AM
To: solr-user@lucene.apache.org
Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of 
tokens are same?

When I search Google for something that contains non-ASCII characters, it
returns results for both the original and the ascified versions and
*highlights both of them*.
For example, if I search for *çiğli* on Google, the first result is:

*Çiğli* Belediyesi
www.*cigli*.bel.tr/

How can I do that in Solr? How can I indicate to Solr that *both the ascified
and non-ASCII versions of tokens are the same*?

