lucene-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Handling wildcard search containing special characters (unicode)
Date Wed, 20 Apr 2011 20:21:22 GMT

: Facing a Solr issue, I have been told that queries with a term like:
: Kiinteistösih*
: will not match the Finnish word "Kiinteistösihteeri" and that it's a
: known limitation of Lucene.

that is a misleading statement -- that type of query *can* match that 
word in a document, if the schema is configured in a way that preserves the 
raw term.

where people run into trouble is if they use stemming, or lowercasing, or 
ASCII folding, or any other form of analysis at indexing time, because 
at query time the query parser does not use analysis for prefix and 
wildcard searches (if it did, a search for something like "dogs*" might 
stem to "dog*", which is not what the user asked for).
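
to make the mismatch concrete, here is a toy sketch in plain Python. it is 
not Lucene or Solr code: `analyze`, `build_index` and `prefix_search` are 
hypothetical helpers, and the "analysis" is assumed to be nothing more than 
whitespace splitting plus lowercasing. the point is only that the prefix is 
compared against the indexed terms exactly as typed.

```python
def analyze(text):
    """Stand-in for an index-time analysis chain: split on whitespace, lowercase."""
    return [token.lower() for token in text.split()]


def build_index(docs, analyzer):
    """Map each indexed term to the set of ids of the documents containing it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def prefix_search(index, prefix):
    """Compare the prefix to the indexed terms as typed; no analysis is applied."""
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


docs = ["Kiinteistösihteeri"]

# Index that preserves the raw term (whitespace splitting only).
raw_index = build_index(docs, str.split)
# Index built with lowercasing applied at indexing time.
lowercased_index = build_index(docs, analyze)

# The raw prefix matches the raw term.
raw_hits = prefix_search(raw_index, "Kiinteistösih")
# The same prefix misses: the indexed term starts with a lowercase "k".
lowercased_hits = prefix_search(lowercased_index, "Kiinteistösih")
# Normalizing the prefix the same way as the indexed terms matches again.
normalized_hits = prefix_search(lowercased_index, "Kiinteistösih".lower())

print(raw_hits, lowercased_hits, normalized_hits)
```

in this sketch the only way to get a hit against the lowercased index is to 
normalize the prefix before searching, which mirrors the two options 
described above: either preserve the raw term in the schema, or make sure 
the prefix that is sent already looks like the indexed terms.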


PS...

http://people.apache.org/~hossman/#solr-user
Please Use "solr-user@lucene" Not "dev@lucene"

Your question is better suited for the solr-user@lucene mailing list ...
not the dev@lucene list.  The dev list is for discussing development of
the internals of Solr and the Lucene Java library ... it is *not* the 
appropriate place to ask questions about how to use Solr or the Lucene 
Java library when developing your own applications.  



-Hoss
