lucene-solr-user mailing list archives

From Mike Sokolov <soko...@ifactory.com>
Subject Re: Text field case sensitivity problem
Date Wed, 15 Jun 2011 20:34:31 GMT
I wonder whether CharFilters are applied to wildcard terms?  I suspect 
they might be.  If that's the case, you could use the MappingCharFilter 
to perform lowercasing (and strip diacritics too, if you want that).
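
If CharFilters do turn out to be applied to wildcard terms (this is
unconfirmed), a minimal sketch of the query analyzer might look like the
following. The file name mapping-lowercase.txt is hypothetical: you would
have to create it yourself, with one mapping line per uppercase character.
Only the charFilter element is new; the rest is abbreviated from the
existing config.

```xml
<!-- Sketch only: assumes CharFilters are applied to wildcard terms,
     which has not been confirmed. -->
<analyzer type="query">
  <!-- CharFilters run before the tokenizer. mapping-lowercase.txt is a
       hypothetical file you would create, with lines such as:
         "A" => "a"
         "K" => "k"
       Diacritic mappings (for example "É" => "e") could go in the same file. -->
  <charFilter class="solr.MappingCharFilterFactory"
              mapping="mapping-lowercase.txt"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- remaining query-time filters unchanged -->
</analyzer>
```

One thing to keep in mind: because the mapping runs before tokenization,
WordDelimiterFilterFactory's splitOnCaseChange would no longer see any case
changes in the query text.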

-Mike

On 06/15/2011 10:12 AM, Jamie Johnson wrote:
> So simply lowercasing the query works, but it can get complex.  The query
> that I'm executing may have things like ranges, which require some words
> to be upper case (e.g. TO).  I think this would be much better solved on
> Solr's end. Is there a JIRA issue about this?
>
> On Tue, Jun 14, 2011 at 5:33 PM, Mike Sokolov <sokolov@ifactory.com> wrote:
>
>     Oops, please s/Highlight/Wildcard/
>
>
>     On 06/14/2011 05:31 PM, Mike Sokolov wrote:
>
>         Wildcard queries aren't analyzed, I think?  I'm not completely
>         sure what the best workaround is here: perhaps simply
>         lowercasing the query terms yourself in the application.  Also
>         - I hope someone more knowledgeable will say that the new
>         HighlightQuery in trunk doesn't have this restriction, but I'm
>         not sure about that.
>
>         -Mike
>
>         On 06/14/2011 05:13 PM, Jamie Johnson wrote:
>
>             Also of interest to me: this returns results
>             http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kristine
>
>
>             On Tue, Jun 14, 2011 at 5:08 PM, Jamie Johnson
>             <jej2003@gmail.com> wrote:
>
>                 I am using the following for my text field:
>
>                 <fieldType name="text" class="solr.TextField"
>                            positionIncrementGap="100"
>                            autoGeneratePhraseQueries="true">
>                   <analyzer type="index">
>                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>                     <!-- in this example, we will only use synonyms at query time
>                     <filter class="solr.SynonymFilterFactory"
>                             synonyms="index_synonyms.txt" ignoreCase="true"
>                             expand="false"/>
>                     -->
>                     <!-- Case insensitive stop word removal.
>                          add enablePositionIncrements=true in both the index and query
>                          analyzers to leave a 'gap' for more accurate phrase queries.
>                     -->
>                     <filter class="solr.StopFilterFactory"
>                             ignoreCase="true"
>                             words="stopwords.txt"
>                             enablePositionIncrements="true"/>
>                     <filter class="solr.WordDelimiterFilterFactory"
>                             generateWordParts="1" generateNumberParts="1"
>                             catenateWords="1" catenateNumbers="1" catenateAll="0"
>                             splitOnCaseChange="1"/>
>                     <filter class="solr.LowerCaseFilterFactory"/>
>                     <filter class="solr.KeywordMarkerFilterFactory"
>                             protected="protwords.txt"/>
>                     <filter class="solr.PorterStemFilterFactory"/>
>                   </analyzer>
>                   <analyzer type="query">
>                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>                     <filter class="solr.SynonymFilterFactory"
>                             synonyms="synonyms.txt"
>                             ignoreCase="true" expand="true"/>
>                     <filter class="solr.StopFilterFactory"
>                             ignoreCase="true"
>                             words="stopwords.txt"
>                             enablePositionIncrements="true"/>
>                     <filter class="solr.WordDelimiterFilterFactory"
>                             generateWordParts="1" generateNumberParts="1"
>                             catenateWords="0" catenateNumbers="0" catenateAll="0"
>                             splitOnCaseChange="1"/>
>                     <filter class="solr.LowerCaseFilterFactory"/>
>                     <filter class="solr.KeywordMarkerFilterFactory"
>                             protected="protwords.txt"/>
>                     <filter class="solr.PorterStemFilterFactory"/>
>                   </analyzer>
>                 </fieldType>
>
>                 I have a field defined as
>                 <field name="Person_Name" type="text" stored="true"
>                 indexed="true" />
>
>                 When I go to the following URL I get results:
>                 http://localhost:8983/solr/select?defType=lucene&q=Person_Name:kris*
>                 but if I do
>                 http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kris*
>                 I get nothing.  I thought the LowerCaseFilterFactory would have
>                 handled lowercasing both the query and what is being indexed.
>                 Am I missing something?
>
>
