lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@buyways.nl>
Subject RE: Re: Phrase search
Date Mon, 02 Aug 2010 19:54:46 GMT
Hi,

 

Queries on an analyzed field need to be analyzed as well, or they might not match. You can
configure the WordDelimiterFilterFactory so that it does not split into multiple tokens
on numerics; see the splitOnNumerics parameter [1].

 

[1]: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory
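
For illustration, here is a minimal sketch of the filter line from the schema quoted further down with splitOnNumerics added. It assumes a Solr release whose WordDelimiterFilterFactory supports the parameter, which may not be true of older versions such as the 1.2 mentioned in the original post. The other attribute values are copied unchanged from that schema, and the same change would be needed in both the index and query analyzers:

```xml
<!-- Assumption: this Solr version supports splitOnNumerics.
     splitOnNumerics="0" stops the filter from splitting at letter/digit
     transitions, so "Apple2" is no longer broken apart there. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="0"
        generateNumberParts="1"
        catenateWords="1"
        catenateNumbers="1"
        catenateAll="0"
        splitOnNumerics="0"/>
```

Documents indexed before the change would most likely need to be reindexed for the index-time analysis to take effect.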

 

Cheers,


 
-----Original message-----
From: johnmunir@aol.com
Sent: Mon 02-08-2010 21:29
To: solr-user@lucene.apache.org; 
Subject: Re: Phrase search





Thanks for the quick response.

Which part of my WordDelimiterFilterFactory is changing "Apple 2" to "Apple2"?  How do I
fix it?  Also, I'm really confused about this.  I was under the impression a phrase search
is not impacted by the analyzer, no?

-M


-----Original Message-----
From: Markus Jelsma <markus.jelsma@buyways.nl>
To: solr-user@lucene.apache.org
Sent: Mon, Aug 2, 2010 2:27 pm
Subject: RE: Phrase search


Well, the WordDelimiterFilterFactory in your query analyzer clearly makes "Apple
2" out of "Apple2", that's what it's for. If you're looking for an exact match,
use a string field. Check the output with the debugQuery=true parameter.
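
As a rough sketch of the string-field suggestion, an exact-match copy of a text field could be declared in schema.xml along these lines. The field names "title" and "title_exact" are hypothetical placeholders and do not come from the original schema:

```xml
<!-- Hypothetical field names: "title" and "title_exact" are placeholders. -->
<!-- StrField is not analyzed, so the value is indexed as one verbatim token. -->
<fieldType name="string" class="solr.StrField"/>

<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_exact" type="string" indexed="true" stored="false"/>

<!-- Copy the raw input into the exact-match field at index time. -->
<copyField source="title" dest="title_exact"/>
```

Note that a string field matches only when the whole field value is identical to the query, so this fits exact-value lookups, not phrase matching inside longer text.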

Cheers, 

-----Original message-----
From: johnmunir@aol.com
Sent: Mon 02-08-2010 20:18
To: solr-user@lucene.apache.org; 
Subject: Phrase search

Hi All,
I don't understand why I'm getting this behavior.  I was under the impression that if
I search for "Apple 2" (with quotes and a space before the 2) it will give me
different results vs. if I search for "Apple2" (with quotes and no space before
the 2), but I'm not!  Why?
Here is my fieldType setting from my schema.xml:
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <!-- in this example, we will only use synonyms at query time
     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
     -->
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <!-- <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> -->
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldType>
What am I missing?!  Which part of my solr.WordDelimiterFilterFactory needs to
change (if that's where the issue is)?
I'm using Solr 1.2.
Thanks in advance.
-M

