lucene-solr-user mailing list archives

From Thorsten Scherler <thorsten.scherler....@juntadeandalucia.es>
Subject Re: Strange behavior when searching with accents
Date Thu, 20 Sep 2007 12:12:40 GMT
On Thu, 2007-09-20 at 14:01 +0200, Thierry Collogne wrote:
> I have entered the matthé term in the analyzer, but as far as I
> understand, it should be ok. I have made some screenshots with the results.
> 
> http://farm2.static.flickr.com/1407/1412619772_0b697789cd_o.jpg
> 
> http://farm2.static.flickr.com/1245/1412619774_3351b287bc_o.jpg
> 
> I find it strange that the second screenshot doesn't give any matches.
> 
> Can someone take a look at them and perhaps clarify why it does not work?

See my other response, but in the 2nd screenshot the "query" field has
been changed to the non-accented form.

Further, you may want to use the "verbose output" option to better analyze
what each filter does.
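
One possible cause, offered as an assumption rather than something confirmed
in this thread: in the quoted schema, solr.ISOLatin1AccentFilterFactory runs
after solr.EnglishPorterFilterFactory, so the stemmer sees "matthé" and
"matthe" as different inputs and may stem them differently before the accent
is removed. A minimal sketch of a reordered chain, using only the filter
factories already present in the quoted schema, might look like this:

```xml
<!-- Sketch only: same filters as the quoted "index" analyzer, reordered.
     Assumption (not stated in the thread): stripping accents before
     stemming makes "matthé" and "matthe" reach the stemmer as the same token. -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- moved up: accent removal now happens before the stemmer -->
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
  <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```

If this were tried, the same reordering would presumably be needed in the
"query" analyzer, and existing documents would have to be reindexed, since
index-time analysis only applies to newly indexed content. The analyzer admin
page with "verbose output" should show whether both spellings then end up as
the same token.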

salu2

> 
> Thank you.
> 
> 
> On 20/09/2007, Thierry Collogne <thierry.collogne@gmail.com> wrote:
> >
> > We are using this schema definition
> >
> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >       <analyzer type="index">
> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >         <!-- in this example, we will only use synonyms at query time
> >         <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> >         -->
> >         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >         <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> >         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >         <filter class="solr.ISOLatin1AccentFilterFactory"/>
> >       </analyzer>
> >       <analyzer type="query">
> >         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> >         <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
> >         <filter class="solr.LowerCaseFilterFactory"/>
> >         <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> >         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >         <filter class="solr.ISOLatin1AccentFilterFactory"/>
> >       </analyzer>
> >     </fieldType>
> >
> > I will take a look at the analyzer tool.
> >
> > Thank you both for the quick response.
> >
> > On 20/09/2007, Bertrand Delacretaz <bdelacretaz@apache.org> wrote:
> > >
> > > On 9/20/07, Thierry Collogne <thierry.collogne@gmail.com> wrote:
> > >
> > > > ..when we search for "matthé" or for "matthe", we get two totally
> > > > different results....
> > >
> > > The analyzer admin tool should help you find out what's happening, see
> > > http://wiki.apache.org/solr/FAQ#head-b25df8c8393bbcca28f1f344c432975002e29ca9
> > >
> > >
> > > -Bertrand
> > >
> >
> >
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions

