lucene-solr-user mailing list archives

From "Robert Petersen" <rober...@buy.com>
Subject RE: SynonymFilterFactory case changes
Date Tue, 26 Apr 2011 23:19:30 GMT
But in this case lowercase comes after WDF.  My question is this: when the SynonymFilter
gets a hit on a synonym, and the entries in the synonyms.txt file are all in lower case, do I
need to add the case-changing versions so that WDF can still work on case changes?  It appears
the matched text is replaced verbatim by what is in the txt file, and that defeats the WDF filter.
In fact, adding the case-changing versions of this term to the synonyms.txt file makes this
use case work.  (yay)
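
To make the behavior concrete, here is a rough Python sketch of the chain as I understand it
(whitespace tokenizer, then synonyms, then split on case change, then lowercase).  This is a toy
model and not Lucene/Solr code: the function names are made up for illustration, the synonym step
simply emits the group exactly as written in the file, and the WDF step is reduced to splitting
at a lower-to-upper boundary, ignoring positions and the other WDF options.

```python
import re


def split_on_case_change(token):
    # Toy stand-in for WDF's split-on-case-change: break at lower->upper boundaries.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


def synonym_filter(tokens, groups, ignore_case=True):
    # Assumption: on a hit, the token is replaced by the group as written in the file.
    out = []
    for tok in tokens:
        key = tok.lower() if ignore_case else tok
        for group in groups:
            keys = [g.lower() if ignore_case else g for g in group]
            if key in keys:
                out.extend(group)
                break
        else:
            out.append(tok)  # no synonym hit, pass the token through unchanged
    return out


def analyze(text, groups):
    tokens = text.split()                    # whitespace tokenizer
    tokens = synonym_filter(tokens, groups)  # synonyms first
    parts = []
    for tok in tokens:                       # then split on case change
        parts.extend(split_on_case_change(tok))
    return [p.lower() for p in parts]        # lowercase last


# Lower-case-only entries: nothing is left for the case-change split to work on.
print(analyze("McAfee", [["macafee", "mcafee"]]))
# With the mixed-case form added, the split has a case change to find.
print(analyze("McAfee", [["macafee", "mcafee", "McAfee"]]))
```

In this sketch the first call yields only the two lower-case synonyms, while the second also
yields "mc" and "afee", which matches what I see after adding the case-changing entry.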

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, April 26, 2011 3:39 PM
To: solr-user@lucene.apache.org
Subject: Re: SynonymFilterFactory case changes

Yes, order does matter.  You're right, putting, say, lowercase in front
of WordDelimiter... will mess up the operations of WDFF.
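
A toy illustration of the ordering point, in Python rather than Solr config.  The two
functions below are simplified stand-ins written for this example (not the real filters):
"word_delimiter" only splits at a lower-to-upper boundary, and "lowercase" just lowercases.

```python
import re


def word_delimiter(tokens):
    # Simplified stand-in: split each token at lower->upper case boundaries only.
    out = []
    for tok in tokens:
        out.extend(re.sub(r"(?<=[a-z])(?=[A-Z])", " ", tok).split())
    return out


def lowercase(tokens):
    return [tok.lower() for tok in tokens]


# Split first, lowercase second: the case change is still visible to the splitter.
wdf_then_lower = lowercase(word_delimiter(["McAfee"]))

# Lowercase first: the case change is gone before the splitter ever sees the token.
lower_then_wdf = word_delimiter(lowercase(["McAfee"]))

print(wdf_then_lower)  # ['mc', 'afee']
print(lower_then_wdf)  # ['mcafee']
```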

The admin/analysis page is *extremely* useful for understanding what
happens in the analysis of input. Make sure to check the "verbose"
checkbox.

Best
Erick

On Tue, Apr 26, 2011 at 5:10 PM, Robert Petersen <robertpe@buy.com> wrote:
> So if there is a hit in the synonym filter factory, do I need to put the
> various case changes for a term so that the following
> WordDelimiterFilter analyzer can do its 'split on case changes' work?
> Here we see SynonymFilterFactory makes all terms lowercase because this
> is what is in my synonyms.txt file and I have ignoreCase=true:
> "macafee, mcafee"
>
> Index Analyzer
>
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position           1
> term text               McAfee
> term type               word
> source start,end        0,6
> payload
>
> org.apache.solr.analysis.SynonymFilterFactory
> {synonyms=index_synonyms.txt, expand=true, ignoreCase=true}
> term position           1
> term text               macafee
>                         mcafee
> term type               word
>                         word
> source start,end        0,6
>                         0,6
> payload
>
>
