lucene-solr-user mailing list archives

From "Robert Petersen" <rober...@buy.com>
Subject RE: SynonymFilterFactory case changes
Date Wed, 27 Apr 2011 17:42:49 GMT
Yes I did, but that's cool because it is useful to make the final determination explicit here
on the group for the benefit of other users.  :)

Thanks
Robi

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, April 26, 2011 5:10 PM
To: solr-user@lucene.apache.org
Subject: Re: SynonymFilterFactory case changes

Ahhh, I mis-read your post..

First, it's not the SynonymFilterFactory that's lowercasing anything. The
ignoreCase="true" affects the matching, not the output. The output is
probably lowercased because you have it that way in the synonyms.txt
file. At least that's what I just saw using the analysis page from the
Solr admin page.

So yes, if you want the WDF to do anything on tokens put into the input
stream by SynonymFilterFactory, the replacement text in the synonyms
file needs to carry the case you want WDF to see.
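
To make the interaction concrete, here is a toy Python sketch. It is not
Solr code: synonym_filter, split_on_case_change and analyze are
hypothetical stand-ins that only mimic the two behaviours described
above (case-insensitive matching with verbatim output, then a split on
lower-to-upper case changes). The real filters have many more options,
and the "McAfee" entry in the second synonym group is an assumed
addition for illustration.

```python
def synonym_filter(token, groups, ignore_case=True):
    """Toy expanding synonym filter: matching may ignore case,
    but the emitted tokens are exactly what the synonym file contains."""
    key = token.lower() if ignore_case else token
    for group in groups:
        keys = [g.lower() if ignore_case else g for g in group]
        if key in keys:
            return list(group)  # output is verbatim from the file
    return [token]  # no match: token passes through unchanged


def split_on_case_change(token):
    """Toy split on lower-to-upper transitions, e.g. McAfee -> Mc, Afee."""
    parts, start = [], 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


def analyze(token, groups):
    """Run the synonym step first, then the case-change split."""
    out = []
    for t in synonym_filter(token, groups):
        out.extend(split_on_case_change(t))
    return out


# Synonym file with lowercase entries only (as in the original question).
lower_only = [["macafee", "mcafee"]]
# Same group with a mixed-case entry added (assumed for illustration).
with_case = [["macafee", "mcafee", "McAfee"]]

print(analyze("McAfee", lower_only))
print(analyze("McAfee", with_case))
```

With the lowercase-only group, the case change in the input is gone
before the split step runs, so nothing is split. With the mixed-case
entry present, the split step has something to work on.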

But I think you already figured all that out....

Best
Erick

On Tue, Apr 26, 2011 at 7:19 PM, Robert Petersen <robertpe@buy.com> wrote:
> But in this case lowercase is after WDF.  My question is this: when you get a hit in
> the SynonymFilter on a synonym, and the entries in the synonyms.txt file are all in
> lower case, do I need to add the mixed-case versions so that WDF can still work on case
> changes?  It appears the synonym text is replaced verbatim by what is in the txt file,
> and that defeats the WDF filter.  In fact, adding the mixed-case versions of this term
> to the synonyms.txt file makes this use case work.  (yay)
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Tuesday, April 26, 2011 3:39 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SynonymFilterFactory case changes
>
> Yes, order does matter.  You're right, putting, say, lowercase in front
> of WordDelimiter... will mess up the operations of WDFF.
>
> The admin/analysis page is *extremely* useful for understanding what
> happens in the analysis of input. Make sure to check the "verbose"
> checkbox.
>
> Best
> Erick
>
> On Tue, Apr 26, 2011 at 5:10 PM, Robert Petersen <robertpe@buy.com> wrote:
>> So if there is a hit in the synonym filter factory, do I need to put the
>> various case changes for a term so that the following
>> WordDelimiterFilter can do its 'split on case changes' work?
>> Here we see SynonymFilterFactory makes all terms lowercase because this
>> is what is in my synonyms.txt file and I have ignoreCase=true:
>> "macafee, mcafee"
>>
>> Index Analyzer
>> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
>> term position   1
>> term text       McAfee
>> term type       word
>> source start,end        0,6
>> payload
>> org.apache.solr.analysis.SynonymFilterFactory
>> {synonyms=index_synonyms.txt, expand=true, ignoreCase=true}
>> term position   1
>> term text       macafee | mcafee
>> term type       word | word
>> source start,end        0,6 | 0,6
>> payload
>>
>>
>
