lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@buyways.nl>
Subject RE: Multi word synomyms
Date Tue, 03 Aug 2010 16:57:29 GMT
Hi,

 

This happens because your tokenizer will generate separate tokens for `exercise dvds`, so
the SynonymFilter will try to find declared synonyms for `exercise` and `dvds` separately.
Its behavior is documented [1] on the wiki.

 

[1]: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
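
To make the difference concrete, here is a toy sketch in Python. It is not Solr's implementation; `SYNONYMS`, `apply_synonyms` and `analyze` are hypothetical names used only for illustration. It assumes a single rule, `exercise dvds => fitness`, and compares analyzing the whole string at once with analyzing each whitespace-separated word on its own.

```python
# Toy model only: hypothetical names, not Solr's actual code.
# One multi-word rule, equivalent to: exercise dvds => fitness
SYNONYMS = {("exercise", "dvds"): ["fitness"]}


def apply_synonyms(tokens, rules=SYNONYMS):
    """Replace the longest matching run of tokens with its synonyms."""
    out = []
    i = 0
    max_len = max((len(key) for key in rules), default=1)
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in rules:
                out.extend(rules[key])
                i += n
                break
        else:
            # No rule starts at this token: pass it through unchanged.
            out.append(tokens[i])
            i += 1
    return out


def analyze(text):
    """Whitespace-tokenize, then run the synonym step."""
    return apply_synonyms(text.split())


# The whole string is analyzed in one pass, so the two-word rule can match.
whole = analyze("exercise dvds")

# Each word is analyzed on its own, so the two-word rule never sees both tokens.
per_word = [tok for chunk in "exercise dvds".split() for tok in analyze(chunk)]

print(whole)     # ['fitness']
print(per_word)  # ['exercise', 'dvds']
```

The first result corresponds to what the analysis page shows, the second to the two separate terms in your parsed query.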

 

Cheers,
 
-----Original message-----
From: Qwerky <neil.j.taylor@hmv.co.uk>
Sent: Tue 03-08-2010 18:35
To: solr-user@lucene.apache.org; 
Subject: Multi word synomyms


I'm having trouble getting multi word synonyms to work. As an example, I have
the following synonym:

exercise dvds => fitness

When I search for exercise dvds I want to return all docs in the index which
contain the keyword fitness. I've read the wiki about
solr.SynonymFilterFactory which recommends expanding the synonym when
indexing, but I'm not sure this is what I want as none of my documents have
the keywords exercise dvds.
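
As a rough illustration of how a rule like the one above breaks down into token sequences, here is a small Python sketch. It assumes the `left => right` form shown above, with commas separating alternatives on either side; `parse_rule` is a hypothetical helper, not part of Solr.

```python
def parse_rule(line):
    """Split 'a b, c => d' into (left phrases, right phrases) as token tuples."""
    lhs, sep, rhs = line.partition("=>")
    if not sep:
        raise ValueError("not an explicit mapping: %r" % line)

    def phrases(side):
        # Each comma-separated alternative becomes a tuple of its words.
        return [tuple(p.split()) for p in side.split(",") if p.strip()]

    return phrases(lhs), phrases(rhs)


print(parse_rule("exercise dvds => fitness"))
# ([('exercise', 'dvds')], [('fitness',)])
```

The left-hand side is a two-token phrase, so it can only match when both tokens reach the synonym step together.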

Here is the field definition from my schema.xml:

[field definition XML not preserved in the archive]

When I test my search with the analysis page on the admin console it seems
to work fine;

Query Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory   {}

term position      1          2
term text          exercise   dvds
term type          word       word
source start,end   0,8        9,13
payload

org.apache.solr.analysis.SynonymFilterFactory   {ignoreCase=true, synonyms=synonyms.txt, expand=true}

term position      1
term text          fitness
term type          word
source start,end   0,13
payload

org.apache.solr.analysis.TrimFilterFactory   {}

term position      1
term text          fitness
term type          word
source start,end   0,13
payload

org.apache.solr.analysis.StopFilterFactory   {ignoreCase=true, enablePositionIncrements=true, words=stopwords.txt}

term position      1
term text          fitness
term type          word
source start,end   0,13
payload

org.apache.solr.analysis.LowerCaseFilterFactory   {}

term position      1
term text          fitness
term type          word
source start,end   0,13
payload

org.apache.solr.analysis.SnowballPorterFilterFactory   {language=English, protected=protwords.txt}

term position      1
term text          fit
term type          word
source start,end   0,13
payload


...but when I perform the search it doesn't seem to use the
SynonymFilterFactory;



0
0

 exercise dvds
 0

 on
 
 standard
 
 
 2.2
 standard

 on
 *,score
 10

.....

exercise dvds
exercise dvds
PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds

PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds


-- 
View this message in context: http://lucene.472066.n3.nabble.com/Multi-word-synomyms-tp1019722p1019722.html
Sent from the Solr - User mailing list archive at Nabble.com.
