lucene-solr-user mailing list archives

From Amar Raja <amar.r...@thecommercepartnership.com>
Subject SynonymGraphFilterFactory with edismax
Date Thu, 02 Nov 2017 12:31:53 GMT
Hello,

I have the following field definition:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

And the following two synonym definitions:

kids => boys,girls
metallic => rose gold,metallic

The intent being a user searching for "kids" should get girls or boys
results, but searching for "boys" will not bring back girls results.
Similarly searching for "metallic" should bring back results for either
"metallic" or "rose gold", but the search for "rose gold" should not bring
back "metallic".
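
To make the intended behaviour concrete, here is a rough sketch in Python. It is not Solr code; the `SYNONYMS` table and the `expand` function are made-up names that only model the one-way mapping I am after, where a term on the left expands to the alternatives on the right and nothing expands in the other direction.

```python
# Hypothetical model of the intended one-way expansion; this is not Solr code.
SYNONYMS = {
    "kids": ["boys", "girls"],
    "metallic": ["rose gold", "metallic"],
}

def expand(term):
    """Return the alternatives a query term should match (any one is enough)."""
    key = term.lower()
    # Terms with no left-hand-side rule only match themselves.
    return SYNONYMS.get(key, [key])

print(expand("kids"))       # alternatives for a mapped term
print(expand("rose gold"))  # a right-hand-side term is not expanded back
```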

Another parameter I have set is q.op=AND, i.e. a search for "boys tops"
should return only documents in which both terms occur.

The first synonym works well, producing the following dismax query:

(+(+DisjunctionMaxQuery((Synonym(web_name:boi web_name:girl))~1.0)))/no_coord

However, for the second I get this:

(+(+DisjunctionMaxQuery(((((+web_name:rose +web_name:gold) web_name:metal)~2))~1.0)))/no_coord

It seems that whenever any synonym on the RHS consists of more than one
term, the query requires all of the synonyms to match, so in this case only
documents containing both "metallic" and "rose gold" will match.
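
As far as I can tell, the "~2" on the inner boolean query is a minimum-should-match of 2 applied to its two optional clauses. A small Python sketch of how I am reading that query (again not Solr code; `matches` is a made-up helper and the sets stand in for the stemmed terms of a document's web_name field):

```python
def matches(doc_terms, min_should_match):
    """Model ((+rose +gold) metal)~N against a set of stemmed terms."""
    clauses = [
        {"rose", "gold"} <= doc_terms,  # (+web_name:rose +web_name:gold)
        "metal" in doc_terms,           # web_name:metal
    ]
    # The document matches when at least N of the optional clauses match.
    return sum(clauses) >= min_should_match

rose_gold_only = {"rose", "gold"}
metallic_only = {"metal"}
both = {"rose", "gold", "metal"}

# With ~2, as in the query above, only the document with both matches.
print([matches(d, 2) for d in (rose_gold_only, metallic_only, both)])
# With ~1, which is what I expected, any of the three would match.
print([matches(d, 1) for d in (rose_gold_only, metallic_only, both)])
```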

Any ideas where I am going wrong?
