lucene-solr-user mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: Solr mm is field Level in case sow is false
Date Tue, 28 Nov 2017 14:32:30 GMT
Hi Aman,

From the last bullet in the “Caveats and remaining issues” section of my query-time multi-word
synonyms blog post <https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/>,
in part:

> sow=false changes the queries edismax produces over multiple fields when
> any of the fields’ query-time analysis differs from the other fields’ [...]
> This can change results in general, but quite significantly when combined
> with the mm (min-should-match) request parameter: since min-should-match
> applies per field instead of per term, missing terms in one field’s analysis
> won’t disqualify docs from matching.

One effective way of addressing this issue is to make all queried fields use the same analysis,
e.g. by copy-fielding the subset of fields that are different into ones that are the same,
and then querying against the target fields instead.
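
As a minimal sketch of that approach for the schema quoted below, assuming nameSearch uses
the text_word_delimiter type and brandSearch uses text_exact: copy brandSearch into a new
field that shares nameSearch's type, then list the new field in qf. The field name
brandSearchWd is hypothetical, and the fragment belongs inside the existing <schema> element.

```xml
<!-- Hypothetical target field: same fieldType (and so the same query-time
     analysis) as nameSearch. Not stored, since it is only used for matching. -->
<field name="brandSearchWd" type="text_word_delimiter" indexed="true" stored="false"/>

<!-- copyField copies the raw source value before analysis, so brandSearchWd
     is analyzed with text_word_delimiter rather than text_exact. -->
<copyField source="brandSearch" dest="brandSearchWd"/>
```

The request would then use qf=nameSearch^7 brandSearchWd. Existing documents need to be
reindexed before the new field is populated, and the copy no longer has the behavior of
text_exact, so whether this trade-off is acceptable depends on how brandSearch is used.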

--
Steve
www.lucidworks.com

> On Nov 28, 2017, at 5:25 AM, Aman Deep singh <amandeep.cool99@gmail.com> wrote:
> 
> Hi,
> When sow is set to false, the Solr query is generated a little differently compared to sow=true
> 
> Solr version -6.6.1
> 
> User query -Asus ZenFone Go ZB5 Smartphone
> mm is set to 100%
> qf=nameSearch^7 brandSearch
> 
> field definition
> 
> 1. nameSearch
> <fieldType name="text_word_delimiter" class="solr.TextField" autoGeneratePhraseQueries="false" positionIncrementGap="100">
>    <analyzer type="index">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;" replacement="and"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="[^\dA-Za-z ]" replacement=" "/>
>        <filter class="solr.WordDelimiterFilterFactory" catenateNumbers="1" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" preserveOriginal="1" catenateAll="1" catenateWords="1"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>    <analyzer type="query">
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="&amp;" replacement="and"/>
>        <filter class="solr.PatternReplaceFilterFactory" pattern="[^\dA-Za-z ]" replacement=" "/>
>        <filter class="solr.ManagedSynonymGraphFilterFactory" synonyms="synonyms.txt"/>
>        <filter class="solr.WordDelimiterFilterFactory" catenateNumbers="0" generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1" splitOnNumerics="1" preserveOriginal="0" catenateAll="0" catenateWords="0"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
> </fieldType>
> 
> 2. brandSearch
> <fieldType name="text_exact" class="solr.TextField" autoGeneratePhraseQueries="true" positionIncrementGap="100">
>    <analyzer type="index">
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>    </analyzer>
>    <analyzer type="query">
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>    </analyzer>
> </fieldType>
> 
> 
> with sow=false
> "parsedquery":"(+DisjunctionMaxQuery((((brandSearch:asus brandSearch:zenfone brandSearch:go brandSearch:zb5 brandSearch:smartphone)~5) | ((nameSearch:asus nameSearch:zen nameSearch:fone nameSearch:go nameSearch:zb nameSearch:5 nameSearch:smartphone)~7)^7.0)))/no_coord",
> 
> with sow=true
> "parsedquery":"(+(DisjunctionMaxQuery((brandSearch:asus | (nameSearch:asus)^7.0)) DisjunctionMaxQuery((brandSearch:zenfone | ((nameSearch:zen nameSearch:fone)~2)^7.0)) DisjunctionMaxQuery((brandSearch:go | (nameSearch:go)^7.0)) DisjunctionMaxQuery((brandSearch:zb5 | ((nameSearch:zb nameSearch:5)~2)^7.0)) DisjunctionMaxQuery((brandSearch:smartphone | (nameSearch:smartphone)^7.0)))~5)/no_coord",
> 
> 
> 
> If you compare the two parsed queries, in the sow=false case mm is applied per field,
> while in the sow=true case mm is applied across the fields.
> 
> We need to use sow=false, as it is the only way to use multi-word synonyms.
> Any idea why it behaves this way, and is there any way to fix it so that mm works across
> the fields in qf?
> 
> Thanks,
> Aman Deep Singh

