lucene-solr-user mailing list archives

From Roy Lim <royster....@gmail.com>
Subject Re: Multi-word Synonyms - how does sow parameter work?
Date Thu, 16 Aug 2018 22:49:10 GMT
Thanks Andrea for the tip.  I wasn't aware of the autoGeneratePhraseQueries
option for text fields, will definitely keep it in mind.
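
For reference, my understanding of the suggestion is a field type along these
lines.  This is an untested sketch: it is my original definition with only the
autoGeneratePhraseQueries attribute added, and the remaining filters are
elided as before.
----
<fieldType name="myfield" class="solr.TextField"
positionIncrementGap="100" autoGeneratePhraseQueries="true">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory" />
     <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true"
synonyms="synonyms.txt"/>
         ...
----
If it behaves as Andrea describes, I would expect the parsed query to become
the phrase myfield:"liquid crystal display" rather than three OR clauses, but
I have not verified that.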

But I wonder whether this is related to the query parser fix that introduced
the sow parameter: when it is false (which looks like the default in Solr 7),
multi-word input should be sent as a 'single input' (see
https://issues.apache.org/jira/browse/LUCENE-2605).  That issue makes no
mention of autoGeneratePhraseQueries.

I think this is where my confusion lies: as a non-developer, I'm
unfortunately not clear on what 'multiwords will be sent as a single input'
means.  Does it mean the input is treated as a phrase query?  That the terms
are joined with AND?  So far, as mentioned, all I observe is plain OR
clauses, which is no different from before the fix.

Thanks again!



On Thu, Aug 16, 2018 at 12:39 AM, Andrea Gazzarini <a.gazzarini@sease.io>
wrote:

> Hi Roy, I think you are missing autoGeneratePhraseQueries=true in the
> field type definition.
> I was on a slightly different use case when I ran into the same issue (I
> was using synonym expansion at query time) and honestly I didn't understand
> why this is not the default and implicit behavior. In other words, like
> you, I can't imagine a scenario where I would want a multi-term synonym to
> be destructured into multiple OR clauses.
>
> Best,
> Andrea
>
>
> On 16/08/18 02:07, Roy Lim wrote:
>
>> I am not using edismax (eventually I would like to get there) but I'm just
>> testing with standard query right now.  Original posting:
>>
>> I'm trying to figure out why the multi-word synonym expansion is not
>> working correctly (or, at least what I'm misunderstanding).  Specifically,
>> when I test a standard query with Solr Admin it appears to still split on
>> whitespace.
>>
>> Here is my setup:
>> - Solr 7.2.1
>> - synonym example: LCD => liquid crystal display
>> - q=myfield:LCD
>> - added parameter: sow=false
>> - myfield schema looks like (analyzer both applicable to index and query
>> time):
>> ----
>> <fieldType name="myfield" class="solr.TextField"
>> positionIncrementGap="100">
>>    <analyzer>
>>      <tokenizer class="solr.StandardTokenizerFactory" />
>>      <filter class="solr.SynonymGraphFilterFactory" ignoreCase="true"
>> synonyms="synonyms.txt"/>
>>          ...
>> ----
>>
>> When debugging the query, Solr Admin shows the parsed query as:
>> ----
>> myfield:liquid myfield:crystal myfield:display
>> ----
>>
>> With OR as the default operator, this would incorrectly match documents
>> containing any of those words, whereas I would expect it to require all
>> of them...
>>
>> Should it not do a phrase query search for the exact translated synonym,
>> "liquid crystal display"?
>>
>>
>>
>> On Wed, Aug 15, 2018 at 5:01 PM, Doug Turnbull <
>> dturnbull@opensourceconnections.com> wrote:
>>
>>> Also share your fieldType settings for myfield as well from your schema
>>>
>>> On Wed, Aug 15, 2018 at 8:00 PM Doug Turnbull <
>>> dturnbull@opensourceconnections.com> wrote:
>>>
>>>> Aside from the screenshot issue, one thing to check: are you searching
>>>> with defType=edismax ?
>>>>
>>>> As in
>>>> q=lcd&qf=myfield&sow=false&defType=edismax
>>>>
>>>> ?
>>>>
>>>> Also, sow=false should be the default on Solr 7 and above
>>>>
>>>> Doug
>>>>
>>>> On Wed, Aug 15, 2018 at 6:27 PM Roy Lim <royster.lim@gmail.com> wrote:
>>>>
>>>>> I'm trying to figure out why the multi-word synonym expansion is not
>>>>> working correctly.  Specifically, when I test a standard query with
>>>>> Solr Admin it is still splitting on whitespace.
>>>>>
>>>>> Here is my setup:
>>>>> - Solr 7.2.1
>>>>> - synonym LCD => liquid crystal display
>>>>> - q=myfield:LCD
>>>>> - added: sow=false
>>>>> - myfield looks like:
>>>>>
>>>>>
>>>>> Solr Admin shows the parsed query looks like:
>>>>>
>>>>> myfield:liquid myfield:crystal myfield:display
>>>>>
>>>>> (default operator being OR), which would incorrectly match documents
>>>>> with any of those words, but not all, which is what I would expect...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>>>>
>>>> --
>>>> CTO, OpenSource Connections
>>>> Author, Relevant Search
>>>> http://o19s.com/doug
>>>>
>>> --
>>> CTO, OpenSource Connections
>>> Author, Relevant Search
>>> http://o19s.com/doug
>>>
>>>
>
