lucene-solr-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: changed query parsing between 4.10.4 and 5.5.3?
Date Thu, 15 Sep 2016 06:28:16 GMT
Your statement "using the old behaviour as a baseline for checking the
correctness of 5.5 behaviour" might be a point of view.

Let me give an example. My query
q=(text:(star AND trek AND wars)^200 OR text:("star trek wars")^350)
returns 159 hits from 99 million records in the index (version 4.10.4).
I checked all 159 hits; they are correct.

The same query against the same content, indexed with 5.5.3 and also
holding 99 million records, returns 0 (zero) hits.

What do you think about this result?

By the way, after copying ExtendedDismaxQParser from 4.10.4 to 5.5.3 I now
get 137 hits. I really don't care about the difference, but at least
I get some hits out of 99 million records and they are correct.

Regards,
Bernd
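
One way to put the two installations side by side is to send the identical
edismax request to each with debugQuery=true and compare the hit count and
the parsed query. The Python sketch below only builds the request URL and
pulls those two values out of an already decoded JSON response. The base
URLs and core names are hypothetical, and the thread does not say which
other parameters (mm, qf and so on) were actually sent, so none are set here.

```python
from urllib.parse import urlencode

# The query exactly as quoted in the message above.
QUERY = '(text:(star AND trek AND wars)^200 OR text:("star trek wars")^350)'


def build_debug_url(base_url: str, q_op: str) -> str:
    """Build a /select URL that asks Solr for debug output.

    base_url is a hypothetical core URL such as
    http://localhost:8983/solr/core_5x (an assumption, not from the thread).
    """
    params = {
        "q": QUERY,
        "defType": "edismax",   # use the Extended DisMax parser
        "q.op": q_op,           # "AND" or "OR"
        "debugQuery": "true",   # include parsedquery in the response
        "rows": "0",            # only the hit count is needed
        "wt": "json",
    }
    return base_url + "/select?" + urlencode(params)


def summarize(response: dict) -> tuple:
    """Return (numFound, parsedquery_toString) from a decoded JSON response."""
    return (
        response["response"]["numFound"],
        response["debug"]["parsedquery_toString"],
    )
```

Calling build_debug_url once per installation, with q.op set to AND and then
to OR, gives four URLs whose summaries can be compared line by line.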


On 15.09.2016 at 01:41, Greg Pendlebury wrote:
> I'm sorry that's been your experience Bernd. If you do manage to find some
> time it would be good to see some details on these bugs. It looks at the
> moment as though this is a matter of perception when using the old
> behaviour as a baseline for checking the correctness of 5.5 behaviour.
> 
> Ta,
> Greg
> 
> 
> On 15 September 2016 at 01:27, Erick Erickson <erickerickson@gmail.com>
> wrote:
> 
>> Perhaps https://issues.apache.org/jira/browse/SOLR-8812 and related?
>>
>> Best,
>> Erick
>>
>> On Tue, Sep 13, 2016 at 11:37 PM, Bernd Fehling
>> <bernd.fehling@uni-bielefeld.de> wrote:
>>> Hi Greg,
>>>
>>> after trying several hours with all combinations of parameters and not
>>> getting any useful search result with complex search terms and edismax
>>> I finally copied o.a.s.s.ExtendedDismaxQParser.java from version 4.10.4
>>> to 5.5.3 and did a little modification in o.a.s.u.SolrPluginUtils.java.
>>>
>>> Now it is searching correctly and getting logical and valid search results
>>> with any kind of complex search.
>>> Problem solved.
>>>
>>> But still, the edismax, at least of 5.5.3, has some bugs.
>>> If I get time I will look into this but right now my problem is solved
>>> and the customers and users are happy.
>>>
>>> I hope that this buggy edismax version is not used in solr 6.x, otherwise
>>> you have the same problems there.
>>>
>>> Regards
>>> Bernd
>>>
>>>
>>> On 12.09.2016 at 05:10, Greg Pendlebury wrote:
>>>> Hi Bernd,
>>>>
>>>> "From my point of view the old parsing behavior was correct.
>>>> If searching for a term without operator it is always OR, otherwise
>>>> you can add "+" or "-" to modify that. Now with q.op AND it is
>>>> modified to "+" as a MUST."
>>>>
>>>> It is correct in both cases. q.op dictates (for that query) what default
>>>> operator to use when none is provided, and it takes priority over the
>>>> system-wide 'defaultOperator'. In either case, if you ask it to use
>>>> OR, it uses it; if you ask it to use AND, it uses it. The behaviour from
>>>> 4.10 that was changed (arguably fixed, although I know that is a debatable
>>>> point) was that you asked it to use AND, and it ignored you (irrespective
>>>> of whether you used defaultOperator or q.op). There are a few subtle
>>>> distinctions that are being missed (like the difference between the boolean
>>>> operators and the OCCURS flags that you are talking about), but they are
>>>> not going to change the outcome.
>>>>
>>>> 8812 relates to users who had historically been setting the q.op parameter
>>>> to influence the downstream default selection of 'mm' (if you don't provide
>>>> 'mm' it is set for you based on 'q.op') instead of directly setting the
>>>> 'mm' value themselves. But again in this case, you're setting 'mm' anyway,
>>>> so it shouldn't be relevant.
>>>>
>>>> Ta,
>>>> Greg
>>>>
>>>> On 9 September 2016 at 16:44, Bernd Fehling <bernd.fehling@uni-bielefeld.de>
>>>> wrote:
>>>>
>>>>> Hi Greg,
>>>>>
>>>>> thanks a lot, that's it.
>>>>> After setting q.op to OR it works _nearly_ as before with 4.10.4.
>>>>>
>>>>> But how stupid is this?
>>>>> I have <solrQueryParser defaultOperator="AND"/> in my schema
>>>>> and also had q.op set to AND to make sure my default _is_ AND,
>>>>> meant as the conjunction between terms.
>>>>> But now I have q.op set to OR and defaultOperator in the schema set to AND
>>>>> just to get _nearly_ my old behavior back.
>>>>>
>>>>> schema has following comment:
>>>>> "... The default is OR, which is generally assumed so it is
>>>>> not a good idea to change it globally here.  The "q.op" request
>>>>> parameter takes precedence over this. ..."
>>>>>
>>>>> What I don't understand is why they change some major internals
>>>>> and don't give any notice about how to keep old parsing behavior.
>>>>>
>>>>> From my point of view the old parsing behavior was correct.
>>>>> If searching for a term without operator it is always OR, otherwise
>>>>> you can add "+" or "-" to modify that. Now with q.op AND it is
>>>>> modified to "+" as a MUST.
>>>>>
>>>>> I still get some differences in search results between 4.10.4 and 5.5.3.
>>>>> What other side effects does this change of q.op from AND to OR have in
>>>>> other parts of query handling, parsing and searching?
>>>>>
>>>>> Regards
>>>>> Bernd
>>>>>
>>>>> On 09.09.2016 at 05:43, Greg Pendlebury wrote:
>>>>>> I forgot to mention the tickets:
>>>>>> SOLR-2649 and SOLR-8812
>>>>>>
>>>>>> On 9 September 2016 at 13:38, Greg Pendlebury <greg.pendlebury@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Under 4.10 q.op was ignored by the edismax parser and always forced
>>>>>>> to OR.
>>>>>>> 5.5 is looking at the q.op=AND you requested.
>>>>>>>
>>>>>>> There are also some changes to the default values selected for mm, but I
>>>>>>> doubt those apply here since you are setting it explicitly.
>>>>>>>
>>>>>>> On 8 September 2016 at 00:35, Mikhail Khludnev <mkhl@apache.org> wrote:
>>>>>>>
>>>>>>>> I suppose
>>>>>>>>    <str name="parsedquery_toString">+((text:star text:trek)~2)</str>
>>>>>>>> and
>>>>>>>>   <str name="parsedquery_toString">+(+text:star +text:trek)</str>
>>>>>>>> are equal. mm=2 is equal to +foo +bar
>>>>>>>>
>>>>>>>> On Wed, Sep 7, 2016 at 10:52 AM, Bernd Fehling <
>>>>>>>> bernd.fehling@uni-bielefeld.de> wrote:
>>>>>>>>
>>>>>>>>> Hi list,
>>>>>>>>>
>>>>>>>>> while going from SOLR 4.10.4 to 5.5.3 I noticed a change in query
>>>>>>>>> parsing.
>>>>>>>>> 4.10.4
>>>>>>>>> <str name="rawquerystring">text:star text:trek</str>
>>>>>>>>>   <str name="querystring">text:star text:trek</str>
>>>>>>>>>   <str name="parsedquery">(+((text:star text:trek)~2))/no_coord</str>
>>>>>>>>>   <str name="parsedquery_toString">+((text:star text:trek)~2)</str>
>>>>>>>>>
>>>>>>>>> 5.5.3
>>>>>>>>> <str name="rawquerystring">text:star text:trek</str>
>>>>>>>>>   <str name="querystring">text:star text:trek</str>
>>>>>>>>>   <str name="parsedquery">(+(+text:star +text:trek))/no_coord</str>
>>>>>>>>>   <str name="parsedquery_toString">+(+text:star +text:trek)</str>
>>>>>>>>>
>>>>>>>>> There are very many new features and changes between these two versions.
>>>>>>>>> It looks like a change in query parsing.
>>>>>>>>> Can someone point me to the Solr or Lucene JIRA about the changes?
>>>>>>>>> Or even give a hint how to get my "old" query parsing back?
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Bernd
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sincerely yours
>>>>>>>> Mikhail Khludnev
>>>>>>>>
>>>>>
>>>>
>>
> 
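
Mikhail's point in the quoted thread, that a minimum-should-match of 2 over
two optional clauses selects the same documents as making both clauses
required, can be checked with a toy model. The sketch below is plain Python
over sets of terms, not Lucene. The four tiny documents are made up for
illustration, and it models matching only, not scoring or boosts.

```python
from typing import Dict, Iterable, List, Set

# Made-up documents, each reduced to the set of terms in its "text" field.
DOCS: Dict[str, Set[str]] = {
    "d1": {"star", "trek"},
    "d2": {"star", "wars"},
    "d3": {"trek"},
    "d4": {"star", "trek", "wars"},
}


def match_min_should(terms: Iterable[str], mm: int) -> List[str]:
    """Documents containing at least `mm` of the optional terms,
    a stand-in for (text:a text:b)~mm."""
    terms = list(terms)
    return sorted(
        name
        for name, doc_terms in DOCS.items()
        if sum(1 for t in terms if t in doc_terms) >= mm
    )


def match_all_required(terms: Iterable[str]) -> List[str]:
    """Documents containing every term, a stand-in for +text:a +text:b."""
    terms = list(terms)
    return sorted(
        name
        for name, doc_terms in DOCS.items()
        if all(t in doc_terms for t in terms)
    )


both_optional_mm2 = match_min_should(["star", "trek"], 2)
both_required = match_all_required(["star", "trek"])
either_term = match_min_should(["star", "trek"], 1)
```

In this model both_optional_mm2 and both_required select the same documents,
while either_term (the plain OR reading) selects every document that has at
least one of the two terms.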
