lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: how to find out each score contribution from booleanquery components
Date Fri, 28 Jun 2019 13:37:42 GMT
One thing I noticed is that the score is the same for the first 1800 
results. That is not expected, right?

Best regards


On 6/27/19 3:07 PM, baris.kazar@oracle.com wrote:
> the index has "united states" and
>
> still investigating why MAINS does not return MAIN but MAINK etc. does 
> return MAIN first.
>
> Best regards
>
>
>
> On 6/27/19 1:24 PM, baris.kazar@oracle.com wrote:
>> Hi,-
>>
>>  I will check the explainOther parameter and see how it is used.
>>
>> I am not sure the problem lies with countryDFLT, since other search 
>> cases with other consonants work fine, but
>>
>> I will check your suggestions, thanks.
>>
>> Btw, why is the rest of the query not showing up in the explain plan?
>>
>> Best regards
>>
>>
>>
>> On 6/27/19 11:16 AM, Erick Erickson wrote:
>>> BTW, if you have the ID of the doc you _think_ should be returned
>>> you can see why it wasn’t by using the explainOther parameter.
>>>
>>>> On Jun 27, 2019, at 8:11 AM, András Péteri 
>>>> <apeteri@b2international.com> wrote:
>>>>
>>>> Hi Baris,
>>>>
>>>> Explanation's output is hierarchical, and the leading "0.0" values you
>>>> are seeing are the individual contributions of each boolean clause or
>>>> any other nested query.
>>>>
>>>> Going from bottom to top:
>>>>
>>>> - Term query on countryDFLT = 'states', but no term matched this value
>>>>   --> score is 0.0 for the term query "countryDFLT:states"
>>>> - Term query is wrapped into a 'must' clause, but the term query scored
>>>>   0.0 --> score is 0.0 for the 'must' boolean clause
>>>>   "+countryDFLT:states"
>>>> - Term query on countryDFLT = 'united', but no term matched this value
>>>>   --> score is 0.0 for the term query "countryDFLT:united"
>>>> - Term query is wrapped into a 'must' clause, but the term query scored
>>>>   0.0 --> score is 0.0 for the 'must' boolean clause
>>>>   "+countryDFLT:united"
>>>> - (The two 'should' clauses with boosts have been optimized out; if a
>>>>   single 'must' clause is present, they do not need to match at all,
>>>>   unless you have minShouldMatch set on the boolean query)
>>>> - Boolean query with two 'must' clauses did not match --> score is 0.0
>>>>   for the boolean query "+countryDFLT:states +countryDFLT:united
>>>>   (countryDFLT:uniten)^0.42000002 (countryDFLT:statesir)^0.56"
>>>>
>>>> ...and so on.
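
The bottom-to-top reading above amounts to a depth-first walk over the explanation tree. Below is a minimal sketch of that walk in plain Java. The Node class is a hypothetical stand-in so the example runs without Lucene on the classpath. With Lucene itself, the corresponding accessors on Explanation are getValue(), getDescription() and getDetails(). The sample tree is an abbreviated copy of the country clause from the explain output quoted further down in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplainWalk {

    /** Hypothetical stand-in for Lucene's Explanation: a value, a description, child details. */
    public static final class Node {
        final double value;
        final String description;
        final List<Node> details;

        public Node(double value, String description, Node... details) {
            this.value = value;
            this.description = description;
            this.details = List.of(details);
        }
    }

    /** Depth-first walk: one line per node, indented two spaces per nesting level. */
    public static List<String> flatten(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, 0, out);
        return out;
    }

    private static void walk(Node node, int depth, List<String> out) {
        out.add("  ".repeat(depth) + node.value + " = " + node.description);
        for (Node child : node.details) {
            walk(child, depth + 1, out);
        }
    }

    /** Abbreviated country clause from the thread (the boosted 'should' clauses are left out). */
    public static Node sample() {
        return new Node(0.0, "no match on required clause (+countryDFLT:united +countryDFLT:states)",
            new Node(0.0, "no match on required clause (countryDFLT:united)",
                new Node(0.0, "no matching term")),
            new Node(0.0, "no match on required clause (countryDFLT:states)",
                new Node(0.0, "no matching term")));
    }

    public static void main(String[] args) {
        // Each printed line is one clause's contribution; children are nested under parents.
        flatten(sample()).forEach(System.out::println);
    }
}
```

Every line produced by the walk carries the contribution of one node, so a non-zero child under a zero parent would point at the clause that vetoed the match.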
>>>>
>>>> So Atri is correct, the index you are running this query on does not
>>>> seem to have a document where either 'united' or 'states' has been
>>>> indexed for field 'countryDFLT' (let alone both). Do the individual
>>>> building blocks, eg. "countryDFLT:united" return any results?
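
To make the must/should behaviour described above concrete, here is a small model of the matching rule in plain Java. It is only a sketch of the semantics as described in this thread, not Lucene code: the Clause type, the matches method and the minShouldMatch argument are hypothetical names, and the 'matched' flags for the two boosted should clauses are assumed (the explain output does not show them).

```java
import java.util.List;

public class BooleanMatchModel {

    public enum Occur { MUST, SHOULD, MUST_NOT }

    /** One clause plus whether its sub-query matched the document (hypothetical helper type). */
    public static final class Clause {
        final Occur occur;
        final boolean matched;

        public Clause(Occur occur, boolean matched) {
            this.occur = occur;
            this.matched = matched;
        }
    }

    /** Returns true if a document with these per-clause results matches the boolean query. */
    public static boolean matches(List<Clause> clauses, int minShouldMatch) {
        int mustCount = 0;
        int shouldMatched = 0;
        for (Clause c : clauses) {
            if (c.occur == Occur.MUST) {
                if (!c.matched) return false;   // one failed 'must' rejects the document
                mustCount++;
            } else if (c.occur == Occur.MUST_NOT) {
                if (c.matched) return false;    // a matching 'must not' rejects the document
            } else if (c.matched) {
                shouldMatched++;
            }
        }
        // Without any 'must' clause, at least one 'should' clause has to match.
        int required = Math.max(minShouldMatch, mustCount == 0 ? 1 : 0);
        return shouldMatched >= required;
    }

    public static void main(String[] args) {
        List<Clause> country = List.of(
            new Clause(Occur.MUST, false),    // +countryDFLT:united (no matching term)
            new Clause(Occur.MUST, false),    // +countryDFLT:states (no matching term)
            new Clause(Occur.SHOULD, false),  // (countryDFLT:uniten)^0.42000002, assumed unmatched
            new Clause(Occur.SHOULD, false)); // (countryDFLT:statesir)^0.56, assumed unmatched
        System.out.println(matches(country, 0)); // prints false
    }
}
```

In this model the should clauses cannot rescue the document once a must clause fails, which is consistent with the 0.0 scores in the explain output.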
>>>>
>>>> On Thu, Jun 27, 2019 at 4:33 PM <baris.kazar@oracle.com> wrote:
>>>>> Hi,-
>>>>>
>>>>> Any ideas on what might be happening?
>>>>>
>>>>> Maybe I am missing something: is there an API to look at each clause's
>>>>> contribution to the total score of the BooleanQuery?
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>> On 6/26/19 2:29 PM, Baris Kazar wrote:
>>>>>> All must queries (and the rest, of course) work fine when I search 
>>>>>> for street name MAINK, MAINL, MAINQ, ..., MAINT, etc., that is, with
>>>>>> any consonant except S, while all other fields are the same for all
>>>>>> queries (NASUA, HILLSBOROUGH, NEW HAMPSHIRE, UNITED STATES).
>>>>>>
>>>>>> "Working" here means: the top result is correct, with MAIN.
>>>>>>
>>>>>> But with street name MAINS, and MAINO (i.e. with vowels), I can't get 
>>>>>> MAIN as the top result.
>>>>>>
>>>>>> I have two theories:
>>>>>>
>>>>>> Either my query plan is too complex to handle MAINS (as there are
>>>>>> some other MAINS streets in the index in other cities and states),
>>>>>> so maybe I need to run each component of the BooleanQuery separately
>>>>>> and then manually post-process them.
>>>>>>
>>>>>> Or my query plan is still not good enough to catch MAIN when I 
>>>>>> search with street MAINS, city NASUA, municipality HILLSBOROUGH,
>>>>>> state NEW HAMPSHIRE, country UNITED STATES,
>>>>>> where the first two are fuzzy queries as they have errors in them 
>>>>>> and the rest are phrase queries as they are correct.
>>>>>>
>>>>>> that is why i want to see each score from each of the component 
>>>>>> of the booleanquery.
>>>>>> so far i checked Lucene but could not find a way to see each 
>>>>>> contributing score to the total score for each result hit document.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: atri@apache.org
>>>>>> To: java-user@lucene.apache.org
>>>>>> Sent: Wednesday, June 26, 2019 1:09:36 PM GMT -05:00 US/Canada 
>>>>>> Eastern
>>>>>> Subject: Re: how to find out each score contribution from 
>>>>>> booleanquery components
>>>>>>
>>>>>> It seems evident that multiple of your Must clauses are not 
>>>>>> matching any
>>>>>> document, hence no results are being returned?
>>>>>>
>>>>>> On Wed, 26 Jun 2019 at 6:51 PM, <baris.kazar@oracle.com> wrote:
>>>>>>
>>>>>>> Sure, here is the query plan: (i cant run explain plan as it does not
>>>>>>> give me anything)
>>>>>>>
>>>>>>> [+streetDFLT:maink~2 (streetDFLT:"maine")^0.35, +cityDFLT:nasua~2
>>>>>>> (cityDFLT:"nasuh")^0.35, ++regionDFLT:"new-hampshire"
>>>>>>> (regionDFLT:"new-hammpshire")^0.98, ++countryDFLT:"united"
>>>>>>> (countryDFLT:"uniten")^0.42000002 +countryDFLT:"states"
>>>>>>> (countryDFLT:"statesir")^0.56]
>>>>>>>
>>>>>>>
>>>>>>> explain plan gives:
>>>>>>>
>>>>>>> Explanation expl = is.explain(booleanQuery.build(), 10);
>>>>>>> System.out.println(expl);
>>>>>>>
>>>>>>> This prints:
>>>>>>>
>>>>>>> 0.0 = Failure to meet condition(s) of required/prohibited clause(s)
>>>>>>>     0.0 = no match on required clause (+regionDFLT:new-hampshire (regionDFLT:new-hammpshire)^0.98)
>>>>>>>       0.0 = Failure to meet condition(s) of required/prohibited clause(s)
>>>>>>>         0.0 = no match on required clause (regionDFLT:new-hampshire)
>>>>>>>           0.0 = no matching term
>>>>>>>     0.0 = no match on required clause (+countryDFLT:united (countryDFLT:uniten)^0.42000002 +countryDFLT:states (countryDFLT:statesir)^0.56)
>>>>>>>       0.0 = Failure to meet condition(s) of required/prohibited clause(s)
>>>>>>>         0.0 = no match on required clause (countryDFLT:united)
>>>>>>>           0.0 = no matching term
>>>>>>>         0.0 = no match on required clause (countryDFLT:states)
>>>>>>>           0.0 = no matching term
>>>>>>>
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>> On 6/26/19 12:48 PM, Atri Sharma wrote:
>>>>>>>> It depends a lot on the actual clauses (whether they are SHOULD, MUST,
>>>>>>>> MUST_NOT), each query’s type (phrase, term etc).
>>>>>>>>
>>>>>>>> Could you post your query and the explain plan of IndexSearcher post the
>>>>>>>> rewrite?
>>>>>>>>
>>>>>>>> On Wed, 26 Jun 2019 at 6:46 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,-
>>>>>>>>>
>>>>>>>>>     how can one find out each score contribution from booleanquery
>>>>>>>>> components?
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------

>>>>>>>>>
>>>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Atri
>>>>>>>> Apache Concerted
>>>>>>>>
>>>>>>> -- 
>>>>>> Regards,
>>>>>>
>>>>>> Atri
>>>>>> Apache Concerted
>>>>>>
>>>> -- 
>>>> András
>>>>