lucene-java-user mailing list archives

From Benson Margulies <bimargul...@gmail.com>
Subject Re: DisjunctionMaxQuery and scoring
Date Fri, 20 Apr 2012 00:42:23 GMT
FWIW, there seems to be an explain bug in 2.9.1 that is fixed in
3.6.0, so I'm no longer confused about the actual behavior.


On Thu, Apr 19, 2012 at 8:32 PM, David Murgatroyd <dmurga@gmail.com> wrote:
> [apologies for the earlier errant send]
>
> I think
>  BooleanQuery bq = new BooleanQuery(false);
> doesn't quite accomplish the desired "name IN (dick, rich)" scoring
> behavior. This is because (name:dick | name:rich) with coord=false would
> score the 'document' "Dick Rich" higher than "Rich" because the former has
> two term matches and the latter only one. In contrast, I think the desire
> is that one and only one of the terms in the document match those in the
> BooleanQuery so that "Rich" would score higher than "Dick Rich", given
> document length normalization. It's almost like a desire for
>  BooleanQuery bq = new BooleanQuery(false);
>  bq.set*Maximum*NumberShouldMatch(1);
>
> Is there a good way to accomplish this?
>
> On Thu, Apr 19, 2012 at 7:37 PM, Robert Muir <rcmuir@gmail.com> wrote:
>
>> On Thu, Apr 19, 2012 at 6:36 PM, Benson Margulies <bimargulies@gmail.com>
>> wrote:
>> > I see why I'm so confused, but I think I need to construct a simpler
>> > test case.
>> >
>> > My top-level BooleanQuery, which has disableCoord=false, has 22
>> > clauses. All but three are ordinary SHOULD TermQueries. The remainder
>> > are a spanNear, a nested BooleanQuery, and an empty PhraseQuery
>> > (that's a bug).
>> >
>> > However, at the end of the explain trace, I see:
>> >
>> > 0.45 = coord(9/20)
>> >
>> > I think that my nested Boolean, for which I've been
>> > flipping coord on and off to see what happens, is somehow not
>> > participating at all. So switching its coord on and off has no
>> > effect.
>> >
>> > Why 20? Why not 22? Is this just an explain quirk?
>>
>> I am not sure (I'm also not sure I totally understand your example), but
>> it could be as simple as the fact that you have 2 prohibited
>> (MUST_NOT) clauses. These don't count towards coord().
>>
>> I think it's hard to tell from your description (just because it doesn't
>> have all the details). An explain output or a test case or something like
>> that might be more efficient if it's still not making sense...
>>
>> --
>> lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
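The coord(9/20) line in the quoted explain output fits Robert's suggestion that prohibited clauses are left out of the coord denominator. Below is a minimal sketch of that arithmetic in plain Java. It is not Lucene code: the CoordSketch class, its Occur enum, and the split of the 22 clauses into 20 SHOULD and 2 MUST_NOT are assumptions made for illustration, since the thread does not give the actual breakdown.

```java
import java.util.ArrayList;
import java.util.List;

class CoordSketch {
    // Hypothetical stand-in for a boolean clause's occurrence type.
    enum Occur { SHOULD, MUST, MUST_NOT }

    // Denominator of coord: count every clause except the prohibited ones.
    static int maxOverlap(List<Occur> clauses) {
        int count = 0;
        for (Occur occur : clauses) {
            if (occur != Occur.MUST_NOT) {
                count++;
            }
        }
        return count;
    }

    // Fraction of the countable clauses that matched the document.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Assumed breakdown: 20 optional clauses plus 2 prohibited ones = 22.
    static List<Occur> exampleClauses() {
        List<Occur> clauses = new ArrayList<Occur>();
        for (int i = 0; i < 20; i++) {
            clauses.add(Occur.SHOULD);
        }
        clauses.add(Occur.MUST_NOT);
        clauses.add(Occur.MUST_NOT);
        return clauses;
    }

    public static void main(String[] args) {
        List<Occur> clauses = exampleClauses();
        int max = maxOverlap(clauses);
        System.out.println(clauses.size() + " clauses, maxOverlap = " + max);
        System.out.println("coord(9/" + max + ") = " + coord(9, max));
    }
}
```

Under these assumptions, a 22-clause query has a denominator of 20, which would account for the "Why 20? Why not 22?" question without any explain quirk.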

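David's point about "Dick Rich" versus "Rich" can be illustrated with a toy model that compares summing the per-term scores (what a BooleanQuery of SHOULD clauses does) with taking the maximum (what a DisjunctionMaxQuery does when its tie-breaker is 0). This is a rough sketch, not Lucene's scoring. It assumes each matching term scores only a length norm of 1/sqrt(number of terms) and ignores tf, idf, query normalization, and the lossy norm encoding. The SumVsMaxSketch class and its methods are hypothetical names.

```java
class SumVsMaxSketch {
    // Toy per-term score: shorter fields score higher.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    static boolean contains(String[] terms, String term) {
        for (String t : terms) {
            if (t.equals(term)) {
                return true;
            }
        }
        return false;
    }

    // BooleanQuery-like combination: add up the score of every matching term.
    static double sumScore(String[] doc, String[] query) {
        double sum = 0.0;
        for (String q : query) {
            if (contains(doc, q)) {
                sum += lengthNorm(doc.length);
            }
        }
        return sum;
    }

    // Dismax-like combination: best term score plus tieBreaker times the rest.
    static double disMaxScore(String[] doc, String[] query, double tieBreaker) {
        double sum = 0.0;
        double max = 0.0;
        for (String q : query) {
            if (contains(doc, q)) {
                double s = lengthNorm(doc.length);
                sum += s;
                max = Math.max(max, s);
            }
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        String[] query = {"dick", "rich"};
        String[] dickRich = {"dick", "rich"};
        String[] rich = {"rich"};
        System.out.println("sum: 'dick rich' = " + sumScore(dickRich, query)
                + ", 'rich' = " + sumScore(rich, query));
        System.out.println("max: 'dick rich' = " + disMaxScore(dickRich, query, 0.0)
                + ", 'rich' = " + disMaxScore(rich, query, 0.0));
    }
}
```

In this toy model the summed score ranks "Dick Rich" above "Rich", while the max-based score ranks "Rich" first. That is the ordering David describes as the desired one, though real scores depend on the factors left out here.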