lucene-dev mailing list archives

From Mark Miller
Subject Re: Lucene's default settings & back compatibility
Date Tue, 19 May 2009 12:28:16 GMT
Michael McCandless wrote:
> On Mon, May 18, 2009 at 11:31 PM, Robert Muir wrote:
>> I am curious about this, do you think it's a better default because it avoids
>> the max boolean clauses problem? or because for a lot of these scoring
>> doesn't make much sense anyway?
> I think you're referring to constant score mode default, for
> MultiTermQuery & QueryParser, right?
>> I ran tests on a pretty big index, you pay a price for the constant
>> score/filter method. It's slower for the common case searches, it only starts
>> to win for queries that return > 10% or so of the index, but it's significantly
>> slower for narrow queries...
>> I'm just trying to imagine a case where queries that return > 10% or so of
>> the index are actually the common/default...?
> Excellent points!  And this also makes clear why healthy discussion on
> each default is important, as well as how good it'd be to have
> Settings online so that we are free to even have such discussions
> (vs being bound by back-compat which prevents any improvements
> to the defaults).
> I was actually referring to the fact that scores for MultiTermQuery
> rewritten to BooleanQuery are often meaningless to the app (I
> think?).  But you're right the performance cost of the "make a filter
> up front" approach is too high for smallish queries.
> Thinking more on this... I'd love to have a constant-score mode, but
> implemented as a BooleanQuery, meaning the scores would be the same
> (constant) regardless of whether under-the-hood the query was
> rewritten to BooleanQuery vs pre-compiled up front into a BitSet.
> This would then decouple scoring from rewrite method, which in turn
> would give us the freedom to pick and choose the fastest impl based on
> particulars of the query.
> So if we had such a ConstantScoreBooleanQuery, and we fixed
> MultiTermQuery to conditionally use that, then I think we'd want
> MultiTermQuery to do constant scoring by default.  (And, it'd then be
> free to pick whether "create filter up front" or "use
> ConstantScoreBooleanQuery" was most performant, query by query).
> Mike
+1. ConstantScoreQuery is only a performance win when there are lots of 
matches (it seems), but the lack of TooManyClauses exceptions is also a 
big win. I want the best of both worlds :)
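
To make the idea concrete, here is a rough toy sketch of picking the rewrite strategy per query while keeping scores constant either way. It is not Lucene code: ConstantScoreRewriteSketch, Strategy, Rewritten, rewrite() and BROAD_FRACTION are hypothetical names invented for illustration. The 10% cutoff is just the ballpark figure from Robert's tests, and 1024 mirrors BooleanQuery's default max clause count.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model only: none of these types are Lucene classes.
class ConstantScoreRewriteSketch {

    // The two rewrite paths discussed in the thread.
    enum Strategy { BOOLEAN_CLAUSES, FILTER_BITSET }

    // Result of a rewrite: which path was chosen and which docs matched.
    static final class Rewritten {
        final Strategy strategy;
        final BitSet matches;

        Rewritten(Strategy strategy, BitSet matches) {
            this.strategy = strategy;
            this.matches = matches;
        }

        // Scoring is decoupled from the strategy: every match scores the same.
        float score(int doc) {
            return matches.get(doc) ? 1.0f : 0.0f;
        }
    }

    // Assumed cutoffs (hypothetical tuning knobs, not measured values).
    static final int MAX_CLAUSES = 1024;       // BooleanQuery's default clause limit
    static final double BROAD_FRACTION = 0.10; // "> 10% or so of the index"

    // postingsPerTerm holds the matching doc ids for each expanded term.
    static Rewritten rewrite(List<int[]> postingsPerTerm, int maxDoc) {
        long totalPostings = 0;
        for (int[] postings : postingsPerTerm) {
            totalPostings += postings.length;
        }

        // Use the up-front filter when a boolean rewrite would hit the clause
        // limit, or when the query is broad enough for the filter to pay off.
        boolean tooManyTerms = postingsPerTerm.size() > MAX_CLAUSES;
        boolean broad = totalPostings > BROAD_FRACTION * maxDoc;
        Strategy strategy = (tooManyTerms || broad)
                ? Strategy.FILTER_BITSET
                : Strategy.BOOLEAN_CLAUSES;

        // Toy shortcut: both paths collect matches into a BitSet here. A real
        // boolean rewrite would iterate its clauses lazily instead.
        BitSet matches = new BitSet(maxDoc);
        for (int[] postings : postingsPerTerm) {
            for (int doc : postings) {
                matches.set(doc);
            }
        }
        return new Rewritten(strategy, matches);
    }

    public static void main(String[] args) {
        List<int[]> narrow = Arrays.asList(new int[] {1, 5}, new int[] {7});
        System.out.println(rewrite(narrow, 1000).strategy);
    }
}
```

The caller sees the same constant scores whichever path runs, so the choice can be made purely on cost and never throws on too many clauses.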

- Mark
