lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1644) Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
Date Thu, 23 Jul 2009 12:03:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734570#action_12734570 ]

Uwe Schindler commented on LUCENE-1644:
---------------------------------------

Looks good, Mike.

I think NumericRangeQuery should also be switched to auto mode, you are right. My perf test
was a little bit unfair, because it used an index of 5 million docs with random integers. The
queries were also random, and the ratio of matching docs to index size was about 1/3 (because
of the random query). So most queries hit about one third of all docs. In this case, the filter
is always faster. For
very small ranges with few terms, it may be really good to use 

It would also be good to set the mode to filter automatically if precisionStep > 6
for longs (valSize=64) or precisionStep > 8 for ints (valSize=32), because in those cases
the number of terms is often too big.
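
The precisionStep rule above could be sketched roughly as follows. This is only an
illustration under stated assumptions: `RewriteModeChooser`, its `Mode` enum, and `chooseMode`
are hypothetical names that exist neither in Lucene nor in the patch; only the thresholds
(6 for valSize=64, 8 for valSize=32) come from this comment.

```java
// Hypothetical sketch; none of these names are part of Lucene or the patch.
final class RewriteModeChooser {
    /** Simplified stand-in for the rewrite modes discussed in this issue. */
    enum Mode { AUTO, FILTER }

    private RewriteModeChooser() {}

    /**
     * Picks a mode from the value size (in bits) and the precisionStep.
     * Forces FILTER when precisionStep > 6 for longs (valSize=64) or
     * precisionStep > 8 for ints (valSize=32); otherwise leaves AUTO.
     */
    static Mode chooseMode(int valSize, int precisionStep) {
        if (valSize != 32 && valSize != 64) {
            throw new IllegalArgumentException("valSize must be 32 or 64");
        }
        int threshold = (valSize == 64) ? 6 : 8;
        return precisionStep > threshold ? Mode.FILTER : Mode.AUTO;
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(64, 4));   // AUTO
        System.out.println(chooseMode(64, 8));   // FILTER
        System.out.println(chooseMode(32, 8));   // AUTO
        System.out.println(chooseMode(32, 16));  // FILTER
    }
}
```

Callers that explicitly set a mode would presumably still override this default; the sketch
only covers the automatic choice.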

One bug in ConstantScoreRangeQuery: you set the default to AUTO, but the method that is
supposed to prevent changing it is wrong:
{code:java}
   /** Changes of mode are not supported by this class (fixed to constant score rewrite mode) */
-  public void setConstantScoreRewrite(boolean constantScoreRewrite) {
-    if (!constantScoreRewrite)
-      throw new UnsupportedOperationException("Use TermRangeQuery instead to enable boolean query rewrite.");
+  public void setRewriteMethod(RewriteMethod method) {
+    if (method != CONSTANT_SCORE_FILTER_REWRITE) {
+      throw new UnsupportedOperationException("Use TermRangeQuery instead to change the rewrite method.");
+    }
   }
{code}
I would change this to simply always throw UOE on any change in ConstantScoreRangeQuery.
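
A minimal sketch of that suggestion, using stand-in types rather than the real Lucene classes.
Everything here is an assumption made for illustration: `UoeSetterSketch`, `BaseQuery`,
`FixedModeQuery`, and the three-value `RewriteMethod` enum are hypothetical placeholders, and
only the exception message is copied from the patch.

```java
// Stand-in types for illustration only; these are not the real Lucene classes.
final class UoeSetterSketch {
    /** Simplified placeholder for the rewrite methods discussed in this issue. */
    enum RewriteMethod { AUTO, FILTER, BOOLEAN }

    /** Placeholder for a query class whose rewrite method can be changed. */
    static class BaseQuery {
        protected RewriteMethod rewriteMethod = RewriteMethod.AUTO;

        public void setRewriteMethod(RewriteMethod method) {
            this.rewriteMethod = method;
        }

        public RewriteMethod getRewriteMethod() {
            return rewriteMethod;
        }
    }

    /** Placeholder for ConstantScoreRangeQuery: every attempt to change the mode is rejected. */
    static class FixedModeQuery extends BaseQuery {
        @Override
        public void setRewriteMethod(RewriteMethod method) {
            // No comparison against a particular constant: any call throws,
            // so the setter cannot disagree with whatever the default is.
            throw new UnsupportedOperationException(
                "Use TermRangeQuery instead to change the rewrite method.");
        }
    }

    public static void main(String[] args) {
        FixedModeQuery q = new FixedModeQuery();
        try {
            q.setRewriteMethod(RewriteMethod.AUTO);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("mode is still " + q.getRewriteMethod());
    }
}
```

Throwing unconditionally means the setter no longer has to be kept in sync with the default,
which is the mismatch pointed out above.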

> Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1644
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1644
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1644.patch, LUCENE-1644.patch, LUCENE-1644.patch, LUCENE-1644.patch, LUCENE-1644.patch
>
>
> When MultiTermQuery is used (via one of its subclasses, eg
> WildcardQuery, PrefixQuery, FuzzyQuery, etc.), you can ask it to use
> "constant score mode", which pre-builds a filter and then wraps that
> filter as a ConstantScoreQuery.
> If you don't set that, it instead builds a [potentially massive]
> BooleanQuery with one SHOULD clause per term.
> There are some limitations of this approach:
>   * The scores returned by the BooleanQuery are often quite
>     meaningless to the app, so, one should be able to use a
>     BooleanQuery yet get constant scores back.  (Though I vaguely
>     remember at least one example someone raised where the scores were
>     useful...).
>   * The resulting BooleanQuery can easily have too many clauses,
>     throwing an extremely confusing exception to newish users.
>   * It'd be better to have the freedom to pick "build filter up front"
>     vs "build massive BooleanQuery", when constant scoring is enabled,
>     because they have different performance tradeoffs.
>   * In constant score mode, an OpenBitSet is always used, yet for
>     sparse bit sets this does not give good performance.
> I think we could address these issues by giving BooleanQuery a
> constant score mode, then empower MultiTermQuery (when in constant
> score mode) to pick & choose whether to use BooleanQuery vs up-front
> filter, and finally empower MultiTermQuery to pick the best (sparse vs
> dense) bit set impl.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

