lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: [jira] Issue Comment Edited: (LUCENE-1644) Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
Date Tue, 21 Jul 2009 16:50:59 GMT
It would be great to get some repeatable tests for this type of thing into
the benchmark contrib. I had started work on that sometime back, but I don't
think I have it around anymore.

On Tue, Jul 21, 2009 at 12:14 PM, Robert Muir (JIRA) <jira@apache.org> wrote:

>
>    [
> https://issues.apache.org/jira/browse/LUCENE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733676#action_12733676]
>
> Robert Muir edited comment on LUCENE-1644 at 7/21/09 9:14 AM:
> --------------------------------------------------------------
>
> Mike, I am afraid that might hurt some people's performance.
> I'm a bit concerned my index/queries are maybe abnormal and don't want to
> break the general case.
>
> I'm not too familiar with trie [or what it would do with a really general
> range query], but a simpler example: with no stopwords, a wildcard query of
> "th?" (matching "the")
> may match only one term, but that term is very common (a dense bitset)
> and probably "hot".
>
> In this case the filter would be better, even though it's one term.
>
>      was (Author: rcmuir):
>    Mike, I am afraid that might hurt some people's performance.
> I'm a bit concerned my index/queries are maybe abnormal and don't want to
> break the general case.
>
> I'm not too familiar with trie [what it would do with a really general
> range query], but a simpler example would be no stopwords, wildcard query of
> th*
> maybe it only matches one term, but that term is very common / dense bitset
> and probably "hot".
>
> In this case the filter would be better, even though it's 1 term.
>
> > Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
> >
> -----------------------------------------------------------------------------------
> >
> >                 Key: LUCENE-1644
> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1644
> >             Project: Lucene - Java
> >          Issue Type: Improvement
> >          Components: Search
> >            Reporter: Michael McCandless
> >            Assignee: Michael McCandless
> >            Priority: Minor
> >             Fix For: 2.9
> >
> >         Attachments: LUCENE-1644.patch
> >
> >
> > When MultiTermQuery is used (via one of its subclasses, e.g.,
> > WildcardQuery, PrefixQuery, FuzzyQuery, etc.), you can ask it to use
> > "constant score mode", which pre-builds a filter and then wraps that
> > filter as a ConstantScoreQuery.
> > If you don't set that, it instead builds a [potentially massive]
> > BooleanQuery with one SHOULD clause per term.
> > There are some limitations of this approach:
> >   * The scores returned by the BooleanQuery are often quite
> >     meaningless to the app, so, one should be able to use a
> >     BooleanQuery yet get constant scores back.  (Though I vaguely
> >     remember at least one example someone raised where the scores were
> >     useful...).
> >   * The resulting BooleanQuery can easily have too many clauses,
> >     throwing an extremely confusing exception to newish users.
> >   * It'd be better to have the freedom to pick "build filter up front"
> >     vs "build massive BooleanQuery", when constant scoring is enabled,
> >     because they have different performance tradeoffs.
> >   * In constant score mode, an OpenBitSet is always used, yet for
> >     sparse bit sets this does not give good performance.
> > I think we could address these issues by giving BooleanQuery a
> > constant score mode, then empower MultiTermQuery (when in constant
> > score mode) to pick & choose whether to use BooleanQuery vs up-front
> > filter, and finally empower MultiTermQuery to pick the best (sparse vs
> > dense) bit set impl.
>

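The trade-off in the quoted issue and in Robert's "th?" example can be sketched as a small decision rule. This is only an illustration under stated assumptions, not Lucene code: the class name, the Strategy enum, the choose method, and both thresholds are hypothetical. It captures two points from the thread: many matching terms favor the up-front filter, and so does a single term that is dense enough.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not part of Lucene, and not the LUCENE-1644 patch.
class RewriteChoiceSketch {

    enum Strategy { BOOLEAN_QUERY, FILTER }

    // Assumed thresholds, chosen only for illustration.
    static final int MAX_TERMS = 16;
    static final double MAX_DOC_FRACTION = 0.1;

    // docFreqs holds the document frequency of each term the query expands to;
    // maxDoc is the number of documents in the index.
    static Strategy choose(List<Integer> docFreqs, int maxDoc) {
        // Too many terms: a BooleanQuery would carry too many clauses.
        if (docFreqs.size() > MAX_TERMS) {
            return Strategy.FILTER;
        }
        long totalDocs = 0;
        for (int df : docFreqs) {
            totalDocs += df;
        }
        // Few terms, but they cover a large share of the index (the "th?" case).
        if (totalDocs > MAX_DOC_FRACTION * maxDoc) {
            return Strategy.FILTER;
        }
        return Strategy.BOOLEAN_QUERY;
    }

    public static void main(String[] args) {
        int maxDoc = 1000000;
        // One very common term, as in the "th?" matching "the" example.
        System.out.println(choose(Arrays.asList(900000), maxDoc)); // prints FILTER
        // A handful of rare terms.
        System.out.println(choose(Arrays.asList(12, 40, 7), maxDoc)); // prints BOOLEAN_QUERY
    }
}
```

Whether a rule like this helps in practice depends on the index and query mix, which is why repeatable benchmarks would be useful here.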

-- 
- Mark

http://www.lucidimagination.com
