Reply-To: java-dev@lucene.apache.org
Message-ID: <1076243938.1248114494790.JavaMail.jira@brutus>
Date: Mon, 20 Jul 2009 11:28:14 -0700 (PDT)
From: "Michael McCandless (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-1644) Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
In-Reply-To: <1799751446.1242738285589.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1644:
---------------------------------------

    Attachment: LUCENE-1644.patch

I'd really like to get this one in for 2.9. The API is still malleable
because the constant-score rewrite mode of MultiTermQuery hasn't been
released yet.

The attached patch adds a MultiTermQuery.RewriteMethod parameter, with
current values FILTER, SCORING_BOOLEAN_QUERY and CONSTANT_BOOLEAN_QUERY.
This replaces the setConstantScoreRewrite(boolean) method.

I also added javadocs noting that the two remaining multi-term queries
that have not yet switched over to the FILTER rewrite method
(WildcardQuery and PrefixQuery) will switch over in 3.0. (LUCENE-1557 is
already open to make that switch.)

> Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1644
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1644
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1644.patch
>
>
> When MultiTermQuery is used (via one of its subclasses, eg
> WildcardQuery, PrefixQuery, FuzzyQuery, etc.), you can ask it to use
> "constant score mode", which pre-builds a filter and then wraps that
> filter as a ConstantScoreQuery.
> If you don't set that, it instead builds a [potentially massive]
> BooleanQuery with one SHOULD clause per term.
> There are some limitations of this approach:
>   * The scores returned by the BooleanQuery are often quite
>     meaningless to the app, so, one should be able to use a
>     BooleanQuery yet get constant scores back. (Though I vaguely
>     remember at least one example someone raised where the scores were
>     useful...).
>   * The resulting BooleanQuery can easily have too many clauses,
>     throwing an extremely confusing exception to newish users.
>   * It'd be better to have the freedom to pick "build filter up front"
>     vs "build massive BooleanQuery", when constant scoring is enabled,
>     because they have different performance tradeoffs.
>   * In constant score mode, an OpenBitSet is always used, yet for
>     sparse bit sets this does not give good performance.
> I think we could address these issues by giving BooleanQuery a
> constant score mode, then empower MultiTermQuery (when in constant
> score mode) to pick & choose whether to use BooleanQuery vs up-front
> filter, and finally empower MultiTermQuery to pick the best (sparse vs
> dense) bit set impl.
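
To make the "pick & choose" idea in the description concrete, here is a minimal, self-contained sketch. It is a toy model, not the code in LUCENE-1644.patch: the enum constants reuse the three names listed in the comment above (FILTER, SCORING_BOOLEAN_QUERY, CONSTANT_BOOLEAN_QUERY), but the RewriteChoice class, the choose method, and the cutoff rule (compare the number of matching terms against a clause limit) are assumptions made purely for illustration. The value 1024 is used as the limit only because it is BooleanQuery's default maximum clause count.

```java
// Toy model only: nothing here is a Lucene class. The enum constants borrow
// the names from the patch description; everything else is hypothetical.
final class RewriteChoice {

    /** The three rewrite modes named in the patch description. */
    enum RewriteMethod { FILTER, SCORING_BOOLEAN_QUERY, CONSTANT_BOOLEAN_QUERY }

    /**
     * Hypothetical heuristic for choosing a rewrite mode.
     *
     * @param constantScore whether the caller asked for constant scores
     * @param termCount     number of terms the multi-term query expands to
     * @param maxClauses    assumed cutoff above which a BooleanQuery is too big
     */
    static RewriteMethod choose(boolean constantScore, int termCount, int maxClauses) {
        if (termCount < 0 || maxClauses < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        if (!constantScore) {
            // Scoring mode always builds one SHOULD clause per term. This sketch
            // does not model the too-many-clauses failure that can follow.
            return RewriteMethod.SCORING_BOOLEAN_QUERY;
        }
        // Constant-score mode: a small expansion fits in a BooleanQuery,
        // a large one falls back to building the filter up front.
        return termCount <= maxClauses
                ? RewriteMethod.CONSTANT_BOOLEAN_QUERY
                : RewriteMethod.FILTER;
    }

    public static void main(String[] args) {
        int maxClauses = 1024; // assumed cutoff, see the note above
        System.out.println(choose(true, 12, maxClauses));     // CONSTANT_BOOLEAN_QUERY
        System.out.println(choose(true, 50_000, maxClauses)); // FILTER
        System.out.println(choose(false, 12, maxClauses));    // SCORING_BOOLEAN_QUERY
    }
}
```

A real implementation would likely also weigh index size and bit set density (the sparse vs dense point in the last bullet), which this sketch leaves out.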