Message-ID: <335317178.1248295034892.JavaMail.jira@brutus>
Date: Wed, 22 Jul 2009 13:37:14 -0700 (PDT)
From: "Uwe Schindler (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1644) Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
In-Reply-To: <1799751446.1242738285589.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

[
https://issues.apache.org/jira/browse/LUCENE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734301#action_12734301 ]

Uwe Schindler commented on LUCENE-1644:
---------------------------------------

Hi Mike,

patch looks good. I was a little bit confused about the high term number cutoff, but it is using Math.max to limit it to the current BooleanQuery max clause count.

Some small things:

bq. OK I made getEnum protected again.

...but only in MultiTermQuery itself. Everywhere else (even in the backwards compatibility override test [JustCompile]) it is public.

Also, the current singletons are not really singletons, because queries that are deserialized will contain instances that are not the "singleton" instances :) and will therefore fail to produce correct hashCode/equals tests. The underlying problem: the singletons are serializable but do not return themselves in readResolve() (it is not implemented). All singletons that are serializable must implement readResolve and return the singleton instance (see the Parameter base class or the parser singletons in FieldCache).

> Enable MultiTermQuery's constant score mode to also use BooleanQuery under the hood
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1644
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1644
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1644.patch, LUCENE-1644.patch, LUCENE-1644.patch
>
>
> When MultiTermQuery is used (via one of its subclasses, eg
> WildcardQuery, PrefixQuery, FuzzyQuery, etc.), you can ask it to use
> "constant score mode", which pre-builds a filter and then wraps that
> filter as a ConstantScoreQuery.
> If you don't set that, it instead builds a [potentially massive]
> BooleanQuery with one SHOULD clause per term.
> There are some limitations of this approach:
>   * The scores returned by the BooleanQuery are often quite
>     meaningless to the app, so, one should be able to use a
>     BooleanQuery yet get constant scores back.  (Though I vaguely
>     remember at least one example someone raised where the scores were
>     useful...).
>   * The resulting BooleanQuery can easily have too many clauses,
>     throwing an extremely confusing exception to newish users.
>   * It'd be better to have the freedom to pick "build filter up front"
>     vs "build massive BooleanQuery", when constant scoring is enabled,
>     because they have different performance tradeoffs.
>   * In constant score mode, an OpenBitSet is always used, yet for
>     sparse bit sets this does not give good performance.
> I think we could address these issues by giving BooleanQuery a
> constant score mode, then empower MultiTermQuery (when in constant
> score mode) to pick & choose whether to use BooleanQuery vs up-front
> filter, and finally empower MultiTermQuery to pick the best (sparse vs
> dense) bit set impl.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
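
On the readResolve() point raised in the comment above, here is a minimal sketch of the pattern. It is not the code from the patch: ReadResolveSketch, PlainMode, ResolvedMode and roundTrip are hypothetical names chosen for illustration. It only shows that a serializable "singleton" without readResolve() comes back from deserialization as a second instance, while one that implements readResolve() comes back as the original instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ReadResolveSketch {

    /** Hypothetical "singleton" that is serializable but lacks readResolve(). */
    static final class PlainMode implements Serializable {
        private static final long serialVersionUID = 1L;
        static final PlainMode INSTANCE = new PlainMode();
        private PlainMode() {}
    }

    /** Hypothetical singleton that restores its identity on deserialization. */
    static final class ResolvedMode implements Serializable {
        private static final long serialVersionUID = 1L;
        static final ResolvedMode INSTANCE = new ResolvedMode();
        private ResolvedMode() {}

        // Invoked by Java serialization after the object has been read;
        // the freshly created copy is discarded in favour of INSTANCE.
        private Object readResolve() {
            return INSTANCE;
        }
    }

    /** Serializes an object to memory and reads it back. */
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without readResolve(), identity is lost across serialization.
        System.out.println(roundTrip(PlainMode.INSTANCE) == PlainMode.INSTANCE);       // false
        // With readResolve(), the deserialized object is the singleton itself.
        System.out.println(roundTrip(ResolvedMode.INSTANCE) == ResolvedMode.INSTANCE); // true
    }
}
```

Since such classes typically rely on the default identity-based equals/hashCode, the first case is what would make a deserialized query compare unequal to an otherwise identical one.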