Date: Tue, 2 Dec 2008 14:42:05 -0800 (PST)
From: Chris Hostetter
To: java-user@lucene.apache.org
Subject: Re: Query time document group boosting

: > "foo AND (
: >    groupboost_A:dummy^10 OR
: >    groupboost_B:dummy OR
: >    groupboost_C:dummy^0.1 OR
: >    ...
: >    groupboost_Z:dummy
: > )"
: >
: > With that query, it seems that only documents matching foo will result
: > in a hit and be scored?
:
: Someone else than me needs to answer this. I know there is no optimization of
: boolean clauses, that is why I'm saying this: it is possible that the boolean
: query weight actually will be visiting all the inner clauses even though "foo"
: was not matched, i.e. all documents in the index are visited but might not all
: be scored.

No, skipTo in the Scorer (ConjunctionScorer i believe) should ensure that
none of the groupboost clauses will be visited for any doc that doesn't
already match foo.

i believe the pathological case you are talking about would be something
like this...

   (docnum_is_odd AND foo) AND (docnum_is_even AND bar)

assuming every doc matches either docnum_is_even or docnum_is_odd, but no
doc matches both, then this query will match no documents, but every
document will be visited by the scorer.

: A cosmetic remark, I would personally choose a single field for the boosts and
: then one token per source. (groupboost:A^10 groupboost:B^1 groupboost:C^0.1).

that's a key improvement, as it helps keep the number of unique fields
down, even if the number of sources grows without bounds.

make sure you omitNorms on your groupboost field, and when building your
various boolean queries, consider disabling the coord (check the docs to
understand why that might make sense)

: > > I think you are looking for CustomScoreQuery.

not necessarily ... CustomScoreQueries make a lot of sense when you want
the score to be a function of the field value, but for simple "exists or
not" fields a BooleanQuery works just as well.
(if you wanted to index a weight per source field in each doc, then a
CustomScoreQuery would certainly make more sense)


-Hoss
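
For anyone who wants to see the skipTo behaviour described above in isolation, here is a minimal sketch in plain Java. It is not Lucene's ConjunctionScorer code: the Postings class, its skipToCalls counter and the intersect method are hypothetical names made up for illustration, and the linear scan inside skipTo stands in for the skip lists a real index would use. It only shows the leapfrog idea, where the required clause (foo) drives and the other clause is asked only about docs that foo already matched.

```java
import java.util.ArrayList;
import java.util.List;

public class ConjunctionSketch {

    /** Hypothetical stand-in for one clause's scorer: sorted doc ids plus a cursor. */
    static final class Postings {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;
        private int pos = 0;
        int skipToCalls = 0; // how many times this clause was consulted

        Postings(int... sortedDocs) {
            this.docs = sortedDocs;
        }

        /** Advance to the first doc id >= target; NO_MORE_DOCS once exhausted. */
        int skipTo(int target) {
            skipToCalls++;
            while (pos < docs.length && docs[pos] < target) {
                pos++;
            }
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    /** Leapfrog intersection: 'lead' proposes docs, 'other' only confirms or skips ahead. */
    static List<Integer> intersect(Postings lead, Postings other) {
        List<Integer> hits = new ArrayList<>();
        int doc = lead.skipTo(0);
        while (doc != Postings.NO_MORE_DOCS) {
            int candidate = other.skipTo(doc);
            if (candidate == doc) {
                hits.add(doc);              // both clauses match this doc
                doc = lead.skipTo(doc + 1);
            } else {
                doc = lead.skipTo(candidate); // jump the lead past the gap
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed toy index: 100 docs, all carrying a groupboost token, 3 matching foo.
        int[] all = new int[100];
        for (int i = 0; i < all.length; i++) {
            all[i] = i;
        }
        Postings foo = new Postings(3, 50, 97);
        Postings groupboost = new Postings(all);

        System.out.println("hits: " + intersect(foo, groupboost));
        System.out.println("groupboost consulted " + groupboost.skipToCalls + " times");
    }
}
```

In this toy run the groupboost clause is consulted once per foo match rather than once per doc in the index. The pathological case goes the other way: when both sides of the conjunction are dense and never agree, the two cursors keep leapfrogging each other one doc at a time, so the whole index gets walked without producing a hit.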