Date: Sun, 19 Nov 2006 12:34:07 -0800 (PST)
From: Chris Hostetter
To: java-user@lucene.apache.org
Subject: Re: Boost Document
In-Reply-To: <359a92830611180752o7fae0074j5a057c9ae05fbdf0@mail.gmail.com>

: with scoring.
: Each time you index a document, it get a doc id greater than
: any already in the index, and they get reassigned if you delete docs and
: optimize.... They *may* be used when scoring to break ties but that doesn't
: do you any good.......

They aren't really used to break ties. That would imply there is code
somewhere that deliberately wants the order to be deterministic and, when
it sees two identical scores, does a sort on docid. In reality, the fact
that docs come back in docid order is just a side effect of two things:
the Scorers iterate over documents in order, and the sorting code
produces a "stable sort".

Looking at why the scores are the same...

: > ----- DOC 222-----home-
: > 40960.0 = fieldNorm(field=WORD, doc=0)
: > ----- DOC 111-----home-
: > 40960.0 = fieldWeight(WORD:home in 1), product of:
: > 40960.0 = fieldNorm(field=WORD, doc=1)

: > > : doc1.setBoost(3163);
: > > : doc2.setBoost(3150);

...norms (which is where field and doc boosts go) get encoded as a single
byte, so they lose a lot of precision. Unfortunately the methods used to
translate from float->byte->float can't be overridden in a subclass (see
Similarity.encodeNorm).

If you write a bit of test code to call those methods on various values
and see what you get, you'll notice that bigger numbers are more affected
than smaller numbers, so you may want to just use smaller boosts (ie:
divide by 100) and see if that helps.

Alternately: stop using boosts to approach this problem, add these numbers
as a new field, and use the FunctionQuery class from Solr to achieve your
goal (search the list archives for more detailed discussions of
FunctionQuery).


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
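
As a footnote to the "write a bit of test code" suggestion in the message above, here is a minimal sketch of what such a test might look like. It assumes the one-byte norm format keeps 3 mantissa bits and 5 exponent bits. The class NormPrecisionSketch and its encode/decode/roundTrip methods are hypothetical stand-ins written for illustration, not Lucene's source. Against a real Lucene jar you would call Similarity.decodeNorm(Similarity.encodeNorm(f)) instead. Also keep in mind that the stored norm is not the boost alone (the length normalization is folded in too), so this only illustrates the rounding.

```java
// Hypothetical stand-in for the float->byte->float norm round trip.
// Assumption: 3 mantissa bits and 5 exponent bits; not copied from Lucene.
class NormPrecisionSketch {

    /** Squeeze a non-negative float into a single byte (lossy). */
    static byte encode(float f) {
        if (f <= 0.0f) return 0;                        // zero and negatives map to 0
        int bits = Float.floatToIntBits(f);
        int mantissa = (bits & 0xffffff) >> 21;         // top 3 of the low 24 bits
        int exponent = ((bits >> 24) & 0x7f) - 63 + 15; // re-biased 5-bit exponent
        if (exponent > 31) { exponent = 31; mantissa = 7; } // clamp overflow
        if (exponent < 0)  { exponent = 0;  mantissa = 1; } // clamp underflow
        return (byte) ((exponent << 3) | mantissa);
    }

    /** Expand the byte back into a float; the dropped bits come back as zeros. */
    static float decode(byte b) {
        if (b == 0) return 0.0f;
        int mantissa = b & 7;
        int exponent = (b >> 3) & 31;
        int bits = ((exponent + (63 - 15)) << 24) | (mantissa << 21);
        return Float.intBitsToFloat(bits);
    }

    static float roundTrip(float f) {
        return decode(encode(f));
    }

    public static void main(String[] args) {
        // The two boosts from the thread, plus the same values divided by 100.
        float[] values = {3163f, 3150f, 31.63f, 31.50f, 1.0f};
        for (float v : values) {
            System.out.println(v + " -> " + roundTrip(v));
        }
    }
}
```

Under the assumptions of this sketch, 3163 and 3150 land in the same byte and both come back as 3072, which would account for two documents with different boosts showing an identical fieldNorm.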