Subject: RE: Sorting in lucene through Document boosting
Date: Mon, 15 Sep 2008 13:08:39 +0100
From: "Dragan Jotanovic"
Reply-To: java-user@lucene.apache.org

Thanks Chris.
I made a simple Similarity implementation:

    public float lengthNorm(String arg0, int arg1) {
        return 1f;
    }

    public float tf(float arg0) {
        return 1f;
    }

My boost values are calculated simply by calling:

    document.setBoost(DefaultSimilarity.decodeNorm((byte)rank));

It works perfectly. I just need to check whether I gain anything from this in terms of performance and resource consumption.

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
Sent: Saturday, September 13, 2008 11:59 PM
To: java-user@lucene.apache.org
Subject: Re: Sorting in lucene through Document boosting

: I thought of setting boost value for documents at index time, with the
: value of my sort field, and then making custom Similarity class which
: would disregard Lucene scoring and take in evaluation only this document
: boost.

the general idea should work, but a few things to pay attention to...

1) document boosts are folded into the fieldNorm, so make sure you don't
   "setOmitNorms(true)"
2) your lengthNorm function needs to return a constant
3) you'll need to adjust your boost values so that when the fieldNorms are
   converted to the internal 'byte' representation they are still unique ...
   with some simple experimentation you can find an approach that helps you
   generate a mapping from 1,2,3,4,5... to a,b,c,d,... where a
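
Point 3 of the quoted reply, and the decodeNorm((byte)rank) trick used above, can be illustrated with a small standalone sketch. It does not call Lucene. The encode/decode pair below is a hand-written stand-in that is assumed to mirror the one-byte norm encoding used by Lucene 2.x; the class and method names (NormByteSketch, encode, decode) are hypothetical, and the bit layout should be checked against the Lucene version actually in use.

```java
// Standalone sketch: why naive boosts can collide after the one-byte norm
// encoding, and why boosts taken from decode((byte) rank) do not.
// ASSUMPTION: this bit layout mirrors Lucene 2.x norm encoding; verify it
// against your Lucene version before relying on it.
final class NormByteSketch {

    // Offset applied to the truncated float bits (assumed, see note above).
    private static final int OFFSET = 48 << 3;

    /** Decodes a norm byte (read as unsigned 0..255) into a float. */
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += 48 << 24;
        return Float.intBitsToFloat(bits);
    }

    /** Encodes a float into one byte by dropping the low-order float bits. */
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;
        if (small <= OFFSET) {
            // Zero or negative maps to 0; tiny positive values map to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= OFFSET + 0x100) {
            return (byte) -1; // overflow: clamp to the largest byte (255 unsigned)
        }
        return (byte) (small - OFFSET);
    }

    /** True if every rank 0..255 survives decode followed by encode. */
    static boolean roundTripsForAllRanks() {
        for (int rank = 0; rank < 256; rank++) {
            byte b = (byte) rank;
            if (encode(decode(b)) != b) {
                return false;
            }
        }
        return true;
    }

    /** True if a higher rank always decodes to a strictly larger boost. */
    static boolean strictlyIncreasing() {
        float previous = decode((byte) 0);
        for (int rank = 1; rank < 256; rank++) {
            float current = decode((byte) rank);
            if (!(current > previous)) {
                return false;
            }
            previous = current;
        }
        return true;
    }

    public static void main(String[] args) {
        // Naive boosts: distinct floats may share a byte once encoded.
        for (float boost = 1f; boost <= 10f; boost += 1f) {
            System.out.println("boost " + boost + " -> byte " + (encode(boost) & 0xff));
        }
        // Boosts generated from the byte itself keep their identity and order.
        System.out.println("round trip ok: " + roundTripsForAllRanks());
        System.out.println("order preserved: " + strictlyIncreasing());
    }
}
```

In this sketch the naive boosts 8 and 9 land on the same byte, so two documents with those ranks would tie. Boosts produced by decoding the rank byte round-trip exactly and stay in rank order, which limits the scheme to 256 distinct rank values.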