From: "Terry Steichen"
To: "Lucene Users List"
Subject: Re: Computing Relevancy Differently
Date: Sat, 8 Feb 2003 18:36:16 -0500
Message-ID: <00bf01c2cfca$e0272380$0201a8c0@netframe.com>

Doug,

Can you give me an idea of what to replace the lengthNorm() method with in order to, for example, remove any special weight given to shorter matching documents? I can certainly work through it by trial and error, but it would help to have some grasp of the logic first.

For example, here is the lengthNorm() method from DefaultSimilarity:

    public float lengthNorm(String fieldName, int numTerms) {
      return (float)(1.0 / Math.sqrt(numTerms));
    }

Should I (for the purpose of eliminating any size bias) override it to always return 1?

How would I boost the headline field here? Is that what the (presently unused) fieldName parameter is for? If so, I assume that to do what I'm after I would return a factor greater than 1 for the 'headline' field and 1 for all other fields?

Regards,

Terry

----- Original Message -----
From: "Doug Cutting"
To: "Lucene Users List"
Sent: Friday, February 07, 2003 2:37 PM
Subject: Re: Computing Relevancy Differently

> Terry Steichen wrote:
> > I read all the relevant references I could find in the Users (not
> > Developers) list, and I still don't exactly know what to do.
> >
> > What I'd like to do is get a relevancy-based order in which (a) longer
> > documents tend to get more weight than shorter ones, (b) a document body
> > with 'X' instances of a query term gets a higher ranking than one with fewer
> > than 'X' instances, and (c) a term found in the headline (usually in
> > addition to finding the same term in the body) is more highly ranked than
> > one with the term only in the body.
>
> In the latest sources this can all be done by defining your own
> Similarity implementation. You can make longer documents score higher
> by overriding the lengthNorm() method. You can boost headlines there,
> or with Field.setBoost(), or at query time with Query.setBoost().
>
> Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
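
For what it's worth, the override asked about in this message might look roughly like the sketch below. It is an illustration under stated assumptions, not a confirmed answer from the thread. To stay runnable on its own it does not extend Lucene's DefaultSimilarity; the class name LengthNormSketch, the method flatLengthNorm(), and the HEADLINE_BOOST value of 2.0 are all made up for the example. Only the lengthNorm(String fieldName, int numTerms) signature and the 1/sqrt(numTerms) formula come from the message itself.

```java
public class LengthNormSketch {

    // Hypothetical boost factor for the headline field; the value is a guess
    // and would need tuning against real queries.
    static final float HEADLINE_BOOST = 2.0f;

    // The default formula quoted in the message: shorter fields get a larger norm.
    public static float defaultLengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Assumed replacement: ignore numTerms entirely so field length no longer
    // affects the score, and use fieldName to favor the 'headline' field.
    public static float flatLengthNorm(String fieldName, int numTerms) {
        if ("headline".equals(fieldName)) {
            return HEADLINE_BOOST;
        }
        return 1.0f;
    }

    public static void main(String[] args) {
        int[] lengths = {10, 100, 10000};
        for (int n : lengths) {
            System.out.println("body, " + n + " terms: default="
                    + defaultLengthNorm("body", n)
                    + " flat=" + flatLengthNorm("body", n));
        }
        System.out.println("headline, 8 terms: default="
                + defaultLengthNorm("headline", 8)
                + " flat=" + flatLengthNorm("headline", 8));
    }
}
```

In a real index the body of flatLengthNorm() would presumably go into a lengthNorm() override on a DefaultSimilarity subclass. How that subclass gets installed depends on the Lucene version, so that part is left out here. To make longer documents score higher rather than merely equal, the same method could return a value that grows with numTerms instead of a constant.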