lucene-java-user mailing list archives

From "Terry Steichen" <>
Subject Re: Computing Relevancy Differently
Date Sat, 08 Feb 2003 23:36:16 GMT

Can you give me an idea of what to replace the lengthNorm() method with in
order to, for example, remove the extra weight given to shorter matching
documents? I can certainly work through it by trial and error, but it would
help if I had some grasp of the logic first.

For example, from DefaultSimilarity, here's the lengthNorm() method:

  public float lengthNorm(String fieldName, int numTerms) {
    return (float)(1.0 / Math.sqrt(numTerms));
  }

Should I (for the purpose of eliminating any size bias) override it to
always return 1?
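
To make the question concrete, here is a minimal sketch of such an override.
FlatLengthNorm is a hypothetical stand-in that only mirrors the
lengthNorm(String, int) signature quoted above; in a real index the method
would presumably go in a subclass of DefaultSimilarity. No Lucene classes are
used, so the sketch compiles on its own.

```java
// Hypothetical stand-in mirroring the lengthNorm(String, int) signature
// quoted above. Assumption: in real use this method would override the one
// in a DefaultSimilarity subclass; no Lucene types are referenced here.
class FlatLengthNorm {
    // Ignore numTerms entirely, so a 10-term field and a 10,000-term field
    // receive the same normalization factor.
    float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }
}
```

With a constant factor, field length no longer contributes to the score. As
far as I understand, this value is computed when a document is indexed and
stored with the index, so a change like this would likely need a re-index to
take effect.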

How would I boost the headline field here? Is that what the (presently
unused) fieldName parameter is for? If so, I assume that to get what I'm
after I would make this factor greater than 1 for the 'headline' field and
1 for all other fields?
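
A sketch of that idea, under the same assumptions as before:
HeadlineBoostLengthNorm is a hypothetical stand-in class, and the factor of
2.0 is an arbitrary illustrative value, not a recommendation.

```java
// Hypothetical stand-in mirroring the lengthNorm(String, int) signature
// quoted above. Assumption: in real use this method would override the one
// in a DefaultSimilarity subclass; no Lucene types are referenced here.
class HeadlineBoostLengthNorm {
    // Illustrative value only; the right factor would need tuning.
    static final float HEADLINE_FACTOR = 2.0f;

    // Return a factor greater than 1 for the 'headline' field and 1 for
    // every other field. numTerms is ignored, so there is no length bias.
    float lengthNorm(String fieldName, int numTerms) {
        return "headline".equals(fieldName) ? HEADLINE_FACTOR : 1.0f;
    }
}
```

Putting the constant on the left of equals() keeps the comparison safe if
fieldName is ever null.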



----- Original Message -----
From: "Doug Cutting" <>
To: "Lucene Users List" <>
Sent: Friday, February 07, 2003 2:37 PM
Subject: Re: Computing Relevancy Differently

> Terry Steichen wrote:
> > I read all the relevant references I could find in the Users (not
> > Developers) list, and I still don't exactly know what to do.
> >
> > What I'd like to do is get a relevancy-based order in which (a) longer
> > documents tend to get more weight than shorter ones, (b) a document body
> > with 'X' instances of a query term gets a higher ranking than one with
> > fewer than 'X' instances, and (c) a term found in the headline (usually in
> > addition to finding the same term in the body) is more highly ranked than
> > one with the term only in the body.
> In the latest sources this can all be done by defining your own
> Similarity implementation.  You can make longer documents score higher
> by overriding the lengthNorm() method.  You can boost headlines there,
> or with Field.setBoost(), or at query time with Query.setBoost().
> Doug