lucene-java-user mailing list archives

From "Anson Lau" <a...@fulfil-net.com>
Subject RE: using boost factor
Date Wed, 23 Jun 2004 10:59:38 GMT
Hi guys,

It seems that to really customise the scoring in Lucene, one has to go
into the Lucene source.

I spent a fair bit of time looking into this, and it seems to me that the
full scoring API is not exported.  The formula documented on the Similarity
class explains how a term is scored, but not, for example, how the final
score of a Boolean query is computed from its individual components.
(Please correct me if I'm wrong.)  Normalisation is another part where the
API is not exported.
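
To make the question concrete, here is a minimal sketch of how the clause
scores of a Boolean query appear to be combined, assuming the behaviour of
the default Similarity of this era: the scores of the matching clauses are
summed and then multiplied by a coord factor (matched clauses / total
clauses), and a query norm of 1 / sqrt(sum of squared weights) scales the
whole query. The class name, method names and per-clause scores below are
hypothetical and only illustrate the arithmetic; this is not Lucene code.

```java
/** Illustrative sketch only; names and numbers are assumptions, not Lucene's API. */
public class BooleanScoreSketch {

    /** Coord factor as assumed here: fraction of the query's clauses that matched. */
    public static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    /** Query norm as assumed here: 1 / sqrt(sum of squared clause weights). */
    public static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /**
     * Combine clause scores: sum the clauses that matched, then scale by coord.
     * A null entry means the clause did not match the document.
     */
    public static double booleanScore(Double[] clauseScores) {
        double sum = 0.0;
        int matched = 0;
        for (Double score : clauseScores) {
            if (score != null) {
                sum += score;
                matched++;
            }
        }
        return matched == 0 ? 0.0 : sum * coord(matched, clauseScores.length);
    }

    public static void main(String[] args) {
        // Hypothetical clause scores for a three-clause query.
        Double[] matchesTwoClauses = {0.5, 1.5, null};  // raw sum 2.0
        Double[] matchesOneClause = {null, null, 3.0};  // raw sum 3.0

        System.out.println("two of three clauses: " + booleanScore(matchesTwoClauses));
        System.out.println("one of three clauses: " + booleanScore(matchesOneClause));
    }
}
```

With these made-up numbers, the document matching two clauses ends up ahead
of the one matching a single clause, even though its raw sum is lower,
because coord rewards matching more clauses. The query norm is the same for
every document of a given query, so under this assumption it rescales the
scores without changing their order.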

Anson

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: Wednesday, June 23, 2004 3:51 AM
To: Lucene Users List
Subject: Re: using boost factor

Hello Anson,

I would look at IndexSearcher's explain method:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/IndexSearcher.html#explain(org.apache.lucene.search.Query,%20int)

This should give you insight into what's contributing to the high/low
scores, thus telling you what you can tweak.  Perhaps it's just the
boost, perhaps some other similarity factors.

Using explain should provide you with information such as this, for example:
http://www.mozdex.com/explain.jsp?idx=2&id=2067257&query=goober

I hope this helps.  Somebody else will probably be able to give more
information, but this should get you started while you wait.

Otis

--- Anson Lau <alau@fulfil-net.com> wrote:
> Hi guys,
> 
> Let's say I want to search for the term "hello world" over 3 fields with
> different boosts:
> 
> ((field1:hello field1:world)^0.001 (field2:hello field2:world)^100
> (field3:hello field3:world)^20000)
> 
> Note I've given field1 a really low boost, a heavy boost to field2 and a
> REALLY heavy boost to field3.
> 
> What is happening to me is that a document that matches both field1 and
> field2 will have a higher score than a document that matches field3 only,
> even though field3's boost is WAY higher.
> 
> Can I change this behaviour such that the match in field3 only will
> actually have a higher score because of the boost?
> 
> Thanks,
> 
> Anson


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



