lucene-java-user mailing list archives

From: Simon Willnauer
Subject: Re: Query Boosting
Date: Tue, 11 Aug 2009 09:52:38 GMT

Hi there,

Where to start... I would suggest you first look at the output of
Searcher#explain(Query, int) to see how the score is calculated. You
might want to start with a simpler query, as the output can be quite
cryptic the first time you see it.
To fully understand what the output means, have a closer look at the
javadoc of the Similarity class, which explains in detail how the
score is calculated.
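To make the Similarity javadoc a bit less abstract, here is a minimal
plain-Java sketch of the main factors, assuming the classic default
similarity (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)),
coord = matched clauses / total clauses). It does not use Lucene
itself: ScoreSketch and its method names are made up for illustration,
norms and queryNorm are left out, and the index size and document
frequencies are assumed values, so it will not reproduce real
explain() output.

```java
// Hypothetical, simplified model of classic Lucene scoring; not the Lucene API.
final class ScoreSketch {

    // Term frequency factor: grows with the square root of the raw count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger weight.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Coordination factor: fraction of query clauses that matched.
    static double coord(int matched, int total) {
        return (double) matched / total;
    }

    // Simplified score for a flat, single-field query:
    // coord * sum(tf * idf^2 * boost). Norms and queryNorm are omitted.
    static double score(int[] freqs, int[] docFreqs, double[] boosts, int numDocs) {
        double sum = 0.0;
        int matched = 0;
        for (int i = 0; i < freqs.length; i++) {
            if (freqs[i] > 0) {
                double w = idf(docFreqs[i], numDocs);
                sum += tf(freqs[i]) * w * w * boosts[i];
                matched++;
            }
        }
        return coord(matched, freqs.length) * sum;
    }

    public static void main(String[] args) {
        int numDocs = 1000;            // assumed index size
        int[] docFreqs = {100, 100};   // assumed: both terms equally common
        double[] boosts = {1.0, 1.0};  // no query-time boosts
        // Term counts for {news, sharing} taken from the question below.
        double page1 = score(new int[] {23, 0}, docFreqs, boosts, numDocs);
        double page2 = score(new int[] {1, 21}, docFreqs, boosts, numDocs);
        System.out.println("page1=" + page1 + " page2=" + page2);
    }
}
```

Under these assumptions page 2 comes out ahead, because tf dampens the
23 occurrences and coord rewards matching both terms. If the real
ranking is the other way round, one of the factors left out here
(norms, field boosts, the nested clauses) is probably responsible, and
the explain output is the place to look.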
Once you understand what is going on during the scoring process, I
would suggest you revise your boosting. I don't know whether you have
field boosts set, but as far as I can tell they would make more sense
for your use case.
In general, make sure you understand what the different boosts are
used for. This snippet from the wiki might help you:
<snip>
What is the difference between field (or document) boosting and query boosting?

Index time field boosts (field.setBoost(boost)) are a way to express
things like "this document's title is worth twice as much as the title
of most documents". Query time boosts (query.setBoost(boost)) are a
way to express "I care about matches on this clause of my query twice
as much as I do about matches on other clauses of my query".

Index time field boosts are worthless if you set them on every document.

Index time document boosts (doc.setBoost(float)) are equivalent to
setting a field boost on every field in that document.
</snip>
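
A rough way to see how the three kinds of boost relate, assuming the
norm has the shape given in the Similarity javadoc (document boost *
field boost * length norm, with the default length norm
1/sqrt(number of terms)). This is plain Java, not the Lucene API:
BoostSketch and its methods are hypothetical names, and real Lucene
additionally stores the norm in a single byte, which loses precision.

```java
// Hypothetical model of how boosts enter the score; not the Lucene API.
final class BoostSketch {

    // Index-time part, fixed when the document is indexed.
    static double norm(double docBoost, double fieldBoost, int numTerms) {
        return docBoost * fieldBoost * (1.0 / Math.sqrt(numTerms));
    }

    // Query-time part: a clause boost multiplies that clause's contribution.
    static double clauseScore(double tfIdf, double queryBoost, double norm) {
        return tfIdf * queryBoost * norm;
    }

    public static void main(String[] args) {
        // A document boost of 2 gives the same norm as a field boost of 2.
        System.out.println(norm(2.0, 1.0, 4) == norm(1.0, 2.0, 4));

        // The same field boost on every document scales every norm alike,
        // so it does not set any one document apart from the others.
        double docA = norm(1.0, 2.0, 4);
        double docB = norm(1.0, 2.0, 16);
        System.out.println(docA / docB == norm(1.0, 1.0, 4) / norm(1.0, 1.0, 16));

        // A query boost of 2 doubles this clause relative to an unboosted one.
        System.out.println(clauseScore(3.0, 2.0, docA) / clauseScore(3.0, 1.0, docA));
    }
}
```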

Hope that helps you get started with scoring.


On Tue, Aug 11, 2009 at 10:50 AM, bourne71 wrote:
> Hi,
> I am fairly new to Lucene and have encountered a problem with the search
> function I am trying to create using Lucene. When I search for, let's say,
> "news sharing", the results are returned and displayed.
> That is fine up to this point, until I check the ranking. Some results,
> although they match only 1 of the 2 keywords, have a higher ranking. The
> problem is as described below:
> Page 1
> news - Total found 23
> sharing - Total found 0
> Page 2
> news - Total found 1
> sharing - Total found 21
> It is understandable why Page 1 got the better ranking, because it has more
> keyword matches. But this makes the returned results less relevant.
> My current query is the following:
> (url:sharing^2.0 content:sharing title:sharing^1.5) (url:news^2.0
> content:news title:news^1.5) url:"sharing news"~2147483647^2.0
> content:"sharing news"~2147483647 title:"sharing news"~2147483647^1.5
> Is there any way I can add an additional query that will give an additional
> boost to results that contain both keywords?
