lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Query Boosting
Date Tue, 11 Aug 2009 09:52:38 GMT
Hi there,

Well, where to start... I would suggest you first look at the output
of Searcher#explain(Query, int) to see how the score is calculated. You
might start with a simpler query, as the output can be quite cryptic
the first time you see it.
To fully understand what the output means, take a closer look at the
Javadoc of the Similarity class
(http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Similarity.html),
which explains in detail how the score is calculated.
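
As a rough illustration of the formula that the Similarity Javadoc describes, here is a minimal plain-Java sketch with no Lucene dependency. It assumes a single field, equal idf for both terms, no boosts, and identical norms; the class and method names are hypothetical, and queryNorm is left out because it does not vary between documents for a given query. The term frequencies are the ones from the question below.

```java
// Simplified, single-field sketch of the scoring formula documented in the
// Similarity Javadoc. Not Lucene code: ScoreSketch and its methods are
// hypothetical names, and queryNorm is omitted because it is the same for
// every document matched by a given query.
class ScoreSketch {

    // Term frequency factor as in DefaultSimilarity: sqrt(freq).
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency as in DefaultSimilarity:
    // 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Coordination factor: fraction of query terms found in the document.
    static double coord(int overlap, int maxOverlap) {
        return (double) overlap / maxOverlap;
    }

    // score = coord * sum over matching terms of tf * idf^2 * queryBoost * norm
    static double score(int[] freqs, double[] idfs, double[] queryBoosts, double norm) {
        double sum = 0.0;
        int matched = 0;
        for (int i = 0; i < freqs.length; i++) {
            if (freqs[i] > 0) {
                matched++;
                sum += tf(freqs[i]) * idfs[i] * idfs[i] * queryBoosts[i] * norm;
            }
        }
        return coord(matched, freqs.length) * sum;
    }

    public static void main(String[] args) {
        double[] idfs = {1.0, 1.0};   // assumed: both terms equally rare
        double[] boosts = {1.0, 1.0}; // assumed: no query-time boosts
        // Term frequencies for {news, sharing}, taken from the question.
        double page1 = score(new int[] {23, 0}, idfs, boosts, 1.0);
        double page2 = score(new int[] {1, 21}, idfs, boosts, 1.0);
        System.out.println("Page 1: " + page1);
        System.out.println("Page 2: " + page2);
    }
}
```

In this simplified model Page 2 comes out ahead, because coord halves the score of a page that matches only one of the two terms. If Page 1 still ranks first in the real index, the difference probably comes from the factors assumed equal here (idf, norms, boosts), and the explain output should show which one.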
Once you understand what is going on during the scoring process, I
would suggest you revise your boosting. I don't know whether you have
field boosts set, but as far as I can tell they would make more sense
in your use case.
In general, make sure you understand what the different boosts are used
for. This snippet from the wiki might help:
<snip>
What is the difference between field (or document) boosting and query boosting?

Index time field boosts (field.setBoost(boost)) are a way to express
things like "this document's title is worth twice as much as the title
of most documents". Query time boosts (query.setBoost(boost)) are a
way to express "I care about matches on this clause of my query twice
as much as I do about matches on other clauses of my query".

Index time field boosts are worthless if you set them on every document.

Index time document boosts (doc.setBoost(float)) are equivalent to
setting a field boost on every field in that document.
</snip> (http://wiki.apache.org/lucene-java/LuceneFAQ#head-246300129b9d3bf73f597facec54ac2ee54e15d7)
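
The points in the snippet can be sketched in plain Java, assuming the norm described in the Similarity Javadoc (document boost times field boost times length normalization). The class and method names below are hypothetical and are not Lucene API; the sketch also ignores that Lucene stores the norm in a single byte.

```java
// Plain-Java sketch of how index-time and query-time boosts enter the score.
// BoostSketch and its methods are hypothetical names, not Lucene API. Real
// Lucene also stores the norm in a single byte, which loses precision; that
// encoding is ignored here.
class BoostSketch {

    // Index-time norm: document boost * field boost * length normalization,
    // with length normalization 1/sqrt(numTerms) as in DefaultSimilarity.
    static double norm(double docBoost, double fieldBoost, int numTerms) {
        return docBoost * fieldBoost / Math.sqrt(numTerms);
    }

    // Contribution of one query clause: sqrt(freq) * query boost * norm
    // (idf left out, since it does not depend on any boost).
    static double clauseScore(int freq, double queryBoost, double norm) {
        return Math.sqrt(freq) * queryBoost * norm;
    }

    public static void main(String[] args) {
        // Two hypothetical documents, each with a 100-term field.
        double a = clauseScore(4, 1.0, norm(1.0, 1.0, 100));
        double b = clauseScore(1, 1.0, norm(1.0, 1.0, 100));
        // The same field boost on both documents scales both scores alike.
        double aBoosted = clauseScore(4, 1.0, norm(1.0, 2.0, 100));
        double bBoosted = clauseScore(1, 1.0, norm(1.0, 2.0, 100));
        System.out.println("ratio without boost: " + (a / b));
        System.out.println("ratio with uniform boost: " + (aBoosted / bBoosted));
    }
}
```

Because the document boost and the field boost are multiplied into the same norm, a document boost of 2 has the same effect as a field boost of 2 on each of its fields. A field boost applied identically to every document leaves the relative order within that field unchanged, which is why the FAQ calls it worthless.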

Hope that helps you get started with scoring etc.

simon


On Tue, Aug 11, 2009 at 10:50 AM, bourne71<garylkc@live.com> wrote:
>
> Hi,
>
> I am fairly new to Lucene and have encountered a problem with the search
> function I am trying to create using Lucene. When I search for, let's say, "news
> sharing", the results are returned and displayed.
>
> It's fine up to this point, until I check the ranking. Some results, although
> they match only 1 of the 2 keywords, have a higher ranking. The problem is
> as described below:
>
> Page 1
> news - Total found 23
> sharing - Total found 0
>
> Page 2
> news - Total found 1
> sharing - Total found 21
>
> It is understandable why Page 1 gets a better ranking, because it has more
> keyword occurrences. But this makes the returned results less relevant.
>
> My current query is like the following:
> (url:sharing^2.0 content:sharing title:sharing^1.5) (url:news^2.0
> content:news title:news^1.5) url:"sharing news"~2147483647^2.0
> content:"sharing news"~2147483647 title:"sharing news"~2147483647^1.5
>
> Is there any way I can add an additional query that will give an additional
> boost to results that have both keywords in them?
> --
> View this message in context: http://www.nabble.com/Query-Boosting-tp24913967p24913967.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
