lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Taylor <paul_t...@fastmail.fm>
Subject Problem with scoring:is it absolute or relative ?
Date Wed, 02 May 2007 14:30:12 GMT
Hi I am having problems understanding lucenes scoring. I am using the 
Musicbrainz  which uses Lucene to provide searching facility over its 
data, which put simply consists of a database about recording artists , 
albums and song titles

I can construct a query such as:
    track:Minus AND artist:beck AND release:odelay

to return Tracks called Minus by the Artist Beck for the release Odelay, 
which works fine if all terms are correct however I am trying to 
implement the search in a program which takes the terms  from values 
embedded in existing audio files, so they may not be correct, so instead 
of using an AND search I want to use a OR search such as
    track:Minus  artist:beck  release:odelay
to improve the chances of getting a match, and then using the scoring 
returned work out whether the match is sufficiently simailr to accept it 
as a match.

Now the good news is that when multiple results are returned the record 
with the top score is normally the best match BUT the bad news is that 
if the match is very poor the top rated records normally return a score 
of 100 even though
only a few of the terms match.

    track:Minus  artist:rubbish  release:odelay

will return the Minus tracks by beck for the release odelay even through 
the artist term is incorrect because this is the best match it can make 
(2 out of 3 terms matched) but it is returning a score of 100 when I 
would expect a score
of 66 because only two of the terms match.

Why is it doing this, and how can I create a query that only returns 100 
if all terms match but will return possible matches where only some 
terms match ? I have spoke to the creators of MusicBrainz they say they 
are not using any kind of document/Field boosting and are that sure how 
its work as theve only added lucene to the site quite recently.

The above queries can be run tested from

http://musicbrainz.org/search.html

by:
    entering the query in the 'Search For' box
    selecting type:Track
    selecting Use advanced query syntax 
<http://musicbrainz.org/popup/TextSearchSyntax>

and clicking on Indexed Search

thanks Paul Taylor

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message