lucene-java-user mailing list archives

From Paul Taylor <paul_t...@fastmail.fm>
Subject Score exact matches higher than matches that match analysed text but not original text
Date Tue, 10 Jan 2012 09:12:50 GMT
My analyser strips out accents, as these are often not entered
correctly. Assume there are two documents in the database whose default
field contains:
República
Republica

A search for República or Republica will return both results, each with 
a score of 1.
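
To make the cause of the tie concrete, here is a minimal sketch of accent folding in plain Java. It is not the actual analyser; the class name AccentFolding and the method fold are hypothetical, and java.text.Normalizer stands in for whatever filter the analyser chain really uses. Once both strings are folded they are identical, so the index cannot tell them apart at scoring time.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical stand-in for an accent-stripping analyser; not Lucene code.
final class AccentFolding {
    // Decompose accented characters (NFD), drop the combining marks,
    // then lowercase, so "República" and "Republica" become the same token.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // \u00fa is the accented u, escaped to avoid source-encoding issues.
        String accented = fold("Rep\u00fablica");
        String plain = fold("Republica");
        System.out.println(accented.equals(plain));
    }
}
```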

It's correct that they both get returned, but it would be really nice if 
the scoring stage could recognise that when I search for República, the 
document containing República is a slightly better match than the other 
one and should score slightly higher, and vice versa.

Is there any way to do this in Lucene? Alternatively, I thought about 
augmenting the scores returned by Lucene: when multiple results have the 
same score, check the number of letters that match the original query 
text and increase the score accordingly, but only by so much that it 
stays lower than any result that Lucene scored higher. I also realise 
that this makes sense when searching just one field but gets more 
complex when the query searches over multiple fields. In this case, 
searching for artists/bands (music), I think I would only apply the 
boost if the artist name was one of the search fields.
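
The post-processing idea could be sketched roughly as follows, in plain Java with no Lucene dependency. Everything here is an assumption for illustration: the class TieBreakRerank, the Hit holder and the matchingLetters helper are hypothetical names, and the letter comparison is a crude position-by-position check. Rather than nudging the scores themselves, the sketch uses the letter count as a secondary sort key, which gives the same ordering and can never lift a result above one that Lucene scored higher.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical post-processing step applied to results already returned by a search.
final class TieBreakRerank {
    // Hypothetical holder: the stored (unanalysed) field text plus the search score.
    static final class Hit {
        final String text;
        final float score;
        Hit(String text, float score) { this.text = text; this.score = score; }
    }

    // Compose to NFC and lowercase, but keep accents, so the accented and
    // unaccented u remain different characters.
    private static String canonical(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC).toLowerCase(Locale.ROOT);
    }

    // Count positions at which query and text hold the same character.
    static int matchingLetters(String query, String text) {
        String q = canonical(query);
        String t = canonical(text);
        int n = Math.min(q.length(), t.length());
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (q.charAt(i) == t.charAt(i)) count++;
        }
        return count;
    }

    // Order by score descending; only when scores are equal, prefer the hit
    // whose original text shares more letters with the query as typed.
    static List<Hit> rerank(String query, List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            if (byScore != 0) return byScore;
            return Integer.compare(matchingLetters(query, b.text),
                                   matchingLetters(query, a.text));
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("Republica", 1.0f));
        hits.add(new Hit("Rep\u00fablica", 1.0f));
        for (Hit h : rerank("Rep\u00fablica", hits)) {
            System.out.println(h.text + " " + h.score);
        }
    }
}
```

This only reorders results within the page that has already been fetched, and the positional comparison would need refining for multi-word names or differing word order.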

Paul
