lucene-java-user mailing list archives

From Joel Halbert
Subject similarity function
Date Wed, 28 Oct 2009 13:29:29 GMT

Given a query with multiple terms, e.g. fish oil, that searches across
multiple fields, e.g.

query= fieldA:fish fieldA:oil fieldB:fish fieldB:oil  etc...

I don't want to give any more weight to documents that match the same
word multiple times (whether in the same field or in different fields).
I am only interested in lending additional weight to a match of both
words (fish and oil) in the SAME field.

So for example, if I have the documents

Doc1
fieldA=fish is good for you
fieldB=vegetable oil and sunflower oil is good for you 

and Doc2
fieldA=fish oil is good for you
fieldB=bla bla bla

With the default similarity I would have 3 term matches in document 1
(fish, oil, oil) and 2 in document 2 (fish, oil), but I only want to
count 2 term matches in document 1 (fish, oil) and I want to give
increased weight to the two matches in document 2 because they occur in
the same field (fieldA).
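
To make the ranking I am after concrete, here is a minimal sketch in plain Java. It is not Lucene API and not a working Similarity; the class name DesiredScore, the score method, and the SAME_FIELD_BONUS value are all hypothetical, and tokenization is a naive whitespace split. It counts each query term at most once per document and adds a bonus for every field that contains all of the query terms.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DesiredScore {
    // Hypothetical bonus for each field that contains every query term.
    static final double SAME_FIELD_BONUS = 1.0;

    // doc maps field name -> field text; queryTerms are lowercase terms.
    static double score(Map<String, String> doc, List<String> queryTerms) {
        Set<String> wanted = new HashSet<>(queryTerms);
        Set<String> matchedAnywhere = new HashSet<>();
        double bonus = 0.0;
        for (String text : doc.values()) {
            // Naive tokenization (an assumption): lowercase, split on whitespace.
            Set<String> tokens =
                new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
            // Distinct query terms present in this field; repeats are ignored.
            Set<String> matchedHere = new HashSet<>(wanted);
            matchedHere.retainAll(tokens);
            matchedAnywhere.addAll(matchedHere);
            // Reward a field only when all query terms co-occur in it.
            if (!wanted.isEmpty() && matchedHere.size() == wanted.size()) {
                bonus += SAME_FIELD_BONUS;
            }
        }
        // Each distinct term counts once per document, plus same-field bonuses.
        return matchedAnywhere.size() + bonus;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("fish", "oil");

        Map<String, String> doc1 = new HashMap<>();
        doc1.put("fieldA", "fish is good for you");
        doc1.put("fieldB", "vegetable oil and sunflower oil is good for you");

        Map<String, String> doc2 = new HashMap<>();
        doc2.put("fieldA", "fish oil is good for you");
        doc2.put("fieldB", "bla bla bla");

        System.out.println("Doc1: " + score(doc1, query)); // Doc1: 2.0
        System.out.println("Doc2: " + score(doc2, query)); // Doc2: 3.0
    }
}
```

With these assumed weights, document 1 gets 2 (fish and oil, each counted once) and document 2 gets 3 (the same two terms plus the same-field bonus), so both documents match but document 2 ranks higher.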

Any ideas? Is there a simple way to achieve this? (It goes without
saying that I want to match both documents, i.e. I don't want to use
the quoted phrase "fish oil".)

