lucene-general mailing list archives

From Tim Nufire <luc...@ibink.com>
Subject Using a zero boost to prevent a term from affecting score
Date Mon, 11 Jul 2005 03:19:13 GMT
Hello,

I am new to Lucene so my apologies in advance if what I am trying to do 
does not make sense or has been discussed before. I searched the list 
archives but couldn't find an answer.

First, a bit of background: I have a collection of documents which are 
indexed by SourceID and Content. In the UI, documents are displayed in 
folders which map to SourceIDs and by default all documents in a given 
source are displayed using a query like "+(source:1 source:2)". I also 
want to let users search for text in the Content and display results 
ranked by their Lucene score. Unfortunately, including the SourceID 
terms in my query affects the score I get back, which in the context of 
my app does not make sense. I have thought about turning the SourceID 
terms into a QueryFilter but couldn't figure out how to get Lucene to 
return all of the documents in the filtered collection since empty 
queries are not allowed. As an alternative, I tried setting the boost on 
the SourceID terms to zero which seems to work -- my queries look 
something like "+((source:1 source:2)^0.0) +content:google".
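
To illustrate the zero-boost idea, here is a toy model in plain Java. It is 
not Lucene code: the class ZeroBoostSketch, its Clause type, the raw score 
values, and the assumption that a document's score is simply the sum of 
boost times raw clause score are all made up for this sketch. Real Lucene 
scoring also applies factors such as coord and query normalization, which 
are ignored here.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model only; this is NOT Lucene's scoring code. */
final class ZeroBoostSketch {

    /** A required query clause with a hypothetical raw score and a boost. */
    static final class Clause {
        final String text;
        final float rawScore;
        final float boost;

        Clause(String text, float rawScore, float boost) {
            this.text = text;
            this.rawScore = rawScore;
            this.boost = boost;
        }
    }

    /** Assumed scoring rule: the sum of boost * rawScore over all clauses. */
    static float score(List<Clause> clauses) {
        float sum = 0f;
        for (Clause c : clauses) {
            sum += c.boost * c.rawScore;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Raw scores are invented values chosen to be exact in float arithmetic.
        List<Clause> unboosted = Arrays.asList(
                new Clause("(source:1 source:2)", 0.75f, 1.0f),
                new Clause("content:google", 1.5f, 1.0f));
        List<Clause> zeroBoosted = Arrays.asList(
                new Clause("(source:1 source:2)^0.0", 0.75f, 0.0f),
                new Clause("content:google", 1.5f, 1.0f));

        System.out.println(score(unboosted));   // source clause shifts the total
        System.out.println(score(zeroBoosted)); // only the content clause counts
    }
}
```

In this model the zero-boosted source clause still has to match, but it 
adds nothing to the total, so ranking depends on the content clause alone.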

So, my question is whether this approach is a supported method for 
getting the scorer to ignore a field in its calculations? If it is, then 
I may have found a bug in IndexSearcher.explain(), which returns "0.0 = 
match required" when asked to explain why a result got the score it did 
despite the fact that a non-zero score was passed to my hit collector 
for that item. Tracing through the code, it looks like the 
IndexSearcher.explain() method is unhappy with a required clause having 
a zero score. Since the core search algorithms don't prevent this, I was 
surprised to see this in IndexSearcher.explain(). The other problem that 
I am having with the searcher.explain() method is that I can't pass it 
the DateFilter that I use on some of my queries. Since that filter 
affects the score for documents in the results, it would be nice if 
IndexSearcher.explain() were able to take the filter into account. This 
would also be a problem if I moved the SourceID terms into a filter as I 
have been considering.
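
The mismatch between the collected score and the explanation can be 
pictured with another toy model in plain Java. This is a hypothetical 
reconstruction based only on the tracing described above, not Lucene's 
source: the class ExplainMismatchSketch and both of its methods are 
invented, and the key assumption is that the explain path treats "clause 
value greater than zero" as "clause matched", while the search path does 
not.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the reported mismatch; NOT Lucene's code. */
final class ExplainMismatchSketch {

    /** A matching clause; value is its (already boosted) score contribution. */
    static final class Clause {
        final String name;
        final boolean required;
        final float value;

        Clause(String name, boolean required, float value) {
            this.name = name;
            this.required = required;
            this.value = value;
        }
    }

    /** Search-side model: every clause matched, so the score is a plain sum. */
    static float searchScore(List<Clause> clauses) {
        float sum = 0f;
        for (Clause c : clauses) {
            sum += c.value;
        }
        return sum;
    }

    /** Explain-side model (assumed): a required clause worth 0 counts as a miss. */
    static String explain(List<Clause> clauses) {
        float sum = 0f;
        for (Clause c : clauses) {
            if (c.value > 0f) {
                sum += c.value;
            } else if (c.required) {
                return "0.0 = match required";
            }
        }
        return sum + " = sum of:";
    }

    public static void main(String[] args) {
        // The source clause matched but was boosted to zero; values are invented.
        List<Clause> query = Arrays.asList(
                new Clause("(source:1 source:2)^0.0", true, 0.0f),
                new Clause("content:google", true, 1.5f));

        System.out.println(searchScore(query)); // non-zero score reaches the collector
        System.out.println(explain(query));     // explanation reports a failed match
    }
}
```

Under that assumption, a zero-boosted required clause is enough to make 
the two paths disagree, which matches the behavior described above.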

Any help or insight on this issue will be greatly appreciated!

Thanks,
Tim
