lucene-java-user mailing list archives

From Chris Hostetter <>
Subject Re: Scoring with FunctionQueries?
Date Wed, 08 Mar 2006 02:19:53 GMT

: I tried both approaches, and both don't seem to do what I
: want. Perhaps I did not understand you properly.

From what I can tell, it looks like you understood me perfectly; I too am
baffled by the results you are getting.  I have a couple of thoughts:

1) check the raw score you get from these docs using a HitCollector and
compare that with the value from explain ... the explain info is calculated
through a parallel code path which differs from the normal search/score
code path and it's totally possible there are bugs (BooleanQuery for
example will happily deal with sub queries that return scores of <= 0, but
its explain function will not ... i don't think that's the issue here, but
it may be similar).

2) Add some logging (or set some breakpoints) to your custom similarities'
   queryNorm methods (and your getSimilarity methods) to see
   if/when/how-often the methods are being called.

3) Try eliminating some variables and see what happens ...
   a) create concrete subclasses instead of using anonymous instances with
      overridden methods.
   b) don't bother using FunctionQuery, just use two separate TermQueries
      with different getSimilarity() methods (FunctionQuery is fairly new
      ... there may be bugs in it, also this way if you still have a
      problem you have a use case that anyone with lucene familiarity will
      understand even if they've never seen FunctionQuery)
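
For what it's worth, here is a rough sketch of how the three suggestions
above could be combined into one small program.  It is only an
illustration, not your code: it assumes a Lucene 1.9/2.x jar on the
classpath (HitCollector and Field.Index.TOKENIZED are from that era), and
the class names SimilarityPerClauseCheck, LoggingSimilarity,
CustomSimTermQuery and RawScoreCollector, the "simA"/"simB" labels and the
four test documents are all made up for the example.  I'm not predicting
what it will print; the point is just to see which methods get called and
whether the collected score matches the explain value.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DefaultSimilarity;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.Similarity;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.RAMDirectory;

public class SimilarityPerClauseCheck {

    /** Named (not anonymous) Similarity that logs every queryNorm call. */
    static class LoggingSimilarity extends DefaultSimilarity {
        private final String label;
        LoggingSimilarity(String label) { this.label = label; }
        public float queryNorm(float sumOfSquaredWeights) {
            System.out.println(label + ".queryNorm(" + sumOfSquaredWeights + ")");
            return 1.0f; // arbitrary constant, chosen only for the example
        }
    }

    /** Named TermQuery subclass that returns its own Similarity and logs the call. */
    static class CustomSimTermQuery extends TermQuery {
        private final Similarity sim;
        CustomSimTermQuery(Term term, Similarity sim) { super(term); this.sim = sim; }
        public Similarity getSimilarity(Searcher searcher) {
            System.out.println("getSimilarity() called for " + getTerm());
            return sim;
        }
    }

    /** Records the raw score handed to the collector for each matching doc. */
    static class RawScoreCollector extends HitCollector {
        final Map<Integer, Float> scores = new TreeMap<Integer, Float>();
        public void collect(int doc, float score) {
            scores.put(Integer.valueOf(doc), Float.valueOf(score));
        }
    }

    static Map<Integer, Float> run() throws IOException {
        // Tiny in-memory index; the last doc matches neither term.
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
        String[] texts = { "apple banana", "apple", "banana cherry", "cherry" };
        for (int i = 0; i < texts.length; i++) {
            Document doc = new Document();
            doc.add(new Field("content", texts[i], Field.Store.YES, Field.Index.TOKENIZED));
            writer.addDocument(doc);
        }
        writer.close();

        // Two plain term clauses, each carrying a different Similarity.
        BooleanQuery query = new BooleanQuery();
        query.add(new CustomSimTermQuery(new Term("content", "apple"),
                  new LoggingSimilarity("simA")), BooleanClause.Occur.SHOULD);
        query.add(new CustomSimTermQuery(new Term("content", "banana"),
                  new LoggingSimilarity("simB")), BooleanClause.Occur.SHOULD);

        IndexSearcher searcher = new IndexSearcher(dir);
        RawScoreCollector collector = new RawScoreCollector();
        searcher.search(query, collector);

        // Compare the collected score with the explain value, doc by doc.
        for (Iterator<Map.Entry<Integer, Float>> it = collector.scores.entrySet().iterator();
             it.hasNext();) {
            Map.Entry<Integer, Float> e = it.next();
            int doc = e.getKey().intValue();
            float explained = searcher.explain(query, doc).getValue();
            System.out.println("doc " + doc + ": collected=" + e.getValue()
                               + " explained=" + explained);
        }
        searcher.close();
        return collector.scores;
    }

    public static void main(String[] args) throws IOException {
        run();
    }
}
```

If the collected and explained values disagree, or a log line you expect
never shows up, that narrows down where to look, and the same class is
most of the way to a self-contained test case.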

: I generated a small in-memory index (six documents) for testing your
: suggestions, with some text in field "content" and a numeric score in
: field "score". Following are the code I used and the explanations I
: obtained.

once you've tried the suggestions above, can you send out a
self-contained JUnit test showing the problems?

: Please see the code above. I have not delved into the depths of Lucene
: yet, but it seems that Lucene uses only one similarity instance for
: scoring all clauses in the boolean query, and doesn't honour the
: similarity instances provided by the individual clauses.

i just double checked, and i can't see any way that could be happening --
but you're seeing something weird, so *something* isn't working the way i
thought.  as i said, if you can post a self contained unit test that
demonstrates the problem, then maybe someone can spot the glitch.

