lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Repeatability of results
Date Wed, 04 Apr 2012 23:18:08 GMT
On Wed, Apr 4, 2012 at 6:15 PM, Alan Bawden <alan@basistech.com> wrote:
> So I sat down to try to make a small test case that exhibited this
> behavior, and while I was working on that I thought of a possible
> explanation for what we are seeing.  If you agree that my explanation is
> what's going on here, then Benson and I can stop working on making a test
> case, and move on to figuring out how we can live with what may be
> unavoidable behavior.
>
> The key observation is that the differences in scores we see are always
> down around the sixth decimal place -- down where 32-bit floating point
> loses precision.  So what we're seeing seems likely to simply be the result
> of the fact that floating point addition isn't associative.
>
> In theory, the order of the documents in an index doesn't matter when
> computing a score, but if the documents are stored in a different order,
> any quantity that is computed using floating point by iterating over the
> set of documents may come out differently due to changes in the order in
> which the documents are processed.
>
> So could something like this cause what we are seeing?

OK this could make sense (floating point math is frustrating!).
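
To make the non-associativity concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and the values are picked to exaggerate the effect. It sums the same three floats with two different groupings.

```java
// Hypothetical demo, not Lucene code: the same three floats summed
// with two different groupings give two different results.
class FloatOrderDemo {
    // Computes (a + b) + c in 32-bit float arithmetic.
    static float leftFirst(float a, float b, float c) {
        return (a + b) + c;
    }

    // Computes a + (b + c) in 32-bit float arithmetic.
    static float rightFirst(float a, float b, float c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        float a = 1e8f, b = -1e8f, c = 1f;
        // a and b cancel exactly first, so c survives.
        System.out.println(leftFirst(a, b, c));   // prints 1.0
        // c is smaller than the float spacing near 1e8 (which is 8),
        // so b + c rounds back to b and the final sum is zero.
        System.out.println(rightFirst(a, b, c));  // prints 0.0
    }
}
```

With values of similar magnitude, as scores usually are, the discrepancy would be far smaller, typically in the last digit or two that a 32-bit float can hold.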

But, Lucene generally scores one document at a time, so in theory just
changing its docid shouldn't alter the order of float operations.

Or... is it possible the query is different?  E.g., a BooleanQuery with
SHOULD clauses in a different order?

Maybe try explain() in the two cases and compare?

Hmm, another idea: are you doing deletions in the test?  Or only adds?

Mike McCandless

http://blog.mikemccandless.com
