lucene-dev mailing list archives

From Paul Elschot <>
Subject Which scorer to use for disjunctions?
Date Wed, 25 May 2005 19:25:19 GMT
Dear readers,

At the moment it's not clear to me which code
is best for scoring disjunctions:

There is a specialised priority queue for DisjunctionScorer:
This also contains:
- a btree implementation of BooleanScorer by Karl Wright
  that is probably good for a small number of subscorers.
- performance measurement code in TestDisjunctionPerf1

There is also BooleanScorer1:

I extended TestDisjunctionPerf1 to also exercise the btree
scorer, and the measurements are inconclusive: performance
of one scorer depends on the presence of others, which probably
means that the JIT is working irregularly, even with -server and
-Xbatch as JVM options.
Also the relative order of the various scorers depends on the
number of subscorers.
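
One way to reduce this kind of order dependence is to warm up all
candidates before timing anything, and then to rotate the order in which
they are measured from round to round. The following is only a sketch of
that idea: RotatingBench, run and sink are hypothetical names, not part
of TestDisjunctionPerf1 or Lucene, and the code assumes a current Java
with lambdas and generics. Each candidate returns an int (for example a
match count) so that its work is not optimised away.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntSupplier;

/** Hypothetical harness: warm up every candidate, then time them in rotating order. */
final class RotatingBench {
    /** Written on every call so the JIT cannot discard the candidates' results. */
    static volatile int sink;

    /** Returns the best observed time in nanoseconds for each candidate name. */
    static Map<String, Long> run(Map<String, IntSupplier> candidates,
                                 int warmups, int rounds) {
        List<String> names = new ArrayList<>(candidates.keySet());
        Map<String, Long> best = new LinkedHashMap<>();

        // Warm-up: let the JIT see all candidates before any timing starts.
        for (int w = 0; w < warmups; w++) {
            for (String name : names) {
                sink += candidates.get(name).getAsInt();
            }
        }

        // Timed rounds: the starting candidate shifts by one each round.
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < names.size(); i++) {
                String name = names.get((i + r) % names.size());
                long start = System.nanoTime();
                sink += candidates.get(name).getAsInt();
                long elapsed = System.nanoTime() - start;
                best.merge(name, elapsed, Long::min);
            }
        }
        return best;
    }
}
```

Keeping the best time per candidate rather than the mean is a design
choice that discounts rounds disturbed by compilation or GC; it does not
remove the effect of the number of subscorers on the relative order.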

TestDisjunctionScorer1 uses a set of test scorers like this:
  /** A scorer that matches all docs having a document number
   * that is a positive multiple of a given interval, up to a maximum.
   */
The interval is normally chosen as a prime number and the test
starts from an array of these numbers, adding a test scorer 
for each interval in the array.
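
To make the setup concrete, here is a minimal standalone sketch of such
a test scorer, together with a priority-queue disjunction over an array
of intervals. It is not the code from TestDisjunctionPerf1 and it does
not use Lucene's Scorer API: IntervalScorer, IntervalDisjunction and
countMatches are hypothetical names, and java.util.PriorityQueue stands
in for the specialised queue. It assumes a current Java.

```java
import java.util.PriorityQueue;

/** Hypothetical test scorer: matches docs that are positive multiples of interval. */
final class IntervalScorer {
    private final int interval;
    private final int maxDoc;
    private int doc = 0; // positioned before the first match

    IntervalScorer(int interval, int maxDoc) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
        this.maxDoc = maxDoc;
    }

    /** Moves to the next multiple of interval; returns false once past maxDoc. */
    boolean next() {
        if (doc > maxDoc - interval) {
            return false;
        }
        doc += interval;
        return true;
    }

    int doc() {
        return doc;
    }
}

/** Hypothetical disjunction: a doc matches if at least one subscorer matches it. */
final class IntervalDisjunction {
    /** Counts the distinct docs in [1, maxDoc] matched by any of the intervals. */
    static int countMatches(int[] intervals, int maxDoc) {
        // Subscorers ordered by current doc, smallest first.
        PriorityQueue<IntervalScorer> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int interval : intervals) {
            IntervalScorer scorer = new IntervalScorer(interval, maxDoc);
            if (scorer.next()) {
                queue.add(scorer);
            }
        }
        int matches = 0;
        while (!queue.isEmpty()) {
            int current = queue.peek().doc();
            matches++;
            // Advance every subscorer positioned on the current doc.
            while (!queue.isEmpty() && queue.peek().doc() == current) {
                IntervalScorer top = queue.poll();
                if (top.next()) {
                    queue.add(top);
                }
            }
        }
        return matches;
    }
}
```

For example, IntervalDisjunction.countMatches(new int[] {2, 3}, 10)
visits docs 2, 3, 4, 6, 8, 9 and 10 and returns 7.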

Could someone indicate a few typical cases to use for selecting
the best disjunction scorer?

Paul Elschot

I also tried getting this to work under gcj, but I'm having problems
with class loading from shared libraries. I got gcj/gij to work for
another project, so I'm trying to find the difference in the build files
that causes this. Is there perhaps someone else who has gcj/gij
working on the Lucene test cases?
