lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: ConjunctionScorer.doNext() overstays?
Date Thu, 01 Mar 2012 14:18:27 GMT
On Thu, Mar 1, 2012 at 8:49 AM, mark harwood <> wrote:
> I would have assumed the many int comparisons would cost less than the superfluous disk accesses? (I bow to your considerable experience in this area!)
> What is the worst-case scenario on added disk reads? Could it be as bad as numberOfSegments x numberOfOtherscorers before the query winds up?

Well, it depends -- the disk access is a one-time thing but the added
per-hit check is per-hit.  At some point it'll cross over...

I think the advance(NO_MORE_DOCS) will usually not hit disk: I believe
our skipper impl fully pre-buffers (in RAM) the top skip lists.  Even
if we do go to disk, it's likely the OS has already cached those bytes
in its IO buffer.
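
To make the trade-off concrete, here is a minimal sketch of the two loop shapes being compared. It is not Lucene's actual code: `ArrayIterator`, the `earlyExit` flag, and `run` are hypothetical names invented for illustration, and the iterators are plain in-memory arrays, so the sketch only counts `advance()` calls and says nothing about whether a real call would touch disk. The loop is modeled on a round-robin conjunction that assumes the iterators start sorted by their current doc ID.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a postings iterator: a sorted array of doc IDs. */
    static final class ArrayIterator {
        private final int[] docs;
        private int pos = 0;
        int advanceCalls = 0; // counts calls only; says nothing about disk access

        ArrayIterator(int... docs) { this.docs = docs; }

        int docID() { return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }

        /** Moves to the first doc >= target and returns it, or NO_MORE_DOCS. */
        int advance(int target) {
            advanceCalls++;
            while (pos < docs.length && docs[pos] < target) pos++;
            return docID();
        }
    }

    /**
     * Round-robin leapfrog. Without earlyExit, once one iterator is exhausted
     * every other iterator is still advanced to NO_MORE_DOCS before the loop ends.
     * With earlyExit, each advance pays one extra int comparison instead.
     */
    static int doNext(ArrayIterator[] its, boolean earlyExit) {
        int first = 0;
        int doc = its[its.length - 1].docID();
        ArrayIterator it;
        while ((it = its[first]).docID() < doc) {
            doc = it.advance(doc);
            if (earlyExit && doc == NO_MORE_DOCS) return NO_MORE_DOCS; // the per-advance check
            first = (first == its.length - 1) ? 0 : first + 1;
        }
        return doc;
    }

    /** Returns {matchCount, totalAdvanceCalls}; expects at least one postings list. */
    static int[] run(boolean earlyExit, int[]... postings) {
        ArrayIterator[] its = new ArrayIterator[postings.length];
        for (int i = 0; i < postings.length; i++) its[i] = new ArrayIterator(postings[i]);
        // The loop relies on the last iterator holding the largest current doc ID.
        Arrays.sort(its, Comparator.comparingInt(ArrayIterator::docID));
        ArrayIterator last = its[its.length - 1];
        int matches = 0;
        for (int doc = doNext(its, earlyExit); doc != NO_MORE_DOCS; doc = doNext(its, earlyExit)) {
            matches++;
            last.advance(doc + 1); // step past the match before searching again
        }
        int calls = 0;
        for (ArrayIterator it : its) calls += it.advanceCalls;
        return new int[] { matches, calls };
    }

    public static void main(String[] args) {
        int[] a = {1, 5, 9}, b = {5, 9, 20, 30}, c = {2, 5, 9, 40, 50};
        System.out.println(Arrays.toString(run(false, a, b, c)));
        System.out.println(Arrays.toString(run(true, a, b, c)));
    }
}
```

On these three toy lists both variants find the same 2 matches; the plain loop makes 9 `advance()` calls and the early-exit loop makes 7, i.e. it saves one call per remaining iterator, once, while adding one comparison to every advance. Which side wins on a real index depends on how expensive those final `advance(NO_MORE_DOCS)` calls are compared with the accumulated per-hit checks, and this sketch does not measure that.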

> On the index I tried, it looked like an improvement - the spreadsheet I linked to has the source for the benchmark on a second worksheet if you want to give it a whirl on a different

Maybe try it on a more balanced case?  I.e., N high-freq terms whose
freqs are "close-ish"?  And on slow queries (I think the results in your
spreadsheet are for very fast queries, right?  The slowest one was ~0.95
msec per query, if I'm reading it right).

In general I think not slowing down the worst-case queries is much
more important than speeding up the super-fast queries.

