lucene-java-user mailing list archives

From tsuraan <>
Subject Re: Batch searching
Date Wed, 22 Jul 2009 17:15:58 GMT
> It's not accurate to say that Lucene scans the index for each search.
> Rather, every Query reads a set of posting lists, each of which is
> typically read from disk. If you pass a Query[] whose queries have nothing
> in common (for example, no terms in common), then you won't gain anything,
> b/c each Query will already read just the posting lists it needs.

That sounds like a lot of disk seeking if the terms associated with
each query don't happen to fall in on-disk order.  My disks can sustain
100+ MB/s of sequential reads, but once they start seeking, that number
plummets.  Would it be possible to order the queries so that each one
reads its index information in order, to minimize thrashing?
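
To make the idea concrete, here is a rough sketch of what I mean by
ordering the queries.  It assumes (and this is only an assumption) that
posting lists sit on disk roughly in sorted term order, so sorting the
batch by each query's smallest term should make the reads move mostly
forward.  TermQuerySketch and orderBySmallestTerm are made-up names for
illustration, not Lucene API; a real version would have to pull the terms
out of actual Query objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class QueryOrdering {

    /** Hypothetical stand-in for a query: a label plus the terms it looks up. */
    public static final class TermQuerySketch {
        public final String label;
        public final List<String> terms; // assumed non-empty

        public TermQuerySketch(String label, List<String> terms) {
            this.label = label;
            this.terms = terms;
        }

        /** Lexicographically smallest term; used as the sort key. */
        public String smallestTerm() {
            return Collections.min(terms);
        }
    }

    /**
     * Returns a copy of the batch sorted by each query's smallest term.
     * Assumption: on-disk posting order roughly follows sorted term order.
     */
    public static List<TermQuerySketch> orderBySmallestTerm(List<TermQuerySketch> queries) {
        List<TermQuerySketch> sorted = new ArrayList<>(queries);
        sorted.sort(Comparator.comparing(TermQuerySketch::smallestTerm));
        return sorted;
    }
}
```

This only helps for the first term each query touches; a query with
several widely separated terms would still seek between them.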

> If your Query[] contains the exact same Query more than once, it's
> redundant to run all these searches, since they will return the same
> results every time.

I'm assuming that the queries being run are different.  Caching query
results would be pretty easy for us though, so if some queries turn out
to be identical, we could filter out the duplicates before searching.
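
The caching I have in mind is nothing fancy; roughly the sketch below.
ResultCache is a made-up class, and the search function passed to it is a
placeholder for whatever actually runs the query.  It assumes the query
type has sensible equals/hashCode, and that the cache gets thrown away
whenever the index changes (not shown here).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical memoizing wrapper: runs each distinct query only once. */
public class ResultCache<Q, R> {
    private final Map<Q, R> cache = new HashMap<>();
    private final Function<Q, R> search; // placeholder for the real search call
    private int searchesRun = 0;

    public ResultCache(Function<Q, R> search) {
        this.search = search;
    }

    /** Returns the cached result, running the search only on a miss. */
    public R get(Q query) {
        R result = cache.get(query);
        if (result == null) {
            // Miss (or a previous null result, which is never cached).
            result = search.apply(query);
            searchesRun++;
            if (result != null) {
                cache.put(query, result);
            }
        }
        return result;
    }

    /** Number of times the underlying search was actually invoked. */
    public int searchesRun() {
        return searchesRun;
    }
}
```

The map is unbounded, so a long-running process would want some eviction
policy on top of this.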
