lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Search performance using BooleanQueries in BooleanQueries
Date Tue, 06 Nov 2007 23:02:06 GMT
On Tuesday 06 November 2007 23:14:01 Mike Klaas wrote:
> On 29-Oct-07, at 9:43 AM, Paul Elschot wrote:
> > On Friday 26 October 2007 09:36:58 Ard Schrijvers wrote:
> >> +prop1:a +prop2:b +prop3:c +prop4:d +prop5:e
> >>
> >> is much faster than
> >>
> >> (+(+(+(+prop1:a +prop2:b) +prop3:c) +prop4:d) +prop5:e)
> >>
> >> where the second one is a result from BooleanQuery in
> >> BooleanQuery, and
> >> all have Occur.MUST.
> >
> > SImplifying boolean queries like this is not available in Lucene,
> > but it
> > would have a positive effect on search performance, especially when
> > prop1:a and prop2:b have a high document frequency.
>
> Wait--shouldn't the outer-most BooleanQuery provide most of this
> speedup already (since it should be skipTo'ing between the nested
> BooleanQueries and the outermost).  Is it the indirection and sub-
> query management that is causing the performance difference, or
> differences in skiptTo behaviour?

The usual Lucene answer to performance questions: it depends.

After every hit, next() needs to be called on a subquery before
skipTo() can be used to find the next hit. It is currently not defined which 
subquery will be used for this first next().

The structure of the scorers normally follows the structure of
the BooleanQueries, so the indirection over the deep subquery
scores could well  be relevant to performance, too.

Which of these factors actually dominates performance is hard
to predict in advance. The point of skipTo() is that is tries to avoid
disk I/O as much as possible for the first time that the query is
executed. Later executions are much more likely to hit the OS cache,
and then the indirections will be more relevant to performance.

I'd like to have a good way to do a performance test on a first
query execution, in the sense that it does not hit the OS cache
for its skipTo() executions, but I have not found a good way yet.

Regards,
Paul Elschot

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message