lucene-java-user mailing list archives

From "Ryan, Michael F. (LNG-DAY)" <michael.r...@lexisnexis.com>
Subject MemoryIndex slow for BooleanQuery with non-required clause
Date Mon, 23 Feb 2015 17:48:33 GMT
(I'm using Lucene 4.9.0)

I've been doing some perf testing of MemoryIndex, and have found that it is much slower when
a BooleanQuery contains a non-required clause, compared to when it just contains required
clauses.

Most of the time is spent in BooleanScorer, which as far as I can tell is an optimization
for scoring lots of documents, so it would make sense that it's not so good when scoring just
a single document.
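
As a rough illustration of why a scorer tuned for many documents can lose on a single document, here is a toy model in Java. It is not Lucene code: the class name, method names, and the window size of 2048 are all assumptions made for this sketch. The windowed variant pays a fixed cost per call (allocating and sweeping a bucket table) no matter how few documents exist, while the doc-at-a-time variant only does work proportional to the number of clauses.

```java
/** Toy model only; none of these names come from Lucene. */
final class SingleDocScoringSketch {
    // Hypothetical window size, chosen only to make the fixed cost visible.
    static final int WINDOW_SIZE = 2048;

    /** Windowed scoring: fill a bucket table, then sweep it for hits. */
    static float scoreBucketed(int docId, float[] clauseScores) {
        float[] buckets = new float[WINDOW_SIZE];   // allocated on every call
        boolean[] hit = new boolean[WINDOW_SIZE];
        int slot = docId & (WINDOW_SIZE - 1);       // WINDOW_SIZE is a power of two
        for (float s : clauseScores) {
            buckets[slot] += s;
            hit[slot] = true;
        }
        float total = 0f;
        for (int i = 0; i < WINDOW_SIZE; i++) {     // sweep touches every slot
            if (hit[i]) {
                total += buckets[i];
            }
        }
        return total;
    }

    /** Doc-at-a-time scoring: sum the clause scores for the one document. */
    static float scoreDocAtATime(float[] clauseScores) {
        float total = 0f;
        for (float s : clauseScores) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        float[] scores = {1.5f, 0.25f};
        int runs = 200_000;
        float sink = 0f;                            // keeps the loops from being optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += scoreBucketed(0, scores);
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += scoreDocAtATime(scores);
        }
        long t2 = System.nanoTime();

        System.out.println("bucketed ns/run:      " + (t1 - t0) / runs);
        System.out.println("doc-at-a-time ns/run: " + (t2 - t1) / runs);
        System.out.println("(sink: " + sink + ")");
    }
}
```

Both methods return the same score; only the amount of work differs. The timings from main are a crude comparison without JIT warm-up, so they show the direction of the effect, not its size in Lucene.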

I found that I'm able to greatly increase performance (non-required clause speed on par with
required clause speed) by changing the acceptsDocsOutOfOrder() method in MemoryIndex's collector
to return false instead of true, which causes BooleanScorer to not be used.

I also tried Lucene 5.0.0 and found that it is much faster. I think this is partly because
BooleanScorer is no longer used if optional.size() == 0, which happens if there are no document
hits. This was changed here: http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_5_0_0/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java?r1=1651551&r2=1652034
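
The conditions described above can be summarized as a small decision function. This is a simplified model of what this message describes, not a copy of BooleanQuery's logic: the class, method, and parameter names are hypothetical, and the real code checks further conditions that are left out here.

```java
/** Simplified model of the scorer choice described in this message; not Lucene code. */
final class ScorerChoiceSketch {

    /**
     * @param acceptsDocsOutOfOrder what the collector reports
     * @param nonRequiredClauses    number of non-required clauses in the query
     * @param optionalScorers       non-required clauses that produced a scorer (i.e. have hits)
     * @param skipWhenNoOptional    true to model the 5.0.0 check on optional.size() == 0
     * @return true if the bucket-based BooleanScorer path would be taken in this model
     */
    static boolean usesBooleanScorer(boolean acceptsDocsOutOfOrder,
                                     int nonRequiredClauses,
                                     int optionalScorers,
                                     boolean skipWhenNoOptional) {
        if (!acceptsDocsOutOfOrder) {
            return false;   // collector demands in-order docs
        }
        if (nonRequiredClauses == 0) {
            return false;   // required-only queries did not show the slowdown
        }
        if (skipWhenNoOptional && optionalScorers == 0) {
            return false;   // 5.0.0 behavior: no optional scorers, so skip BooleanScorer
        }
        return true;
    }

    public static void main(String[] args) {
        // 4.9.0-style, stock collector, one non-required clause with no hits
        System.out.println(usesBooleanScorer(true, 1, 0, false));
        // same query with the collector changed to require in-order docs
        System.out.println(usesBooleanScorer(false, 1, 0, false));
        // same query with the 5.0.0-style check
        System.out.println(usesBooleanScorer(true, 1, 0, true));
    }
}
```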

I guess I don't really have a question. Just want to make other people aware of what I found.
Maybe there are other optimizations that can be made to avoid using BooleanScorer.

-Michael
