lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Performance of IndexSearcher.explain(Query)
Date Tue, 20 Nov 2012 23:18:44 GMT
I wanted to implement a feature which required a quick way to
check whether an individual document matched a query.

IndexSearcher.explain seemed to be a good fit for this.

The query I tested was just a BooleanQuery containing two TermQuery
clauses, both with MUST. I ran an empty query to match all documents
and then ran the new code against each document. Of 40,743 documents,
1,072 matched the query.

I got times of around 15.5s doing this. After noticing that
ConstantScoreQuery now works with Query in addition to Filter, I
wrapped the query in one as well, which reduced the time to 13.6s.

There is a comment like this on the explain method, though:

    "Computing an explanation is as expensive as executing
     the query over the entire index."

So I wanted to test this. To do so, I made a collector which did
nothing but look for the single document being checked.

Times for searching the whole index using this collector came to
around 30.9s, which is roughly twice as slow as using explain (15.5s,
or 13.6s with ConstantScoreQuery). Times didn't vary at all if I used
ConstantScoreQuery here, which I assume is because the custom
collector ignores the scorer.
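
To make the two approaches concrete, here is a minimal sketch in plain
Java. It is a toy model, not Lucene code: the class and method names
are hypothetical, postings lists are just sorted int arrays, and it
only assumes (without confirmation from the Lucene source) that a
per-document check can jump straight to the target document while a
whole-index search with a collector has to visit every hit.

```java
import java.util.Arrays;

/**
 * Toy model (hypothetical, not Lucene API) of a two-clause MUST query.
 * Each int[] is a sorted list of IDs of documents containing one term.
 */
final class MatchCheckSketch {

    /**
     * Per-document check: look the target up in each list directly.
     * Binary search stands in for "advancing" an iterator to the target,
     * so the cost grows with log(list length), not with the hit count.
     */
    static boolean matchesByAdvance(int[] termA, int[] termB, int targetDoc) {
        return Arrays.binarySearch(termA, targetDoc) >= 0
            && Arrays.binarySearch(termB, targetDoc) >= 0;
    }

    /**
     * Whole-index search: intersect both lists from start to end and let a
     * trivial "collector" remember whether the target was among the hits.
     * Every hit is visited even though only one document is of interest.
     */
    static boolean matchesByFullScan(int[] termA, int[] termB, int targetDoc) {
        boolean seen = false;
        int i = 0;
        int j = 0;
        while (i < termA.length && j < termB.length) {
            if (termA[i] < termB[j]) {
                i++;
            } else if (termA[i] > termB[j]) {
                j++;
            } else {
                // Both terms occur in this document: it is a hit.
                if (termA[i] == targetDoc) {
                    seen = true;
                }
                i++;
                j++;
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] termA = {1, 3, 5, 7, 9};
        int[] termB = {3, 4, 7, 8};
        System.out.println(matchesByAdvance(termA, termB, 7));
        System.out.println(matchesByFullScan(termA, termB, 7));
    }
}
```

Both methods give the same answer for any target document; they differ
only in how much of each list they touch. Whether this is the reason
for the timings above is a guess on my part, not something I measured.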

So I was wondering, is this comment just out of date? It seems that by
using explain(), I get the same information I get by querying the
whole index, *plus* information about the score which the custom
collector wasn't recording, all in about half the time it took to
query the whole index.

TX
