lucene-dev mailing list archives

From Frederik Kraus <frederik.kr...@gmail.com>
Subject FilterQuery Performance Optimizations
Date Fri, 25 Feb 2011 20:53:16 GMT
Hi guys,

While testing the performance of complex filter queries on a rather large index, I ran into
a few points I'd like to share and put up for discussion:

Let's say we have the following two filter queries:

fq=someField:(123 OR 234 OR 235)
fq=someField:(234 OR 123 OR 235)

Currently the filterCache treats these as two distinct queries, even though they are
logically identical.

Wouldn't it make more sense to sort the clauses of this kind of OR query internally, reducing
the number of distinct cache entries and increasing the cache hit rate at the same time?
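
To make the idea concrete, here is a minimal sketch of such a normalization, done on the
query string before it is used as a cache key. It is not how Solr builds its filterCache
keys; canonical_fq and the regular expression are hypothetical, and the helper deliberately
handles only the flat form field:(t1 OP t2 ...) with one operator, leaving anything else
untouched.

```python
import re

# Hypothetical pattern: matches only the flat form field:(t1 OP t2 OP ...).
_SIMPLE_FQ = re.compile(r"^(\w+):\((.+)\)$")


def canonical_fq(fq):
    """Return an order-independent form of a flat OR/AND filter query.

    Anything that is not a flat list of terms joined by a single
    operator (nested groups, phrases, mixed operators) is returned
    unchanged, since reordering it might change its meaning.
    """
    m = _SIMPLE_FQ.match(fq.strip())
    if m is None:
        return fq
    field, body = m.groups()
    # Nested groups or quoted phrases: too complex for this sketch.
    if any(ch in body for ch in '()"'):
        return fq
    tokens = body.split()
    terms, ops = tokens[0::2], set(tokens[1::2])
    # Require term/operator alternation and exactly one operator kind.
    if len(tokens) % 2 == 0 or ops not in ({"OR"}, {"AND"}):
        return fq
    sep = " " + ops.pop() + " "
    # Sorting (and de-duplicating) is safe because OR and AND are
    # commutative and a filter query does not contribute to scoring.
    return field + ":(" + sep.join(sorted(set(terms))) + ")"


print(canonical_fq("someField:(234 OR 123 OR 235)"))
# prints: someField:(123 OR 234 OR 235)
```

With a key like this, both filter queries above would map to the same cache entry.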

This also applies to the "AND" case (on a multivalued field), although there you can work
around the issue by splitting:

fq=someField:(234 AND 123 AND 235)

into:

fq=someField:234&fq=someField:123&fq=someField:235
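
On the client side, that splitting could be sketched as follows. The helper name
split_and_fq and the q=*:* main query are assumptions made for illustration; only the
repeated fq parameter comes from the example above.

```python
from urllib.parse import urlencode


def split_and_fq(field, terms):
    """Turn an AND over several terms into one single-term fq each.

    Each resulting filter can be cached on its own, and the order of
    the terms no longer matters because chained fq parameters are
    intersected.
    """
    return [field + ":" + term for term in terms]


# Build the request parameters; q=*:* is just a placeholder main query.
filters = split_and_fq("someField", ["234", "123", "235"])
params = [("q", "*:*")] + [("fq", fq) for fq in filters]
query_string = urlencode(params)
print(query_string)
```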

Going one step further: might it not make sense to split OR queries into individual filter
queries as well (much like the AND case, but internally), and then combine the cached results
as a union instead of the intersection that standard fq chaining produces?
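
The following toy model sketches what that could look like. It uses plain Python sets of
document ids in place of Solr's DocSets, and TermFilterCache, any_of, all_of and the tiny
index are all hypothetical names invented for this illustration, not Solr APIs.

```python
class TermFilterCache:
    """Caches one doc-id set per (field, term) and combines them on demand."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable (field, term) -> iterable of doc ids
        self._cache = {}
        self.misses = 0        # counts how often the index had to be consulted

    def docs(self, field, term):
        key = (field, term)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = frozenset(self._lookup(field, term))
        return self._cache[key]

    def any_of(self, field, terms):
        """OR case: union of the cached per-term sets."""
        result = set()
        for term in terms:
            result |= self.docs(field, term)
        return result

    def all_of(self, field, terms):
        """AND case: intersection, as with standard fq chaining."""
        sets = [self.docs(field, term) for term in terms]
        return set.intersection(*map(set, sets)) if sets else set()


# Made-up postings for a multivalued field.
INDEX = {"123": {1, 2}, "234": {2, 3}, "235": {5}}
cache = TermFilterCache(lambda field, term: INDEX.get(term, set()))

first = cache.any_of("someField", ["123", "234", "235"])
second = cache.any_of("someField", ["234", "123", "235"])  # different order
print(sorted(first), first == second, cache.misses)
# prints: [1, 2, 3, 5] True 3
```

In this model the clause order stops mattering and every term is looked up only once. The
trade-off is that the union itself is recomputed on each request rather than being served
as a single cached entry.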


Fred.
