lucene-java-user mailing list archives

From luc...@triplehelix.org
Subject 2.4 Performance
Date Wed, 19 Nov 2008 02:39:01 GMT
On an index of around 20 gigs I've been seeing a performance drop of
around 35% after upgrading to 2.4 (measured on ~10000 identical
requests, executed in parallel against a threaded Lucene / Apache
setup, after a roughly 10000-query warmup). The principal
changes I've made so far are just to switch to NIOFSDirectories and
use read-only index readers.

Our design is roughly as follows: we have some pre-query filters,
queries typically involving around 25 clauses, and some
post-processing of hits. We collect counts and filter post query using
a hit collector, which uses the (now deprecated) bits() method of
Filters.
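
For readers following along, the post-query filtering and counting described above can be sketched roughly as below. This is only an illustration under assumptions: FilteringCountCollector is a hypothetical class that mimics the collect(doc, score) callback shape of a hit collector, and the java.util.BitSet stands in for whatever a filter's bits() call returns. It is not our actual code and does not depend on Lucene.

```java
import java.util.BitSet;

// Hypothetical collector: mirrors the collect(doc, score) callback shape of a
// hit collector, but is a standalone class and not part of Lucene.
final class FilteringCountCollector {
    private final BitSet allowed; // stand-in for the bits produced by a filter
    private int accepted;
    private int rejected;

    FilteringCountCollector(BitSet allowed) {
        this.allowed = allowed;
    }

    // Called once per matching document; counts it on the side of the
    // filter it falls on.
    void collect(int doc, float score) {
        if (allowed.get(doc)) {
            accepted++;
        } else {
            rejected++;
        }
    }

    int accepted() { return accepted; }

    int rejected() { return rejected; }
}
```

The per-hit cost here is a single random-access bit lookup, which is what makes a bit-set-backed filter convenient inside a collector.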

I looked at converting us to use the new DocIdSet infrastructure (to
gain the supposed 30% speed bump), but this seems to be somewhat
problematic, as there is no guarantee that we will get back a
set we can do binary operations on. For example, if we get back a
SortedVIntList, we're pretty much out of luck - the cardinality of the set
is large (as it's a SortedVIntList), so we can't coerce it into
another type, and it doesn't have the set operations we need to use it
directly.
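
To make the problem concrete, here is a minimal sketch of the two options available when all you have is a forward-only iterator over doc ids. Everything in it is an assumption for illustration: SimpleDocIdIterator is a hypothetical interface loosely modeled on next()/doc() style iteration, SortedArrayIterator stands in for a compact sorted list, and java.util.BitSet stands in for a bit-set type with set operations. None of these are Lucene classes.

```java
import java.util.BitSet;

// Hypothetical forward-only iterator over doc ids (not Lucene's class).
interface SimpleDocIdIterator {
    boolean next(); // advance; returns false once exhausted
    int doc();      // current doc id, valid only after next() returned true
}

// Hypothetical iterator over a sorted int array, standing in for a
// compact sorted list of doc ids.
final class SortedArrayIterator implements SimpleDocIdIterator {
    private final int[] docs;
    private int pos = -1;

    SortedArrayIterator(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    public boolean next() { return ++pos < docs.length; }

    public int doc() { return docs[pos]; }
}

final class DocIdSetSketch {
    private DocIdSetSketch() {}

    // Option 1: materialize the iterator into a BitSet so that and/or/andNot
    // become available. Costs one full pass over the set.
    static BitSet toBitSet(SimpleDocIdIterator it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        while (it.next()) {
            bits.set(it.doc());
        }
        return bits;
    }

    // Option 2: skip materializing and probe an existing BitSet while
    // iterating; enough for intersection counts, not for general set algebra.
    static int countIntersection(SimpleDocIdIterator it, BitSet other) {
        int count = 0;
        while (it.next()) {
            if (other.get(it.doc())) {
                count++;
            }
        }
        return count;
    }
}
```

Option 1 is the coercion that worries me: the pass is linear in the cardinality of the set and has to be repeated for every set that comes back in the wrong form. Option 2 avoids the copy but only covers operations that can be phrased as a single forward scan.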

Has anyone else seen this? Is there anything else
we should be changing in the upgrade to 2.4?

Thanks,

-Matt


