lucene-java-user mailing list archives

From "Nader, John P" <john.na...@cengage.com>
Subject Term browsing much slower in Lucene 3.x.x
Date Wed, 28 Jul 2010 18:39:44 GMT
We recently upgraded from Lucene 2.4.0 to Lucene 3.0.2.  Our load testing revealed a serious
performance drop specific to traversing the list of terms and their associated documents for
a given indexed field.  Our code looks something like this:

for (Term term : terms) {
    TermDocs termDocs = indexReader.termDocs(term);
    while (termDocs.next()) {   // much slower here
        int doc = termDocs.doc();
        // ...do something with each doc...
    }
}


The slowness is all in the first call to TermDocs.next() for each term.  Comparing 2.4.0
and 3.0.2 showed that there is new synchronization in the SegmentTermDocs constructor and in
SegmentReader.getTermsReader().  The first call to next() hits this synchronization,
causing a 4x slowdown on an 8-CPU machine.

My first question: should we be using a different, more efficient approach to process each
term's doc list?  The synchronization appears to guard aspects of these classes that the
next() operation is not concerned with.
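
One candidate alternative is to ask the reader once for an unpositioned TermDocs (the
no-argument IndexReader.termDocs()) and reposition it with TermDocs.seek(Term) for each
term, rather than requesting a new TermDocs per term.  The sketch below only illustrates
that calling pattern.  DocEnum and FakeReader are hypothetical stand-ins backed by an
in-memory map, not Lucene classes, and terms are plain strings.  Whether the pattern
actually avoids the synchronization in 3.0.2 is an assumption that would need profiling,
since seek() may still go through SegmentReader.getTermsReader().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermDocsReuseSketch {

    /** Hypothetical stand-in for the subset of TermDocs used here (not the Lucene class). */
    public interface DocEnum {
        void seek(String term);   // reposition on another term's doc list
        boolean next();           // advance; false when the list is exhausted
        int doc();                // current document number
    }

    /** Hypothetical stand-in reader that counts how many enumerators it hands out. */
    public static class FakeReader {
        private final Map<String, int[]> postings;
        public int enumsCreated = 0;

        public FakeReader(Map<String, int[]> postings) {
            this.postings = postings;
        }

        /** Mirrors the shape of IndexReader.termDocs(): an unpositioned enumerator. */
        public DocEnum termDocs() {
            enumsCreated++;
            return new DocEnum() {
                private int[] docs = new int[0];
                private int pos = -1;

                public void seek(String term) {
                    int[] found = postings.get(term);
                    docs = (found == null) ? new int[0] : found;
                    pos = -1;
                }

                public boolean next() {
                    return ++pos < docs.length;
                }

                public int doc() {
                    return docs[pos];
                }
            };
        }

        /** Mirrors the shape of IndexReader.termDocs(Term): a new enumerator per call. */
        public DocEnum termDocs(String term) {
            DocEnum e = termDocs();
            e.seek(term);
            return e;
        }
    }

    /** Pattern from the original loop: one enumerator per term. */
    public static List<Integer> perTerm(FakeReader reader, List<String> terms) {
        List<Integer> out = new ArrayList<Integer>();
        for (String term : terms) {
            DocEnum termDocs = reader.termDocs(term);
            while (termDocs.next()) {
                out.add(termDocs.doc());
            }
        }
        return out;
    }

    /** Candidate pattern: a single enumerator, repositioned with seek() per term. */
    public static List<Integer> reused(FakeReader reader, List<String> terms) {
        List<Integer> out = new ArrayList<Integer>();
        DocEnum termDocs = reader.termDocs();
        for (String term : terms) {
            termDocs.seek(term);
            while (termDocs.next()) {
                out.add(termDocs.doc());
            }
        }
        return out;
    }

    /** Made-up sample data for illustration only. */
    public static Map<String, int[]> samplePostings() {
        Map<String, int[]> m = new LinkedHashMap<String, int[]>();
        m.put("apple", new int[] {1, 4});
        m.put("banana", new int[] {2});
        return m;
    }
}
```

Both methods visit the same documents in the same order; the difference is that reused()
asks the reader for one enumerator in total, while perTerm() asks for one per term.  In
real code the single TermDocs would also need to be closed when the loop finishes.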

My other question is whether any performance enhancements are planned to address this
regression.

Thanks.

John


