lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shailendra Mudgal" <mudgal.shailen...@gmail.com>
Subject Improving Index Search Performance
Date Tue, 25 Mar 2008 12:43:18 GMT
Hi Everyone,

We are using Lucene to search on a index of around 20G size with around 3
million documents. We are facing performance issues loading large results
from the index. Based on the various posts on the forum and documentation,
we have made the following code changes to improve the performance:

i. Modified the code to use HitCollector instead of Hits since we will be
loading all the documents in the index based on keyword matching
ii. Added MapFieldSelector to load only selected fields(2 fields only)
instead of all the 14

After all these changes, it seems to be  taking around 90 secs to load 17k
documents. After profiling, we found that the max time is spent in *
searcher.doc(id,selector).

*Here is the code:

*                public void collect(int id, float score) {
                    try {
                        MapFieldSelector selector = new MapFieldSelector(new
String[] {COMPANY_ID, ID});
                        doc = searcher.doc(id, selector);
                        mappedCompanies = doc.getValues(COMPANY_ID);
                    } catch (IOException e) {
                        logger.debug("inside IDCollector.collect()
:"+e.getMessage());
                    }
                }*

*
*We also read in one of the posts that we should use bitSet.set(doc)
instead of calling searcher.doc(id). But we are unable to to understand how
this might help in our case since we will anyway have to load the document
to get the other required field(company_id). Also we observed that the
searcher is actually using only 1G RAM though we have 4G allocated to it.

Can someone suggest if there is any other optimization that can done to
improve the search performance on MultiSearcher. Any help would be
appreciated.

Thanks,
Vipin

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message