lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ian Lea" <>
Subject Re: Improving Index Search Performance
Date Wed, 26 Mar 2008 10:45:41 GMT

The bottom line is that reading fields from docs is expensive.
FieldCache will, I believe, load fields for all documents but only
once - so the second and subsequent times it will be fast.  Even
without using a cache it is likely that things will speed up because
of caching by the OS.

If you've got plenty of memory vs index size you could look at
RAMDirectory or MMapDirectory.  Or how about some solid state disks?
Someone recently posted some very impressive performance stats.

Another approach we've used is to implement our own simple in-memory
cache of field values.  Read values from cache if present otherwise
read from lucene and cache. This works for us on an index of 3.5
million+ docs.  It helps a lot that we only update the index once a
day so only open new readers once a day and the cache has plenty of
time to fill up.


On Wed, Mar 26, 2008 at 9:12 AM, Shailendra Mudgal
<> wrote:
> Hi All,
>  Thanks for your reply. I would like to mention here is that the companyId is
>  a multivalued field. I tried paul's suggestions also but doesn't seem much
>  gain. Still the searcher.doc() method is taking almost the same amount of
>  time.
>  > you can use the FieldCache to lookup the compnayId for each doc.  on the
>  > aggregate this will most likely be much faster then accessing the stored
>  > fields.
>  >
>  As i understand the FieldCache, it will load fields for all the documents.
>  But in our case we want to load fields only for the matched documents.
>  Here is the code snippet after using the BitSet:
>                *public Map getIds() {
>                     MapFieldSelector selector = new MapFieldSelector(new
>  String[] {COMPANY_ID, ID});
>                     for(int i=bitSet.nextSetBit(0); i>=0; i=
>  bitSet.nextSetBit(i+1)) {
>                         try {
>                             doc = searcher.doc(i, selector);
>                             mappedCompanies = doc.getValues(COMPANY_ID);
>                         } catch (CorruptIndexException e) {
>                         } catch (IOException e) {
>                         }
>                     }
>                     return results;
>                 }
>  *
>  Any suggestions for further optimizing the code.
>  Thanks and Regards,
>  Vipin

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message