lucene-dev mailing list archives

From Bruce Ritchie <br...@jivesoftware.com>
Subject Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)
Date Mon, 15 Sep 2003 17:36:08 GMT
Doug Cutting wrote:
> Why do you open multiple IndexReaders against a single index? Ordinarily 
> I would only expect an application to open a new index reader when the 
> index has changed, or in order to do deletions.  In both of these cases, 
> the cache would work correctly.

The original reason I'm using multiple readers is that I was hitting a
synchronization issue: hits.doc(i) was blocking across multiple threads on
a busy customer site, causing searches to become slower and slower as more
searches were attempted simultaneously. I believe the root cause was that
SegmentReader.document(i) was synchronized (I could be wrong, it's been a
while). However, I didn't have time to look into the core code of Lucene
when opening multiple readers was such a simple solution and proved to
solve the issue. Of course, now that I've got a (bit) more time it might be
worthwhile to investigate alternatives :)

> Note that an open index reader uses much more memory than another bit 
> vector in a cache will.  It caches a byte per document for each field 
> you've searched, plus 1/128th of all the terms in the index.  So, e.g., 
> the cached bit vectors could become dominant if you use more than eight 
> caches and only search a single field.
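
The arithmetic behind the "more than eight caches" figure can be sketched as
follows. This is a rough illustration only: the class and method names are
hypothetical, the document count is made up, and the 1/128th-of-terms
overhead is left out because it depends on the index.

```java
// Rough comparison of reader norm memory vs. cached filter bit vectors.
// All names here are hypothetical; this is not Lucene code.
final class CacheMemoryEstimate {

    // One byte per document for each field that has been searched.
    static long normBytes(long numDocs, int searchedFields) {
        return numDocs * searchedFields;
    }

    // One bit per document for each cached filter, rounded up to whole bytes.
    static long bitVectorBytes(long numDocs, int cachedFilters) {
        return ((numDocs + 7) / 8) * cachedFilters;
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000L; // assumed index size, for illustration
        System.out.println("norms, 1 field:   " + normBytes(numDocs, 1));
        System.out.println("8 cached filters: " + bitVectorBytes(numDocs, 8));
        System.out.println("9 cached filters: " + bitVectorBytes(numDocs, 9));
    }
}
```

With a single searched field, eight bit vectors cost about as much as the
norms, so the cached filters only become the larger share beyond that.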

I knew that readers are relatively heavy; however, the real issue with
using multiple readers proved to be file descriptors, not memory usage (I'd
really love a performant solution to that issue). I've got the number of
readers set to a max of 3 by default, configurable if need be.
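
A minimal sketch of a small, fixed-size reader pool like the one described
above, assuming readers are handed out round-robin. The RoundRobinPool
class and the Supplier-based factory are hypothetical names for
illustration and are not the actual code discussed in this thread. The pool
is generic so the example runs without Lucene; in practice T would be an
IndexReader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical fixed-size pool that spreads callers across several readers
// so that threads do not all contend on one synchronized reader.
final class RoundRobinPool<T> {
    private final List<T> items;
    private final AtomicInteger next = new AtomicInteger();

    // size is the maximum number of pooled readers (3 in the message above).
    RoundRobinPool(int size, Supplier<T> factory) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<T> created = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            created.add(factory.get()); // each reader costs file descriptors
        }
        this.items = created;
    }

    // Returns the next reader in rotation; safe to call from many threads.
    T acquire() {
        int i = Math.floorMod(next.getAndIncrement(), items.size());
        return items.get(i);
    }

    int size() {
        return items.size();
    }
}
```

Every pooled reader holds its own open files, which is why the pool size is
kept small and configurable.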

In this case it's not the fact that the cache may hold 'duplicate' values
for the same filter that I'm concerned about, but rather that a cache miss
can be so painful (for example, the slowness of DateFilter over a large
index can add on the order of seconds to a search).
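
To illustrate the concern, here is a minimal sketch of a filter cache keyed
by reader, assuming the caching wrapper stores one bit set per reader in a
WeakHashMap. PerReaderBitsCache and its compute function are hypothetical
stand-ins, not the wrapper proposed in this thread. With N readers open on
the same index, the expensive computation runs N times, once per reader.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical per-reader cache: the filter bits are computed once for each
// distinct reader and dropped when that reader is garbage collected.
final class PerReaderBitsCache<R> {
    private final Map<R, BitSet> cache = new WeakHashMap<>();
    private final Function<R, BitSet> compute;
    private int misses;

    // compute stands in for the slow filter evaluation (e.g. a date range).
    PerReaderBitsCache(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    // Returns cached bits for this reader, computing them on a miss.
    synchronized BitSet bits(R reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            misses++;
            cached = compute.apply(reader);
            cache.put(reader, cached);
        }
        return cached;
    }

    // Number of times the slow computation had to run.
    synchronized int misses() {
        return misses;
    }
}
```

The duplicated bit sets are cheap; the repeated misses are where the cost
shows up when each miss takes seconds.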


Regards,

Bruce Ritchie
