lucene-java-user mailing list archives

From mark harwood <>
Subject Re: Lucene Queries Over User-Editable Dynamic Categories of Documents
Date Thu, 25 Oct 2007 10:00:04 GMT
There are 2 considerations when caching filter results:

1) What were the criteria used to produce the results?
2) What version of the index were these results taken from?

CachingWrapperFilter takes care of 2) by using WeakHashMap keyed on IndexReader.
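
To illustrate point 2), here is a minimal sketch of the per-reader caching idea, not the actual CachingWrapperFilter source. The class name PerReaderCache, the generic type R (standing in for IndexReader) and the computations counter are hypothetical and exist only for this example. It assumes the cached result is a java.util.BitSet, as returned by Filter.bits(IndexReader) in Lucene 2.x.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical sketch: caches one BitSet per reader instance. */
class PerReaderCache<R> {
    // Weak keys: once a reader is no longer referenced anywhere else,
    // its cache entry becomes eligible for garbage collection.
    private final Map<R, BitSet> cache = new WeakHashMap<>();
    private final Function<R, BitSet> compute;
    int computations = 0; // only here to make cache hits observable

    PerReaderCache(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns the cached bits for this reader, computing them on first use. */
    synchronized BitSet bits(R reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = compute.apply(reader);
            computations++;
            cache.put(reader, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        Object reader = new Object(); // stand-in for an open IndexReader
        PerReaderCache<Object> c = new PerReaderCache<>(r -> new BitSet());
        c.bits(reader);
        c.bits(reader); // second call is served from the cache
        System.out.println("computations: " + c.computations);
    }
}
```

One caveat with this approach: the cached value must not hold a strong reference back to the reader, or the weak key can never be collected.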

The filter you pass to CachingWrapperFilter must play its part in taking care of 1) by implementing

You can then maintain an LRU hashmap of CachingWrapperFilters to keep only the most popular
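
A minimal sketch of such an LRU map, using only the JDK. The class name LruFilterCache and the string keys in the demo are hypothetical. In a real application the value type would be the cached filter, and the key would be whatever identifies the filter criteria, which therefore needs usable equals() and hashCode() methods.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a size-bounded map that evicts the least recently used entry. */
class LruFilterCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruFilterCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true drops the eldest entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruFilterCache<String, String> cache = new LruFilterCache<>(2);
        cache.put("category:sports", "filter A");
        cache.put("category:news", "filter B");
        cache.get("category:sports");            // touch A so B becomes the eldest
        cache.put("category:weather", "filter C"); // evicts B
        System.out.println(cache.keySet());
    }
}
```

LinkedHashMap is not thread-safe, so a map shared between search threads would need to be wrapped, for example with Collections.synchronizedMap.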

Incidentally, contrib's XMLQueryParser handles all this for you with a simple "CachedFilter"

        <UserQuery>"Brittany Spears"</UserQuery>
            <RangeFilter fieldName="date" lowerTerm="19970409" upperTerm="19970412"/>


----- Original Message ----
From: lucene user <>
Sent: Thursday, 25 October, 2007 10:36:14 AM
Subject: Re: Lucene Queries Over User-Editable Dynamic Categories of Documents

What do you mean by 'Most caches are held in WeakHashMap...'? Is this
caching provided by CachingWrapperFilter, or do we have to implement it
ourselves? I assume the former.

We will share the results of our testing as soon as we have any - not sure
how generalizable they will be.

You have been super helpful! Very grateful! Thanks!

On 10/24/07, markharw00d <> wrote:
> lucene user wrote:
> > Thanks for all your help!
> >
> > We are using Lucene 2.1.0 and TermsFilter seems to be new in Lucene 2.2.0.
> > I have not been able to find SortedVIntList in the javadocs at all.
> >
> No, SortedVIntList is in the patch I provided a link to earlier.
> > Because both SortedVIntList and a regular BitSet are based on
> > document numbers, which are not permanent, it seems we will need to
> > generate these objects fresh at least once per session. Any
> > about that? Do I have that correct?
> >
> Yes. Most caches tend to be held in a WeakHashMap keyed on IndexReader so
> that when a new reader takes over, old caches are automatically garbage
> collected.
> > Our application includes the following filter implementation that we use
> > for a slightly different end user category problem. We could easily use
> > it for our current problem as well.
> >
> > Is TermsFilter sufficiently better (faster, more compact, more
> > etc.) to make upgrading
> > very important?
> >
> TermsFilter is in "contrib" and is stand-alone so should work with
> Lucene versions.
> Your implementation looks to scan the whole termEnum whereas TermsFilter
> looks up only the selected terms using reader.termDocs(term).
> Benchmarking will tell you which is faster. I'd be interested to know
> the results.
> Cheers
> Mark
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
