lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Lucene Queries Over User-Editable Dynamic Categories of Documents
Date Thu, 25 Oct 2007 10:00:04 GMT
There are 2 considerations when caching filter results:

1) What was the criteria used to produce the results?
2) What version of the index were these results taken from?

CachingWrapperFilter takes care of 2) by using WeakHashMap keyed on IndexReader.

The filter you pass to CachingWrapperFilter must play it's part in taking care of 1) by implementing
hashcode/equals.

You can then maintain an LRU hashmap of CachingWrapperFilters to keep only the most popular
filters.

Incidentally, contrib's XMLQueryParser handles all this for you with a simple "CachedFilter"
tag:

<FilteredQuery>
    <Query>
        <UserQuery>"Brittany Spears"</UserQuery>
    </Query>    
    <Filter>
        <CachedFilter>
            <RangeFilter fieldName="date" lowerTerm="19970409" upperTerm="19970412"/>
        </CachedFilter>
    </Filter>    
</FilteredQuery>


Cheers
Mark

----- Original Message ----
From: lucene user <luz290@gmail.com>
To: java-user@lucene.apache.org
Sent: Thursday, 25 October, 2007 10:36:14 AM
Subject: Re: Lucene Queries Over User-Editable Dynamic Categories of Documents

What do you means by 'Most caches are held in WeakHashMap...' is this
caching provided by CachingWrappingFilter or do we have to implement it
ourselves? I assume the former.

We will share results of our testing as soon as we have any - not sure
 how
generalizable they will be.

You have been super helpful! Very grateful! Thanks!

On 10/24/07, markharw00d <markharw00d@yahoo.co.uk> wrote:
>
> lucene user wrote:
> > Thanks for all your help!
> >
> > We are using Lucene 2.1.0 and TermsFilter seems to be new in Lucene
> 2.2.0.
> > I have not been able to find SortedVIntList in the javadocs at all.
> >
>
> No, SortedVIntList is in the patch I provided a link to earlier.
>
>
> > Because both SortedVIntList and a regular BitSet are based on
 Lucene
> > Document Numbers, which are not permanent, It seems we will need to
> > generate these objects fresh at least once per session. Any
 comments,
> > about that? Do I have that correct?
> >
> Yes. Most caches tend to be held in WeakHashMap keyed on IndexReader
 so
> that when a new reader takes over old caches are automatically
 garbage
> collected.
>
>
> > Our application includes the following filter implementation that
 we use
> for
> > a
> > slightly different end user category problem. We could easily use
 it for
> our
> > current problem as well.
> >
> > Is TermsFilter sufficiently better (faster, more compact, more
 correct,
> > etc.) to make upgrading
> > very important?
> >
> TermsFilter is in "contrib" and is stand-alone so should work with
 most
> Lucene versions.
> Your implementation looks to scan the whole termEnum whereas
 TermsFilter
> looks up only the selected terms using reader.termDocs(term).
> Benchmarking will tell you which is faster. I'd be interested to know
> the results.
>
> Cheers
> Mark
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>





      ___________________________________________________________ 
Want ideas for reducing your carbon footprint? Visit Yahoo! For Good  http://uk.promotions.yahoo.com/forgood/environment.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message