lucene-java-user mailing list archives

From Arjen van der Meijden
Subject Re: Does anyone have tips on managing cached filters?
Date Fri, 30 Nov 2012 07:00:34 GMT
We have something similar with documents that can be tagged (and have
many other relations). But as far as search is concerned, our approach
differs from yours in two ways:
- We do actually index the relation's id (i.e. the tag's id) as part of
the lucene-document and update the document if that relation between the
item and a tag is changed. So a filter on some 'tag' becomes a trivial
termsFilter.addTerm('tagId', '12345').
- We use Lucene only as a base of the results we're going to send back
to the user. I.e. we get results from Lucene and then do some more
processing on them.
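
To illustrate the first point, here is a minimal sketch in plain Java of
why indexing the tag id makes the filter trivial. It is a toy model and
not the Lucene API: TaggedIndex, updateDocument and tagFilter are
hypothetical names, and a BitSet per tag id stands in for the posting
list that a real index would hold for the 'tagId' field.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for an index that stores tag ids as terms on each document.
// NOT the Lucene API; all names here are hypothetical.
class TaggedIndex {
    // tag id -> bitset of document numbers that carry that tag
    private final Map<String, BitSet> postings = new HashMap<>();
    // document number -> tag ids currently indexed for that document
    private final Map<Integer, Set<String>> docTags = new HashMap<>();

    // Re-index a document with its current tags, dropping its old postings.
    void updateDocument(int docId, Set<String> tagIds) {
        Set<String> old = docTags.put(docId, new HashSet<>(tagIds));
        if (old != null) {
            for (String tag : old) {
                postings.get(tag).clear(docId);
            }
        }
        for (String tag : tagIds) {
            postings.computeIfAbsent(tag, t -> new BitSet()).set(docId);
        }
    }

    // A tag filter is simply a copy of the posting list for that tag id.
    BitSet tagFilter(String tagId) {
        BitSet bits = postings.get(tagId);
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }
}
```

Because the tag relation lives in the index itself, changing a tag means
re-indexing the affected document, and the filter never has to consult
anything outside the index.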

But that last distinction is actually because we started with an 
in-memory "database" application that did basically what Lucene already 
does, but just with more complicated objects and more complicated 
facet-extraction, more complicated filters, etc. So Lucene is only used 
when we need keyword-filtering and we help Lucene do that quickly by 
offering some Filters derived from the rest of the application's work.
And yes, if we were to redesign the application, it could become 
different :P

Best regards,


On 29-11-2012 6:57 Trejkaz wrote:
> On Wed, Nov 28, 2012 at 6:28 PM, Robert Muir wrote:
>> My point is really that lucene (especially clear in 4.0) assumes
>> indexreaders are immutable points in time. I don't think it makes sense for
>> us to provide any e.g. filtercaching or similar otherwise, because this is
>> a key simplification to the design. If you depart from this, by scoring or
>> filtering from mutable stuff outside the inverted index, things are likely
>> going to get complicated.
> Whereas it would be lovely to live in a land of rainbows and unicorns
> where all the data you ever want to use is in the text index and all
> filters can be written as a query, that simply isn't the case for us
> and I very much doubt we're the only ones in this situation.
> Sure, things are complicated. Anything except the most trivial forum
> search application is complicated.
> Well, the situation as it stands now is that when a filter is
> invalidated, it happens across all stores which are currently open.
> That means that results are at least correct, but after invalidating a
> filter, a little more work than necessary is required to populate the
> cache again. For certain filters (like word lists) this is necessary
> anyway, since adding a word might invalidate any store. For others
> like tags, I was hoping there would be some way to selectively
> invalidate only certain readers. But it seems like that isn't the
> case, so I will probably have to add a third level of caching to cache
> these sorts of filter per-store instead of globally.
> TX
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
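
The per-store caching mentioned at the end of the quoted message could
look roughly like the sketch below. It is plain Java and not a Lucene
class: PerStoreFilterCache and its methods are hypothetical, the type
parameter S stands in for whatever represents one open store (or
reader), and the String key identifying a filter is an assumption.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical cache of filter results, keyed first by store and then by filter.
class PerStoreFilterCache<S> {
    // Weak keys allow a store's entries to be dropped once the store itself
    // is no longer referenced anywhere else.
    private final Map<S, Map<String, BitSet>> cache = new WeakHashMap<>();

    // Return the cached bits for this store and filter, computing them on a miss.
    synchronized BitSet get(S store, String filterKey, Function<S, BitSet> compute) {
        return cache.computeIfAbsent(store, s -> new HashMap<>())
                    .computeIfAbsent(filterKey, k -> compute.apply(store));
    }

    // Invalidate one filter for one store only (e.g. a tag changed in that store).
    synchronized void invalidate(S store, String filterKey) {
        Map<String, BitSet> perStore = cache.get(store);
        if (perStore != null) {
            perStore.remove(filterKey);
        }
    }

    // Invalidate one filter in every store (e.g. a word list changed).
    synchronized void invalidateAll(String filterKey) {
        for (Map<String, BitSet> perStore : cache.values()) {
            perStore.remove(filterKey);
        }
    }
}
```

With this shape, a tag change only forces the affected store to
recompute its filter, while a word-list change can still clear the
filter everywhere.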
