lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject Does anyone have tips on managing cached filters?
Date Fri, 23 Nov 2012 04:10:24 GMT
I recently implemented the ability for multiple users to open the
index in the same process ("whoa", you might think, but this has been
a single user application forever and we're only just making the
platform capable of supporting more than that.)

I found that filters are being stored twice. Since it's basically the
same filter, and filters can be pretty large, I set out to try and do
something about that.

Problem is, I can't figure out when to invalidate the things.

With a single user it was easy. If the user tags an item, you
invalidate the TagFilter for that tag and of course the AnyTagFilter.
(Yes, we could instead update the filtered bitset immediately. That's
an improvement for another day.)

With multiple users I have two additional scenarios which I can't find
a single consistent solution to:

* Two users, each opening different indices. If the first user tags
something, it should only invalidate the filters for their readers and
not the other users'.

* Two users, opening the same index but one is looking at a newer
copy. So they might share some segments, but not all the segments. If
the first user tags something, it should invalidate all the filters
for that index, whether the first user has them open or not, otherwise
the second user will see out of date information.

The obvious trivial solutions satisfy exactly one of the above but not both:

1. when invalidating, walk the tree of index readers the user has open
and invalidate any filter cached for those readers. Suits the first
scenario but not the second.

2. just invalidate every doc ID set for every reader. Suits the second
scenario but not the first, but at least it is technically correct. It
won't give bad results, just bad performance. So it's the better of
the two at the moment, and probably still better than keeping the same
filter bit set in memory twice.
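
To make the trade-off concrete, here is a minimal sketch of the two
options over a cache keyed by reader. The class and method names are
hypothetical, the keys are opaque objects standing in for whatever
identifies a reader, and BitSet stands in for the cached doc ID set.
No Lucene types are used.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared per-reader filter cache.
final class ReaderKeyedFilterCache {
    // reader key -> cached bits for one filter
    private final Map<Object, BitSet> bitsByReader = new ConcurrentHashMap<>();

    void put(Object readerKey, BitSet bits) {
        bitsByReader.put(readerKey, bits);
    }

    BitSet get(Object readerKey) {
        return bitsByReader.get(readerKey);
    }

    // Option 1: drop only the entries for readers the tagging user has open.
    // Readers that only another user holds (a newer view of the same index)
    // keep their entries, so those can go stale.
    void invalidateReaders(Collection<?> openReaderKeys) {
        for (Object key : openReaderKeys) {
            bitsByReader.remove(key);
        }
    }

    // Option 2: drop everything. Never stale, but unrelated indices
    // lose their cached bits too.
    void invalidateAll() {
        bitsByReader.clear();
    }
}
```

With option 1, an entry cached for a segment that only the second
user's newer reader contains survives the first user's invalidation,
which is exactly the stale case in the second scenario.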

As for actually doing the invalidation, CachingWrapperFilter itself
doesn't appear to have any mechanism for invalidation at all, so I
imagine I will be building a variation of it with additional methods
to invalidate parts of the cache.
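
A rough sketch of the shape such a variation might take, with the
Lucene types left out so it stands alone. It assumes the application
can supply a stable identity per index (indexId below, for example
the index directory path), which is my assumption and not something
Lucene hands you. All names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical cache with explicit invalidation, grouped by index.
final class InvalidatableFilterCache<V> {
    // indexId -> (segmentKey -> cached doc ID set)
    // Note: these are strong references, so entries for closed readers
    // have to be evicted explicitly by the application.
    private final Map<Object, Map<Object, V>> byIndex = new ConcurrentHashMap<>();

    // Returns the cached value for this segment, computing it on a miss.
    V getOrCompute(Object indexId, Object segmentKey, Supplier<V> compute) {
        return byIndex
                .computeIfAbsent(indexId, id -> new ConcurrentHashMap<>())
                .computeIfAbsent(segmentKey, key -> compute.get());
    }

    // Drops every cached entry for one index, whichever reader cached it.
    void invalidateIndex(Object indexId) {
        byIndex.remove(indexId);
    }

    // Fallback: drop everything.
    void invalidateAll() {
        byIndex.clear();
    }
}
```

Grouping by index rather than by reader is the design choice here:
invalidateIndex leaves other indices alone (first scenario) and also
clears entries for segments that only a newer reader of the same
index has cached (second scenario).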

TX
