lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: Filter updating
Date Sat, 29 Jul 2006 01:00:14 GMT
Oh, yeah, you're hearing a doubtful opinion because if this kind of thing
isn't done exactly right, it'd be particularly hard to debug. Keeping
things coordinated is hard <G>...

Given that you add/remove docs, you really don't want to just modify the
filter. Here's why....

A filter is just a bitset with a little plumbing. The bitset has a bit turned
on for each doc that should be considered by the query. So, say you have 16
docs, and docs 0, 3, 8 and 13 matched the criteria. These are the internal
Lucene doc IDs BTW. You would have a two-byte value with bits 0, 3, 8 and 13
on. Now you go ahead and change the index: you remove document 6, add a new
document, and optimize (?). Docs 7 to 15 get renumbered as 6-14 and the new
doc is doc 15. The docs that matched are now 0, 3, 7 and 12, but your filter
still has bits 0, 3, 8 and 13 on. So your filter is no longer valid and your
searches will be incorrect.
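
To make the renumbering concrete, here is a rough sketch using nothing but
java.util.BitSet. It does not touch Lucene at all; the class name
StaleFilterDemo and the method renumberAfterDelete are made up for
illustration, and the "shift everything above the deleted doc down by one"
rule is just my simplified model of what happens, not Lucene's actual code.

```java
import java.util.BitSet;

class StaleFilterDemo {

    // Hypothetical helper: models the renumbering described above. Every doc ID
    // above the deleted one shifts down by one; the deleted doc's bit is dropped.
    public static BitSet renumberAfterDelete(BitSet matches, int deletedDoc, int numDocs) {
        BitSet shifted = new BitSet(numDocs);
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            if (doc < deletedDoc) {
                shifted.set(doc);        // IDs below the deleted doc are unchanged
            } else if (doc > deletedDoc) {
                shifted.set(doc - 1);    // IDs above it move down by one
            }
            // doc == deletedDoc: the document is gone, so no bit is carried over
        }
        return shifted;
    }

    public static void main(String[] args) {
        // The filter built against the old index: docs 0, 3, 8 and 13 matched.
        BitSet staleFilter = new BitSet(16);
        staleFilter.set(0);
        staleFilter.set(3);
        staleFilter.set(8);
        staleFilter.set(13);

        // What the filter would have to look like after doc 6 is removed.
        BitSet correctFilter = renumberAfterDelete(staleFilter, 6, 16);

        System.out.println("stale filter:   " + staleFilter);   // {0, 3, 8, 13}
        System.out.println("correct filter: " + correctFilter); // {0, 3, 7, 12}
    }
}
```

Bits 8 and 13 in the old filter now point at different documents, which is
exactly how you end up with wrong hits and no error to tell you about it.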

The above may not be exact, but that's the flavor of what happens as I
understand it. There is no magic connection between the filter and the
index; you have to maintain it yourself. And since any fancy dancing you did
to keep the two in sync would depend on the internals of Lucene, it looks
like the last thing you should try.

Filters are valid for the lifetime of the IndexReader. As long as you don't
close the reader, you can re-use a filter as you please. But the reader
won't see any updates to the index until you close and reopen it, so
that's not much comfort.
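
One way to live with that rule is to tie the cached bits to the reader
instance and rebuild them whenever you get a new reader. The sketch below
shows the idea in plain Java. ReaderScopedFilterCache is a hypothetical
class, the type parameter R stands in for whatever your reader object is,
and the builder function is whatever code you already use to compute the
bits. None of this is a Lucene API.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical helper: caches one BitSet per reader instance.
class ReaderScopedFilterCache<R> {

    // Weak keys, so the bits for a closed and discarded reader can be collected.
    private final Map<R, BitSet> bitsByReader = new WeakHashMap<>();
    private final Function<R, BitSet> builder;

    public ReaderScopedFilterCache(Function<R, BitSet> builder) {
        this.builder = builder;
    }

    // Returns the cached bits for this reader, computing them on first use.
    public synchronized BitSet bitsFor(R reader) {
        BitSet bits = bitsByReader.get(reader);
        if (bits == null) {
            bits = builder.apply(reader);
            bitsByReader.put(reader, bits);
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] builds = {0};
        ReaderScopedFilterCache<Object> cache = new ReaderScopedFilterCache<>(reader -> {
            builds[0]++;                 // count how often the bits are recomputed
            BitSet bits = new BitSet(16);
            bits.set(0);
            return bits;
        });

        Object readerA = new Object();   // stand-in for the currently open reader
        cache.bitsFor(readerA);
        cache.bitsFor(readerA);          // same reader: served from the cache

        Object readerB = new Object();   // stand-in for the reader after close/reopen
        cache.bitsFor(readerB);          // new reader: bits are rebuilt

        System.out.println("filter builds: " + builds[0]); // filter builds: 2
    }
}
```

This never patches an existing bitset. A new reader simply gets a freshly
built one, so the doc IDs in the filter always belong to the reader that is
using it.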

So, about messing around with the filter... Don't go there <G>....

