lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: MultiSegmentQueryFilter enhancement for interactive indexes?
Date Sat, 08 Jul 2006 03:01:39 GMT
Exactly. I have been watching to see how the new filter interface
works out for 2.0. I am still not certain why it is so involved.

I still think

interface Filter {
    boolean include(int doc);
    int nextInclude(int doc);
}

should suffice.
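
A minimal sketch of how the two-method interface above could be backed by a java.util.BitSet. BitSetFilter is a hypothetical name, and the contract assumed here for nextInclude (the first included document number at or after doc, or -1 when there is none) is a guess, since the message does not spell it out.

```java
import java.util.BitSet;

// The two-method interface proposed in the message.
interface Filter {
    boolean include(int doc);
    int nextInclude(int doc);
}

// Hypothetical BitSet-backed implementation (not from the thread).
// Assumption: nextInclude(doc) returns the first included document
// number >= doc, or -1 when no such document exists.
final class BitSetFilter implements Filter {
    private final BitSet bits;

    BitSetFilter(BitSet bits) {
        // Defensive copy so later changes by the caller do not leak in.
        this.bits = (BitSet) bits.clone();
    }

    @Override
    public boolean include(int doc) {
        // Negative document numbers are treated as "not included".
        return doc >= 0 && bits.get(doc);
    }

    @Override
    public int nextInclude(int doc) {
        // BitSet.nextSetBit returns -1 when no set bit remains.
        return bits.nextSetBit(Math.max(doc, 0));
    }
}
```

Under that assumed contract, a caller could walk the included documents with
for (int d = f.nextInclude(0); d >= 0; d = f.nextInclude(d + 1)) { ... }
without ever seeing the underlying BitSet.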

On Jul 7, 2006, at 9:53 PM, Yonik Seeley wrote:

> This might be even better in conjunction with moving away from BitSet
> to some sort of interface like DocNrSkipper... that way you would
> never have to combine the filters into a single BitSet.
>
>
> -Yonik
> http://incubator.apache.org/solr Solr, the open-source Lucene  
> search server
>
> On 7/7/06, robert engels <rengels@ix.netcom.com> wrote:
>> I implemented it and it works great. I didn't worry about the
>> deletions since by the time a filter is used the deleted documents
>> are already removed by the query. The only problem that arose out of
>> this was for things like the ConstantScoreQuery (which uses a filter)
>> - I needed to modify this query to ignore deleted documents.
>>
>> Now I have incremental cached filters - the query performance is
>> going through the roof.
>>
>>
>>
>> On Jul 7, 2006, at 2:47 PM, Chris Hostetter wrote:
>>
>> >
>> > I'm no segments/MultiReader expert, but your idea sounds good to me ...
>> > it seems like it would certainly work in the "new segments" situation.
>> >
>> > One thing I don't see you mention is dealing with deletions ... I'm not
>> > sure if deleting documents causes the version number of an IndexReader
>> > to change or not (if it does, your job is easy), but even if it doesn't,
>> > I'm guessing you could say that if hasDeletions() returns true, you have
>> > to assume you need to invalidate your cached bits (worst-case scenario,
>> > you are invalidating the cache as often as it is now).
>> >
>> >
>> > : Date: Fri, 7 Jul 2006 00:32:54 -0500
>> > : From: robert engels <rengels@ix.netcom.com>
>> > : Reply-To: java-dev@lucene.apache.org
>> > : To: Lucene-Dev <java-dev@lucene.apache.org>
>> > : Subject: MultiSegmentQueryFilter enhancement for interactive indexes?
>> > :
>> > : I thought of a possible enhancement. Before I go down that road, I am
>> > : looking for some input from the community.
>> > :
>> > : Currently, the QueryFilter caches the bits based upon the IndexReader.
>> > :
>> > : The problem with this is that small incremental changes to the index
>> > : invalidate the cache.
>> > :
>> > : What if instead the filter determined that the underlying IndexReader
>> > : was a MultiReader and then maintained a bitset for each reader,
>> > : combining them in bits() when requested? The filter could check if
>> > : any of the underlying readers were different (removed or added)
>> > : and then just create a new bitset for that reader. With the new
>> > : non-bit-set filter implementations this could be even more memory
>> > : efficient, since the bitsets would not need to be combined into a
>> > : single bitset.
>> > :
>> > : With the previous work on "reopen" so that segments are reused, this
>> > : would allow filters to be far more useful in a highly interactive
>> > : environment.
>> > :
>> > : What do you think?
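
The per-reader caching idea proposed in the original message might look roughly like the sketch below. It is not Lucene code: PerSegmentBitCache, the type parameter S (standing in for a per-segment reader), and the computeBits function are all hypothetical, and the sketch assumes that a reused segment reader is the same object from one reopen to the next. Deletions are left out, as in the implementation described in the follow-up.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not Lucene code. S stands in for a per-segment reader.
final class PerSegmentBitCache<S> {
    // Assumed to run the filter's query against a single segment.
    private final Function<S, BitSet> computeBits;
    // Keyed by identity: a reused segment reader is assumed to be the same object.
    private Map<S, BitSet> cache = new IdentityHashMap<>();
    // Counts how often computeBits ran; kept only to make the reuse visible.
    int computations = 0;

    PerSegmentBitCache(Function<S, BitSet> computeBits) {
        this.computeBits = computeBits;
    }

    // Returns one BitSet per segment, in the order given. Bits are computed
    // only for segments not seen in the previous call; entries for segments
    // that are no longer present are dropped.
    List<BitSet> bitsFor(List<S> segments) {
        Map<S, BitSet> next = new IdentityHashMap<>();
        List<BitSet> result = new ArrayList<>(segments.size());
        for (S segment : segments) {
            BitSet bits = cache.get(segment);
            if (bits == null) {
                bits = computeBits.apply(segment);
                computations++;
            }
            next.put(segment, bits);
            result.add(bits);
        }
        cache = next;
        return result;
    }
}
```

The per-segment sets are returned as a list rather than merged, which matches the point made in the thread that the bitsets need not be combined into a single bitset.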
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>

