lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Filters with 2.9.4
Date Wed, 27 Apr 2011 11:33:48 GMT
Hi,

In Lucene trunk, the Filter gets a ReaderContext, which contains a doc base if
available.
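
To make the term concrete: a doc base is the offset at which a segment's
documents start in the top-level doc id space. The following is only a minimal
sketch in plain Java, not Lucene code; the class and method names
(DocBaseSketch, docBases, toTopLevel) are hypothetical, and it assumes segments
are numbered consecutively in reader order.

```java
import java.util.Arrays;

// Hypothetical illustration only; no Lucene classes are used here.
class DocBaseSketch {
    /** Returns the first top-level doc id of each segment, given each segment's maxDoc. */
    static int[] docBases(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int next = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = next;            // this segment starts where the previous one ended
            next += segmentMaxDocs[i];
        }
        return bases;
    }

    /** Maps a segment-local doc id to a top-level doc id. */
    static int toTopLevel(int docBase, int segmentDocId) {
        return docBase + segmentDocId;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {100, 40, 7});
        System.out.println(Arrays.toString(bases));   // prints [0, 100, 140]
        System.out.println(toTopLevel(bases[1], 5));  // prints 105
    }
}
```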

For Lucene 2 and 3 this is not available. The Lucene 2.9 code did not change
documented behavior. The fact that Filters always got the top-level reader
was never documented (it was just like that in early Lucene versions), so
this is not a break. The same applies not only to Filters but also to
Scorers created by Queries. Those also don't know anything about the
top-level searcher (and they don't need to). For a Filter to work this is
not a requirement either: the IndexReader passed as a parameter is
self-contained and provides all the information needed to process the current
segment. You should simply fix your caching (which is much more effective
after this change, as the cache items don't become invalid after a reopen of
an index where only a few segments changed).

I would suggest correcting your code and using CachingWrapperFilter.
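
The benefit of caching per segment can be sketched in plain Java. This is not
the CachingWrapperFilter source, only a rough illustration of the idea under
stated assumptions: segments are stood in for by plain Objects, the class name
PerSegmentCacheSketch and the computeCount counter are hypothetical, and a
"reopen" is modelled as a new list that shares the unchanged segment objects.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical illustration only; no Lucene classes are used here.
class PerSegmentCacheSketch {
    // Weak keys: an entry can be collected once its segment is no longer referenced.
    private final Map<Object, BitSet> cache = new WeakHashMap<>();
    private final Function<Object, BitSet> compute;
    int computeCount = 0;  // how often the expensive computation actually ran

    PerSegmentCacheSketch(Function<Object, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns the cached bits for a segment, computing them only on a miss. */
    synchronized BitSet get(Object segment) {
        BitSet bits = cache.get(segment);
        if (bits == null) {
            bits = compute.apply(segment);
            computeCount++;
            cache.put(segment, bits);
        }
        return bits;
    }

    public static void main(String[] args) {
        Object segA = new Object(), segB = new Object(), segC = new Object();
        PerSegmentCacheSketch cache = new PerSegmentCacheSketch(seg -> new BitSet());

        // First search: the index consists of segments A and B.
        for (Object seg : new Object[] {segA, segB}) cache.get(seg);
        // After a reopen: A is unchanged, B was replaced by C.
        for (Object seg : new Object[] {segA, segC}) cache.get(seg);

        System.out.println(cache.computeCount);  // prints 3: A is reused, not recomputed
    }
}
```

A cache keyed by the top-level reader would have recomputed everything after
the reopen, which is the inefficiency described above.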

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Antony Bowesman [mailto:adb@thorntothehorn.org]
> Sent: Wednesday, April 27, 2011 1:22 PM
> To: dev@lucene.apache.org
> Subject: Re: Filters with 2.9.4
> 
> Hi Uwe,
> 
> Thanks for the reply.
> 
> Things are a bit tangled, because I've used early Solr stuff with DocSet and
> have extensively used my own caching Filters because I couldn't get what I
> wanted with the standard versions a few years ago.  It will take a while to
> undo that, but I'm working towards that.
> 
> However, it still seems to me that the Filter.getDocIdSet() method should
> also be given the docBase for the given reader.  It seems odd that the
> Collector has that knowledge but the Filter does not even though they are
> pretty closely related classes.
> 
> What do you think?
> Antony
> 
> 
> 
> On 19/04/2011 5:01 PM, Uwe Schindler wrote:
> > Hi Antony,
> >
> > Why not use CachingWrapperFilter together with a TermsFilter or
> > QueryWrapperFilter(TermQuery)? This Filter keeps track of all used
> > segment readers. So you build an instance:
> >   Filter f = new CachingWrapperFilter(
> >       new QueryWrapperFilter(new TermQuery(new Term(...))));
> >
> > And reuse that filter instance with all queries the user starts. No
> > need to hack the cache yourself. The above variant is much more
> > effective as it works better with reopen()'ed index readers (after
> > the index changed), because it reuses the unchanged segment readers.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: Antony Bowesman [mailto:adb@thorntothehorn.org]
> >> Sent: Tuesday, April 19, 2011 7:30 AM
> >> To: Lucene Dev
> >> Subject: Filters with 2.9.4
> >>
> >> Hi,
> >>
> >> Another issue for me in migrating to 2.9.4...
> >>
> >> When a search is done by a user, I collect a 'DocSet' of Documents
> >> for that 'owner' (Term("id", "XX")).  This is a single set for all
> >> Documents in the index and NOT per reader.
> >>
> >> Then when searches are made I use caching Filters, but I use my
> >> master DocSet as a Filter for those chained Filters.  However, with
> >> 2.9, Filters are now called per segment reader and there's a
> >> DocIdSet for each Reader. There is no way for the filter
> >> implementation to know the docBase for the passed reader, like the
> >> collector does.
> >>
> >> As the Javadocs for Filter.getDocIdSet imply, a Filter must only
> >> return doc ids for the given reader.
> >>
> >> I am now stuck with a filter implementation that can no longer
> >> intersect the master bitset for my 'owners'.
> >>
> >> Was this envisaged during the changes, and is there a way I can get
> >> hold of the docBase for an IndexReader?
> >>
> >> Thanks
> >> Antony
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org

