lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Paging with Lucene
Date Fri, 21 Jan 2011 13:58:29 GMT
You can write a custom Collector that does this (just not delegating the
collect(int) call) and wrap TopDocsCollector with that.

Alternative: Plug in a Filter that filters your documents during the query.
As doing this on iterating hits is often costly, the ideal solution would be
to create a cachedLlucene Filter that stores the documents allowed to access
in an OpenBitSet and returns that on getDocIdSet().

When implementing such a filter (and also the collector above), remember
that Lucene works on index segments so please use the IndexReader and
docBase passed to Collector.setNextReader() and Filter.getDocIdSet().

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Clemens Wyss [mailto:clemensdev@mysign.ch]
> Sent: Friday, January 21, 2011 2:32 PM
> To: java-user@lucene.apache.org
> Subject: AW: Paging with Lucene
> 
> The problem is, that due to the "filtering" AFTER having searched the
index,
> we don't know how many TopDocs to read in order have "enough" for page
> x.
> 
> Does lucene's search allow injecting kind of a "voter"/"vetoer", which is
> called for any hit (ScoreDoc) lucene has encountered. This voter should be
> able to reject the given hit from TopDocs (by returning false). Can this
be
> done with a custom Filter?
> This of course comes with a performance penalty, but it would allow to
> search for n ScoreDocs, even in our case...
> 
> > -----Urspr√ľngliche Nachricht-----
> > Von: Ian Lea [mailto:ian.lea@gmail.com]
> > Gesendet: Freitag, 21. Januar 2011 11:10
> > An: java-user@lucene.apache.org
> > Betreff: Re: Paging with Lucene
> >
> > The standard recommendation for paging is to re-execute the search for
> > second and subsequent pages and return the second or subsequent chunk
> > of hits.  Would that not work in your case?
> >
> > An alternative is to read and cache hits from the initial search but
> > that is generally more complex.
> >
> >
> > --
> > Ian.
> >
> > On Thu, Jan 20, 2011 at 7:36 AM, Clemens Wyss <clemensdev@mysign.ch>
> > wrote:
> > > (thanks fort he many answers to my initial lucene question "Best
> > > practices for multiple languages?")
> > >
> > > We shall be confronted with the followong problem:
> > > due to the very dynamic access rules on our content, we shall not be
> > > able
> > to formulate these in/as Filter(s).
> > > Hence we need to first search and then apply the access rules (i.e.
> > > security
> > filter).
> > > What is the best approach to implement paging in this situation? Not
> > > to forget, the "overall context" is a web app ;-)
> > >
> > > Thx for your advices
> > > - Clemens
> > >
> > > --------------------------------------------------------------------
> > > - To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message