lucene-java-user mailing list archives

From "Hilton Campbell" <hilton.campb...@gmail.com>
Subject RE: How can I search over all documents NOT in a certain subset?
Date Thu, 07 Jun 2007 13:17:32 GMT
Yes, that's actually come up.  The document ids are indeed changing, which
is causing problems.  I'm still trying to work it out myself, but any help
would most definitely be appreciated.

Thanks,
Hilton Campbell

-----Original Message-----
From: Antony Bowesman [mailto:adb@teamware.com] 
Sent: Wednesday, June 06, 2007 11:36 PM
To: java-user@lucene.apache.org
Subject: Re: How can I search over all documents NOT in a certain subset?

Steven Rowe wrote:
> Conceptually (caveat: untested), you could:
> 
> 1. Extend Filter[1] (call it DejaVuFilter) to hold a BitSet per
> IndexReader.  The BitSet would hold one bit per doc[2], each initialized
> to true.
> 
> 2. Unset a DejaVuFilter instance's bit for each of your top N docs by
> walking the TopDocs returned by Searcher.search(Query,Filter,int)[3].
> Initially, you could pass in null for the Filter, and then for all
> following calls, an instance of DejaVuFilter.
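
A minimal sketch of the per-reader bit set described in steps 1 and 2,
assuming nothing beyond java.util.BitSet. The class DejaVuBits and its
method names are hypothetical and are not part of Lucene; the int[] stands
in for the document numbers you would read from the TopDocs.

```java
import java.util.BitSet;

// Hypothetical stand-in for the per-reader state of the proposed
// DejaVuFilter. It uses only java.util.BitSet, not Lucene's Filter class.
final class DejaVuBits {
    private final BitSet unseen;

    // maxDoc: number of document numbers in the reader.
    // Every bit starts out set, i.e. every document is still allowed.
    DejaVuBits(int maxDoc) {
        unseen = new BitSet(maxDoc);
        unseen.set(0, maxDoc);
    }

    // Clear the bit of each document already returned by an earlier search.
    void markSeen(int[] topDocIds) {
        for (int docId : topDocIds) {
            unseen.clear(docId);
        }
    }

    // True if the document may still be returned by a later search.
    boolean accepts(int docId) {
        return unseen.get(docId);
    }

    // Number of documents that have not been returned yet.
    int unseenCount() {
        return unseen.cardinality();
    }
}
```

After each search you would call markSeen with the document numbers of the
hits just returned, so the next search only accepts documents whose bit is
still set.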

Just a thought...

If Hilton wants to be aware of new Documents in the index since the
previous search, this requires opening a new IndexReader.

If only Documents have been added to the index I expect, but am not sure,
that the bits from the old IndexReader are still valid for the document
numbers in the new Reader.  However, if there have been deletions or
optimisation has occurred between reader instances, then the document ids
from the old reader may not represent the same documents in the new reader,
so the Filter for the old reader will not be valid for the new search
against the new reader and you may get false matches.

I don't think there will be a problem if there are no deletions.
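
A toy illustration of the concern above, not real Lucene code: it assumes
a "reader" can be modelled as a list of external keys, with the document
number being the position in that list. The class DocNumberShift and its
methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model, not Lucene: a document number is just a list position.
final class DocNumberShift {
    // Simulates a deleted document being merged away: every later
    // document moves down by one position.
    static List<String> compactWithout(List<String> reader, String deletedKey) {
        List<String> merged = new ArrayList<>(reader);
        merged.remove(deletedKey);
        return merged;
    }

    // Applies a bit set to a reader and returns the keys whose bit is set.
    // The bit set may have been built against a different reader.
    static List<String> accepted(List<String> reader, BitSet unseen) {
        List<String> out = new ArrayList<>();
        for (int doc = 0; doc < reader.size(); doc++) {
            if (unseen.get(doc)) {
                out.add(reader.get(doc));
            }
        }
        return out;
    }
}
```

With an old reader [a, b, c, d] and the bit for "c" (number 2) cleared,
the old reader accepts a, b and d.  If "a" is then deleted and merged away,
the new reader is [b, c, d]; the same bits now accept b and c, so the
already-seen "c" comes back and the unseen "d" is hidden.  In this model,
documents appended to the end keep the old numbers intact, but the bit set
still has to be extended with set bits for the new numbers.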

Antony

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

