lucene-java-user mailing list archives

From Antony Bowesman <...@teamware.com>
Subject Re: Using Lucene partly as DB and 'joining' search results.
Date Fri, 11 Apr 2008 22:03:13 GMT
Paul Elschot wrote:
> Op Friday 11 April 2008 13:49:59 schreef Mathieu Lecarme:

>> Use Filter and BitSet.
>> From the personal data, you build a Filter
>> (http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/search/Filter.html)
>> which is used in the main index.
> 
> With 1 billion mails, and possibly a Filter per user, you may want to
> use more compact filters than BitSets, which is currently possible
> in the development trunk of lucene.

Thanks for the pointers.  I've already used Solr's DocSet interface in my 
implementation, which I think is where the ideas for the current Lucene 
enhancements came from.  DocSets work well to reduce the filter's footprint.  
I'm also caching filters.
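
To put rough numbers on the footprint point, here is a minimal sketch in plain 
Java.  It is not Solr's DocSet or any Lucene class; SortedIntDocSet and its 
methods are hypothetical names used only for illustration.  It stores the 
matching doc Ids as a sorted int array, which costs about 4 bytes per match, 
where a plain bit set costs one bit per document in the index regardless of how 
many match.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical compact doc-id set: a sorted int[] instead of one bit per document. */
final class SortedIntDocSet {
    private final int[] docs; // sorted ascending, no duplicates

    /** Copies the set bits of an existing BitSet into a sorted array. */
    SortedIntDocSet(BitSet bits) {
        docs = new int[bits.cardinality()];
        int i = 0;
        for (int d = bits.nextSetBit(0); d >= 0; d = bits.nextSetBit(d + 1)) {
            docs[i++] = d;
        }
    }

    /** Membership test by binary search, O(log n) in the number of matches. */
    boolean contains(int doc) {
        return Arrays.binarySearch(docs, doc) >= 0;
    }

    int size() {
        return docs.length;
    }

    /** Approximate payload size in bytes: 4 bytes per matching doc. */
    long bytes() {
        return 4L * docs.length;
    }

    /** Approximate payload size in bytes of a plain bit set over maxDoc documents. */
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }
}
```

By this arithmetic a bit set over 1 billion documents is about 125 MB per 
filter, while a user with 10,000 labelled mails needs about 40 KB as a sorted 
array.  The array only wins while fewer than roughly 1 in 32 documents match.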

The intention is to have a user data index alongside the mail index(es).  A 
search against the user data index will return a set of mail Ids, which is the 
common key between the two.  Doc Ids are not comparable between the indexes, so 
creating the filter of labelled mails in the mail indexes means a potentially 
large boolean OR query over those mail Ids.  I know it's a theoretical question, 
but will this perform?
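
For concreteness, the join step can be sketched in plain Java as below.  This 
is a simulation under stated assumptions, not Lucene code: MailIdJoin and 
buildFilter are hypothetical names, and the Map stands in for a term lookup on 
a unique mail Id field in one mail index.  The point of the sketch is that the 
filter can be filled one mail Id at a time, without first assembling one large 
OR query.  That may matter because BooleanQuery has a default limit of 1024 
clauses and throws TooManyClauses beyond it unless the limit is raised.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.Map;

/** Hypothetical helper: turns mail Ids from the user data index into a doc-id filter. */
final class MailIdJoin {

    /**
     * @param labelledMailIds mail Ids returned by the search on the user data index
     * @param mailIdToDoc     assumption: stands in for a per-term lookup of the
     *                        mail Id field in one mail index (mail Id -> doc Id)
     * @param maxDoc          number of documents in that mail index
     * @return a bit set with one bit set per labelled mail found in the index
     */
    static BitSet buildFilter(Collection<String> labelledMailIds,
                              Map<String, Integer> mailIdToDoc,
                              int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String id : labelledMailIds) {
            Integer doc = mailIdToDoc.get(id);
            // Ids missing from this index are skipped; with several mail
            // indexes, a mail Id will normally live in only one of them.
            if (doc != null) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

The cost is one lookup per labelled mail Id per mail index, so it grows with the 
size of the user's label set rather than with the size of the mail index.  
Caching the resulting filter per user and label avoids repeating the lookups 
until the user data changes.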

The read-only data and the modifiable user data need to be kept separate because 
the RO data can easily be re-created, which means I can't just create the filter 
as part of the base search.

Regards
Antony




