lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Using Lucene partly as DB and 'joining' search results.
Date Mon, 14 Apr 2008 23:34:10 GMT

: The archive is read only apart from bulk deletes, but one of the requirements
: is for users to be able to label their own mail.  Given that a Lucene Document
: cannot be updated, I have thought about having a separate Lucene index that
: has just the 3 terms (or some combination of) userId + mailId + label.
: 
: That of course would mean joining searches from the main mail data index and
: the label index.

tangential to the existing followups about ways to use Filters efficiently 
to get some of the behavior, take a look at ParallelReader ... your use 
case sounds like it might be perfect for it: one really large main dataset 
that changes fairly infrequently, and what changes do occur are mainly 
about adding new records; plus a small "parallel" set of fields about 
each record in the main set which do change fairly frequently.

you build up an index for the main data, and then you periodically build up 
a second index with the docs in the exact same order as the main index.

additions to the main index don't need to block on rebuilding the secondary 
index.  deletes do (since you need to delete from both indexes in parallel 
to keep the ids in sync) ... but that's ok since you said you only need 
occasional bulk deletes (you could process them as an initial step of your 
recurring rebuild of the smaller index).
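
to make the "same order" requirement concrete, here is a minimal toy model in plain Java. it does not use any Lucene classes: the class ParallelIndexSketch and its methods are hypothetical names invented for illustration, list position stands in for the internal document number, and labels are keyed by mailId only (the per-user dimension is left out for brevity). treat it as a sketch of the idea, not of how ParallelReader is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of two "parallel" indexes. NOT Lucene's API: all names are hypothetical.
class ParallelIndexSketch {
    // List position plays the role of the internal document number.
    private final List<Map<String, String>> mainIndex = new ArrayList<>();
    private List<Map<String, String>> labelIndex = new ArrayList<>();

    // Adding mail touches only the main index; the label index catches up
    // at the next rebuild. (Toy simplification: until then the new document
    // simply has no label fields.)
    int addMail(String mailId, String subject) {
        Map<String, String> doc = new HashMap<>();
        doc.put("mailId", mailId);
        doc.put("subject", subject);
        mainIndex.add(doc);
        return mainIndex.size() - 1;
    }

    // Deletes must hit both indexes at the same position, otherwise every
    // later document would be paired with the wrong label document.
    void deleteMail(String mailId) {
        for (int i = mainIndex.size() - 1; i >= 0; i--) {
            if (mailId.equals(mainIndex.get(i).get("mailId"))) {
                mainIndex.remove(i);
                if (i < labelIndex.size()) {
                    labelIndex.remove(i);
                }
            }
        }
    }

    // Periodic rebuild: walk the main index in order and emit exactly one
    // label document per main document, so positions stay aligned.
    void rebuildLabels(Map<String, String> labelsByMailId) {
        List<Map<String, String>> rebuilt = new ArrayList<>();
        for (Map<String, String> doc : mainIndex) {
            Map<String, String> labelDoc = new HashMap<>();
            String label = labelsByMailId.get(doc.get("mailId"));
            if (label != null) {
                labelDoc.put("label", label);
            }
            rebuilt.add(labelDoc); // an empty doc still occupies its slot
        }
        labelIndex = rebuilt;
    }

    // Combined view of document n: fields from both indexes merged together.
    Map<String, String> document(int n) {
        Map<String, String> merged = new HashMap<>(mainIndex.get(n));
        if (n < labelIndex.size()) {
            merged.putAll(labelIndex.get(n));
        }
        return merged;
    }

    public static void main(String[] args) {
        ParallelIndexSketch idx = new ParallelIndexSketch();
        idx.addMail("m1", "hello");
        idx.addMail("m2", "world");
        Map<String, String> labels = new HashMap<>();
        labels.put("m2", "todo");
        idx.rebuildLabels(labels);
        System.out.println(idx.document(1));
    }
}
```

the point of the sketch is that nothing joins the two sides by mailId at read time: the pairing is purely positional, which is why the rebuild has to follow the main index order and why a delete has to be applied to both sides together.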



-Hoss

