incubator-lucy-dev mailing list archives

From Peter Karman <pe...@peknet.com>
Subject Re: [lucy-dev] On Transactionality and Performance
Date Thu, 24 Mar 2011 18:17:55 GMT
David E. Wheeler wrote on 03/24/2011 01:01 PM:
> On Mar 23, 2011, at 9:47 PM, Peter Karman wrote:

>> Note the get_ks() method, which tracks a UUID per index and re-opens a new
>> searcher whenever the UUID changes.
> 
> Hrm. That might be useful. How do I access that from a
> (KinoSearch|Lucy)::Search::IndexSearcher object? Is the UUID updated every
> time the index is changed?
> 

The Swish3 project uses a swish.xml file, which it stores in the invindex
directory. That file holds all the Swish3-specific metadata, including the
UUID. The indexer writes swish.xml, as shown here:

http://cpansearch.perl.org/src/KARMAN/SWISH-Prog-KSx-0.18/lib/SWISH/Prog/KSx/Indexer.pm

See the finish() method.
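
To make the idea concrete, here is a rough sketch in Python (not the actual
SWISH::Prog::KSx code) of an indexer stamping a fresh UUID into a metadata
file when it finishes. The element names, the function names, and the
assumption that a new UUID is generated on every finish are all hypothetical;
the real swish.xml layout is not reproduced here.

```python
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path


def write_index_metadata(index_dir, header_name="swish.xml"):
    """Write a new UUID into a metadata file inside index_dir.

    Hypothetical helper: the <index_header>/<uuid> layout is an
    illustration only, not the real Swish3 header format.
    """
    new_uuid = str(uuid.uuid4())
    root = ET.Element("index_header")
    ET.SubElement(root, "uuid").text = new_uuid

    path = Path(index_dir) / header_name
    tmp = path.with_name(path.name + ".tmp")
    # Write to a temp file first, then rename, so a reader never
    # sees a half-written header.
    ET.ElementTree(root).write(str(tmp), encoding="utf-8", xml_declaration=True)
    tmp.replace(path)
    return new_uuid


def read_index_uuid(index_dir, header_name="swish.xml"):
    """Return the UUID stored in the metadata file, or None if absent."""
    root = ET.parse(str(Path(index_dir) / header_name)).getroot()
    return root.findtext("uuid")
```

A searcher process would call something like read_index_uuid() and compare
the result with the value it saw when it last opened the index.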

> 
>> In my pipeline, I have separate processes that serialize my incoming data
>> (analogous to unpacking .tar files and converting/normalizing their contents
>> into something index-able) and the indexers that actually parse/tokenize/insert
>> those documents. It's up to the searcher(s) (in my case) to detect whether they
>> should refresh themselves.
> 
> That's not a bad idea. I'll have to keep that in mind for the future.
> 
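
The searcher-side check described above could look roughly like the following
Python sketch. It is only an illustration of the pattern, not the get_ks()
implementation: the class name and the two callables (one that reads the
current UUID for an index, one that opens a searcher) are hypothetical
stand-ins for whatever the real application uses.

```python
class RefreshingSearcher:
    """Cache a searcher and reopen it when the index UUID changes.

    read_uuid(index_dir) and open_searcher(index_dir) are supplied by
    the caller; both are assumptions made for this sketch.
    """

    def __init__(self, index_dir, read_uuid, open_searcher):
        self._index_dir = index_dir
        self._read_uuid = read_uuid
        self._open_searcher = open_searcher
        self._uuid = None
        self._searcher = None

    def get(self):
        """Return a searcher that matches the index's current UUID."""
        current = self._read_uuid(self._index_dir)
        if self._searcher is None or current != self._uuid:
            # The index was rewritten (or this is the first call):
            # open a fresh searcher and remember which UUID it saw.
            self._searcher = self._open_searcher(self._index_dir)
            self._uuid = current
        return self._searcher
```

Each request calls get() instead of holding on to a searcher directly, so the
cost of a refresh is one small metadata read per request plus a reopen only
when the indexer has actually finished a new run.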

fwiw, you can see a working example of this system in action at:

 http://publicinsight.googlecode.com/


-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com
