incubator-lucy-dev mailing list archives

From "David E. Wheeler" <da...@kineticode.com>
Subject Re: [lucy-dev] On Transactionality and Performance
Date Thu, 24 Mar 2011 18:01:28 GMT
On Mar 23, 2011, at 9:47 PM, Peter Karman wrote:

> The index is definitely available for searching while the indexer is doing its
> work. The searcher will become stale though, as soon as the $indexer->commit()
> is called, and the existing searcher will not have access to the recently-added
> segment(s).

Got it.

> Here, for example, is how I manage searchers:
> http://cpansearch.perl.org/src/KARMAN/SWISH-Prog-KSx-0.18/lib/SWISH/Prog/KSx/Searcher.pm
> 
> Note the get_ks() method, which tracks a UUID per index and re-opens a new
> searcher whenever the UUID changes.

Hrm. That might be useful. How do I access that from a (KinoSearch|Lucy)::Search::IndexSearcher
object? Is the UUID updated every time the index is changed?

> Marvin's comments about the efficiency of indexers and the advantage of
> "batching up" your indexed documents is merely that: an advantage and an efficiency.

Sure.

> In my pipeline, I have separate processes that serialize my incoming data
> (analogous to unpacking .tar files and converting/normalizing their contents
> into something index-able) and the indexers that actually parse/tokenize/insert
> those documents. It's up to the searcher(s) (in my case) to detect whether they
> should refresh themselves.

That's not a bad idea. I'll have to keep that in mind for the future.
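
For the archive, here is a rough sketch of the "searcher refreshes itself" pattern as I understand it from the description above. It is written in Python only to keep it self-contained, and it is not the KinoSearch/Lucy API or the SWISH::Prog::KSx code. The class name SearcherCache, the open_searcher and read_token callables, and the "index.uuid" file name are all made up for illustration. The only assumption is that something rewrites a version token (a UUID, say) after each commit.

```python
import os


def read_token_file(index_path):
    """Return the current version token for an index.

    Assumption: whatever writes the index also rewrites a small
    "index.uuid" file after each commit. The file name is hypothetical.
    """
    with open(os.path.join(index_path, "index.uuid")) as fh:
        return fh.read().strip()


class SearcherCache:
    """Hand out a cached searcher, reopening it when the token changes."""

    def __init__(self, index_path, open_searcher, read_token=read_token_file):
        self.index_path = index_path
        self._open_searcher = open_searcher  # hypothetical factory: path -> searcher
        self._read_token = read_token        # hypothetical: path -> token string
        self._token = None
        self._searcher = None

    def get(self):
        # Read the token first. If a commit lands between this read and the
        # open below, the worst case is one extra reopen on the next call.
        token = self._read_token(self.index_path)
        if self._searcher is None or token != self._token:
            self._searcher = self._open_searcher(self.index_path)
            self._token = token
        return self._searcher
```

Each search request would call get() rather than holding on to a searcher, so a stale searcher is swapped out the first time it is asked for after a commit.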

Thanks,

David


