lucene-dev mailing list archives

From Michael McCandless
Subject Re: IndexReader plugins
Date Mon, 13 Apr 2009 12:08:25 GMT
On Mon, Apr 13, 2009 at 7:31 AM, Earwin Burrfoot wrote:
>> I think this (truly componentizing SegmentReader) makes tons of sense.
>>  After all, a SegmentReader is just a bunch of separate components
>> handling different parts of the index.
>> This is really orthogonal to LUCENE-831 (the field cache is just one
>> component).  They can land in either order...
>> Earwin do you want to take an initial stab (patch) at this?
> Okay. The only problem is, I'll take my time, can't cut on sleeping
> any more than now :)

I hear you ;)  And, that's no problem... each feature moves at its own
natural rate.  That's how open source works.

>> I think it'll be interesting how the components API handles near
>> real-time search, because we want/expect components to be able to
>> merge themselves efficiently "in RAM" when possible.  EG if field
>> cache already has certain fields loaded, they can be merged in RAM; if
>> not, they should be merged on disk.  If field cache has pending
>> changes (in a future world when CSF makes it possible to suddenly
>> change say the price of certain documents), then the components must
>> properly implement clone (ideally incremental copy-on-write cloning).
> Can we outline some requirements for the plugin API?
> Do we want to attach/detach them to IndexReader after it is created,
> or only during construction?

I think I'd lean towards only at construction.  It seems dangerous to
allow components to be swapped in or out at some later time.
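
To make "only at construction" concrete, here is a minimal sketch.
Every name in it (ComponentizedReader, Builder, bind, open, component)
is hypothetical and not an existing Lucene API; it only illustrates
collecting components up front and freezing the set once the reader
is opened.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Lucene.
final class ComponentizedReader {
    private final Map<Class<?>, Object> components;

    private ComponentizedReader(Map<Class<?>, Object> components) {
        // Defensive copy, then freeze: nothing can be swapped in or out later,
        // even if the builder is reused after open().
        this.components =
            Collections.unmodifiableMap(new HashMap<Class<?>, Object>(components));
    }

    // Look up the component bound under the given interface, or null if none.
    <T> T component(Class<T> iface) {
        return iface.cast(components.get(iface));
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private final Map<Class<?>, Object> pending = new HashMap<Class<?>, Object>();

        // Binding is only possible here, before the reader exists.
        <T> Builder bind(Class<T> iface, T instance) {
            pending.put(iface, instance);
            return this;
        }

        ComponentizedReader open() {
            return new ComponentizedReader(pending);
        }
    }
}
```

The reader itself exposes no bind or unbind method, so the set of
components cannot change under a searcher that is already using it.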

> We probably want them to support (know the difference between)
> SegmentReader/MultiSegmentReader.

Yeah... and I think it's fine if some components only "operate" at the
SegmentReader level.

> What about ParallelReader (does anybody use it at all?),
> FilterIndexReader, MultiReader?

Presumably these would use "aggregator components" that simply take N
components under the hood and merge their APIs.  (Like MultiTermEnum,
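
A rough sketch of what such an aggregator could look like, assuming a
made-up per-segment component interface (LongValues) that exposes one
long per document.  None of these names are real Lucene classes; the
point is only that the aggregator implements the same interface and
maps a global doc ID onto the right sub-component.

```java
// Hypothetical per-segment component: one long value per document.
interface LongValues {
    int maxDoc();
    long get(int docId);
}

// Trivial array-backed implementation, used here only for illustration.
final class ArrayLongValues implements LongValues {
    private final long[] values;

    ArrayLongValues(long... values) {
        this.values = values.clone();
    }

    public int maxDoc() {
        return values.length;
    }

    public long get(int docId) {
        return values[docId];
    }
}

// Aggregator: takes N components under the hood and presents them as one.
final class MultiLongValues implements LongValues {
    private final LongValues[] subs;
    // starts[i] is the first global doc ID of subs[i]; starts[n] is the total.
    private final int[] starts;

    MultiLongValues(LongValues... subs) {
        this.subs = subs.clone();
        this.starts = new int[subs.length + 1];
        for (int i = 0; i < subs.length; i++) {
            starts[i + 1] = starts[i] + subs[i].maxDoc();
        }
    }

    public int maxDoc() {
        return starts[subs.length];
    }

    public long get(int docId) {
        if (docId < 0 || docId >= maxDoc()) {
            throw new IndexOutOfBoundsException("docId " + docId);
        }
        // Linear scan for the owning sub-component; fine for a handful of segments.
        int i = 0;
        while (docId >= starts[i + 1]) {
            i++;
        }
        return subs[i].get(docId - starts[i]);
    }
}
```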

> For a hierarchy of readers, API should probably support the notion of
> different plugin instances per-subreader.

Yes, e.g. it needs to be easy for the N segments discovered on opening a
MultiSegmentReader to each ask for their own instance of
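
One possible shape for this, purely as a sketch: the caller hands the
top-level reader a factory, and the factory is invoked once per
segment.  SegmentComponentFactory and PerSegmentComponents are
hypothetical names, and identifying a segment by a String name is an
assumption made only to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: creates one component instance for one segment.
interface SegmentComponentFactory<T> {
    T create(String segmentName);
}

final class PerSegmentComponents {
    private PerSegmentComponents() {}

    // Ask the factory for a separate instance for each discovered segment.
    static <T> List<T> instantiate(List<String> segmentNames,
                                   SegmentComponentFactory<T> factory) {
        List<T> out = new ArrayList<T>(segmentNames.size());
        for (String name : segmentNames) {
            out.add(factory.create(name));
        }
        return out;
    }
}
```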

> Do we want plugins supporting more than one interface, or is it an
> unnecessary complication?
> Like:
> indexReader.bindPlugin(instance).to(Iface1.class, Iface2.class);
> And then:
> indexReader.plugin(Iface1.class) == indexReader.plugin(Iface2.class)

I would start without that, unless we have a clear "now" example that
requires it?

