lucene-dev mailing list archives

From Earwin Burrfoot <ear...@gmail.com>
Subject Re: IndexReader plugins
Date Mon, 13 Apr 2009 11:31:38 GMT
> I think this (truly componentizing SegmentReader) makes tons of sense.
>  After all, a SegmentReader is just a bunch of separate components
> handling different parts of the index.
>
> This is really orthogonal to LUCENE-831 (the field cache is just one
> component).  They can land in either order...
>
> Earwin do you want to take an initial stab (patch) at this?
Okay. The only problem is that I'll need to take my time: I can't cut
back on sleep any more than I already have :)

> I think it'll be interesting how the components API handles near
> real-time search, because we want/expect components to be able to
> merge themselves efficiently "in RAM" when possible.  EG if field
> cache already has certain fields loaded, they can be merged in RAM; if
> not, they should be merged on disk.  If field cache has pending
> changes (in a future world when CSF makes it possible to suddenly
> change say the price of certain documents), then the components must
> properly implement clone (ideally incremental copy-on-write cloning).
Can we outline some requirements for the plugin API?

Do we want to attach/detach them to IndexReader after it is created,
or only during construction?
We probably want them to know the difference between SegmentReader
and MultiSegmentReader.
What about ParallelReader (does anybody use it at all?),
FilterIndexReader, MultiReader?
For a hierarchy of readers, the API should probably support the notion
of different plugin instances per subreader.
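A minimal sketch of what per-subreader instances could look like, assuming attach returns the instance to register for a given segment (as floated in the quoted proposal below). Every name here (SketchSegmentReader, SketchPlugin, PerSegmentCache, SharedStats, PluginMaps) is hypothetical and stands in for the real classes; this is not existing Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a segment-level reader; not the real Lucene class.
class SketchSegmentReader {
    final String name;
    SketchSegmentReader(String name) { this.name = name; }
}

// Hypothetical plugin contract: attach returns the instance to register
// for the given segment (either `this` or a fresh per-segment object).
interface SketchPlugin {
    SketchPlugin attach(SketchSegmentReader reader);
}

// A plugin that wants one instance per segment.
class PerSegmentCache implements SketchPlugin {
    final SketchSegmentReader reader; // null for the top-level template instance
    PerSegmentCache(SketchSegmentReader reader) { this.reader = reader; }

    @Override
    public SketchPlugin attach(SketchSegmentReader r) {
        return new PerSegmentCache(r);
    }

    // No reader argument needed: the instance already knows its segment.
    String describe() { return "cache for " + reader.name; }
}

// A plugin that is happy to be shared by every segment.
class SharedStats implements SketchPlugin {
    @Override
    public SketchPlugin attach(SketchSegmentReader r) { return this; }
}

class PluginMaps {
    // Build a subreader's map by calling attach on each top-level entry.
    static Map<Class<?>, SketchPlugin> forSegment(
            Map<Class<?>, SketchPlugin> topLevel, SketchSegmentReader reader) {
        Map<Class<?>, SketchPlugin> result = new HashMap<>();
        for (Map.Entry<Class<?>, SketchPlugin> e : topLevel.entrySet()) {
            result.put(e.getKey(), e.getValue().attach(reader));
        }
        return result;
    }
}
```

With this shape the plugin itself decides whether it is shared or per-segment, and the reader hierarchy only has to call attach while building each subreader's map.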
Do we want plugins to support more than one interface, or is that an
unnecessary complication?
Like:
indexReader.bindPlugin(instance).to(Iface1.class, Iface2.class);
And then:
indexReader.plugin(Iface1.class) == indexReader.plugin(Iface2.class)
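To make the multi-interface question concrete, here is a rough sketch of how bindPlugin(...).to(...) might be backed by a plain map. PluginRegistry, Binder, and BothImpl are hypothetical names invented for illustration, and the registry stands in for whatever part of IndexReader would hold the plugins; nothing like this exists in Lucene today.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry standing in for the plugin-holding part of IndexReader.
class PluginRegistry {
    private final Map<Class<?>, Object> plugins = new HashMap<>();

    // Fluent binder returned by bindPlugin; `to` registers the same
    // instance under every interface it is bound to.
    final class Binder {
        private final Object instance;
        private Binder(Object instance) { this.instance = instance; }

        void to(Class<?>... types) {
            for (Class<?> type : types) {
                // Reject bindings the instance cannot actually satisfy.
                if (!type.isInstance(instance)) {
                    throw new IllegalArgumentException(
                        instance.getClass().getName()
                            + " does not implement " + type.getName());
                }
                plugins.put(type, instance);
            }
        }
    }

    Binder bindPlugin(Object instance) { return new Binder(instance); }

    // Typed lookup; returns null when nothing is bound to the requested type.
    <T> T plugin(Class<T> type) {
        return type.cast(plugins.get(type));
    }
}

// Hypothetical interfaces and one implementation covering both.
interface Iface1 { int one(); }
interface Iface2 { int two(); }

class BothImpl implements Iface1, Iface2 {
    public int one() { return 1; }
    public int two() { return 2; }
}
```

After registry.bindPlugin(new BothImpl()).to(Iface1.class, Iface2.class), both lookups return the same object, so the == comparison above holds. The cost is that the map no longer has one entry per plugin, which matters if plugins are later iterated for attach/detach.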

> Mike
>
> On Sun, Apr 12, 2009 at 7:34 PM, Earwin Burrfoot <earwin@gmail.com> wrote:
>> To support my dream of kicking fieldCache out of the core and to add
>> some extensibility to Lucene, I want to introduce IndexReaderPlugins.
>> Rough pseudocode follows:
>>
>> interface IndexReaderPlugin {
>>        void attach(SegmentReader reader);
>>        void detach(SegmentReader reader);
>>
>>        void attach(MultiSegmentReader reader);
>>        void detach(MultiSegmentReader reader);
>> }
>>
>> IndexReader.java:
>> private Map<Class<?>, IndexReaderPlugin> plugins;
>>
>> on opening/closing toplevel/segment reader we iterate over plugins:
>> for (IndexReaderPlugin plugin : plugins.values())
>>    plugin.attach(reader);
>>
>> the map is passed to toplevel reader initially, and then shared with
>> lowlevel readers, we can also retrieve plugins:
>> public <T> T plugin(Class<T> pluginType);
>>
>> then we can do something like:
>> indexReader.plugin(ValueSource.class).doSomething // lucene code
>> indexReader.plugin(FieldsCache.class).forField(LAST_UPDATE_TIME).doSomething // my code
>> filter.apply(indexReader.plugin(FilterCache.class)) // my code
>>
>> Benefits are numerous. We get rid of alien code like:
>> +++ src/java/org/apache/lucene/index/SegmentReader.java (working copy)
>> @@ -83,6 +86,8 @@
>> +  protected ValueSource valueSource;
>> +
>> @@ -555,6 +560,8 @@
>> +
>> +      valueSource = new CachingValueSource(this, new UninversionValueSource(this));
>>
>> If I don't need ValueSource attached to my readers, I won't have it.
>> If I need my custom caches attached to my readers, I can do it in a
>> natural way instead of hacking around MergeScheduler, or comparing
>> subreader lists.
>> If I want, I can replace Lucene's native ValueSource with my own
>> implementation, and all Lucene classes that use it will happily accept
>> it.
>>
>> On second thought, we shouldn't share the plugin map across subreaders.
>> If we allow attach(SegmentReader reader) to return a plugin instance
>> (the plugin decides whether it is always the same instance or one per
>> reader), and populate each subreader's map with the results of invoking
>> attach on the toplevel reader's map, we'll turn this code:
>> segmentReader.plugin(SomeClass.class).segmentReaderDependentMethod(segmentReader);
>> into:
>> segmentReader.plugin(SomeClass.class).segmentReaderDependentMethod();
>> which makes more sense.
>>
>> Anyway, the general idea is still the same.
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
