lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <>
Subject Re: Lucene 2.9 and deprecated methods
Date Mon, 05 Oct 2009 12:51:54 GMT
Michael McCandless wrote:
> I think AS is overkill for conveying configuration of IW/IR?
> Suddenly, instead of:
>   cfg.setRAMBufferSizeMB(128.0)
> I'd have to do something like this?
>   cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)
> It's too cumbersome, I think, for something that ought to be simple.
> I'd prefer a dedicated config class with strongly typed setters
> exposed.  Of all the "pure syntax" options so far I'd still prefer the
> traditional "config object with setters".

> Also, I don't think we should roll this out for all Lucene classes.  I
> think most classes do just fine accepting args to their ctor.  EG
> TermQuery simply takes Term to its ctor.
> I do agree IW should not be in the business of brokering changes to
> the settings of its sub-components (eg mergeFactor, maxMergeDocs).
> You really should make such changes directly via your merge policy.
Agreed we need to deal with - *but* I personally think it gets tricky.
Users should be able to flip compound on/off easily without dealing with
a mergepolicy IMO. And advanced users that set a mergepolicy shouldn't
have to deal with losing a compound setting they set with IW after
setting a new mergepolicy. Can't I have it both ways :)
> Finally, I'm not convinced we should lock down all settings after
> classes are created.  (I'm not convinced we shouldn't, either).
> A merge policy has no trouble changing its mergeFactor,
> maxMergeDocs/Size.  IW has no trouble changing the its RAM buffer
> size, maxFieldLength, or useCompoundFile.  Sure there are some things
> that cannot (or would be very tricky to) change, eg deletion policy.
> But then analyzer isn't changeable today, but could be.
> But, then, I can also see it'd simplify our code to not have to deal
> w/ such changes, reduce chance of subtle bugs, and it seems minor to
> go and re-open your IndexWriter if you need to make a settings change?
> (Hmm except in an NRT setting, because the reader pool would be reset;
> really we need to get the reader pool separated from the IW instance).
> Mike
> On Mon, Oct 5, 2009 at 4:38 AM, Uwe Schindler <> wrote:
>>>> See my second mail. The recently introduced Attributes and
>>> AttributeSource
>>>> would solve this. Each component just defines its attribute interface
>>> and
>>>> impl class and you pass in an AttributeSource as configuration. Then you
>>> can
>>>> do:
>>>> AttributeSource cfg = new AttributeSource();
>>>> ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
>>>> compCfg.setMergeScheduler(FooScheduler.class);
>>>> MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
>>>> mergeCfg.setWhateverProp(1234);
>>>> ...
>>>> IndexWriter iw = new IndexWriter(dir, cfg);
>>>> (this is just brainstorming not yet thoroughly thought about).
>>> This approach suggests IW creates its components, and while doing so
>>> provides them your AS instance.
>>> I personally prefer creating all these components myself, configuring
>>> them (at the moment of creation) and passing them to IW in one way or
>>> another.
>>> This requires way less code, you don't have to invent elaborate
>>> schemes of passing through your custom per-component settings and
>>> selecting which exact component types IW should use, you don't risk
>>> construct/postConstruct/postpostConstruct-style things.
>> Not really. That was just brainstorming. But you can pass also instances
>> instead of class names through attributesource. AttributeSurce only provides
>> type safety for the various configuration settings (which are interfaces).
>> But you could also create an attribute that gets the pointer to the
>> component. So "compCfg.setMergeScheduler(FooScheduler.class);" could also be
>> compConfig.addComponent(new FooScheduler(...));
>> The AttributeSource approach has one other good thing:
>> If you want to use the default settings for one attribute, you do not have
>> to add it to the AS (or you can forget it). With the properties approach,
>> you have to hardcode the parameter defaults and validation everywhere. As
>> the consumer of an AttributeSource gets the attribute also by an
>> addAttribute-call (see current indexing code consuming TokenStreams), this
>> call would add the missing attribute with its default settings defined by
>> the implementation class. So in the above example, if you do not want to
>> provide the "whateverProp", leave the whole MergeBarAttribute out. The
>> consumer (IW) would just call addAttribute(MergeBarAttribute.class), because
>> it needs the attribute to configure itself. AS would add this attribute with
>> default settings.
>> Uwe
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

- Mark

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message