lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From robert engels <>
Subject Re: Fieldable, AbstractField, Field
Date Wed, 19 Mar 2008 18:17:52 GMT
Probably going to disagree here... but that's ok !

I think IndexReader and IndexWriter would have been perfect  
interfaces - as long as the concepts were kept very abstract.

putDocument(), getDocument(), findDocument(), etc.

and supported the semantics.

That is what I find is key to hiding the implementation. If you start  
with very abstract concepts, the impl does not creep into the API.

When "things change" even with abstract classes, you still have a  
problem of sorts. Take locking for instance. If you never had locking  
in a system, and add it later, even though you could add methods like  
'lockIndex()' and 'unlockIndex()' other clients that worked well  
before if they were not updated to call these methods properly would  
either fail, or lock the index from everyone else while they were  
open. Probably not good in either case. So adding concepts usually  
requires code changes anyway.

In the example you cite, there are two possibilities: 1) a MapContext  
can ALWAYS be created from a map,value,reporter. If this is the case,  
your API change is for programmer convenience only and can easily be  
performed in other ways. 2) this is not the case, then internally  
your code has two paths anyway, with branching logic based on the  
type of key

I think your statement 'static problem domains' is a bit incorrect.  
It depends on what you define as the problem. Going back to the  
TableModel interface, its problem domain could be stated as  
'interface to tabular data for display and edit', and makes no  
assumption on how the data is stored, what the data is, etc.

Taken to Lucene, IndexWriter could be defined as 'place document in  
index suitable for later retrieval'.

Yes you need to know a bit more about the available operations, but  
if kept abstract enough, the interfaces follow quite easily, AND they  
don't paint you into a corner.

On Mar 19, 2008, at 1:01 PM, Doug Cutting wrote:

> robert engels wrote:
>> The problem with abstract classes, is that any methods you provide  
>> "know" something of the implementation, unless the methods are  
>> implemented solely by calling other abstract methods (which is  
>> rarely the case if the abstract class contains ANY private members).
> Yes, abstract classes should generally avoid private fields that  
> don't have both setters and getters.
>> This is possible because the interfaces were designed very well.  
>> You MUST completely understand the problem domain in abstract  
>> terms in order to define proper interfaces.
> That works for static problem domains.  If Lucene is resolved to  
> only make bugfix releases, and not to substantially evolve its  
> feature set, then this might be appropriate.
>> IndexReader and IndexWriter should have been interfaces. If they  
>> were, lots of the code would not have been structured as it was,  
>> and many problems people had in producing "other" implementations  
>> could have been avoided.
> The problem is not that they were not interfaces, but that they  
> were not originally intended to be abstract and replaceable.  The  
> original design was that indexing would be the primary  
> implementation that Lucene provided, and that things around  
> indexing would be extensible, but that indexing itself would not  
> be.  Extensibility was retrofitted onto an existing design, and it  
> still shows some.
> If IndexReader and IndexWriter were originally written to be  
> extensible it would have been foolish to implement them as  
> interfaces given the amount that these have evolved.  Each release  
> would have broken every application.
>> As for future expansion, it is improbable in most cases that  
>> adding new abstract methods will work - if that is the case, they  
>> can easily be added to a static utility class. If the API is  
>> really changing/adding, it is easy to create 'interfaceV2 extends  
>> interfaceV1'. If the code worked before, and you want to support  
>> backwards code compatibility between versions, this is a fool  
>> proof way to accomplish it.
> This is not foolproof.  Not all extension is the addition of new  
> methods.  In Hadoop, for example, we wish to move from Mapper#map 
> (key, value, reporter) to Mapper#map(MapContext), where MapContext  
> has getKey(), getValue(), getReporter() and other methods.  If  
> Mapper were an abstract class, back-compatibility would be easy,  
> since we could provide a default implementation of Mapper#map 
> (MapContext) that calls Mapper#map(key, value, reporter).  With  
> interfaces things are much more complicated, since, for back- 
> compatibility, we must support both versions of the interface for a  
> time, dynamically determining what version of the interface the  
> application has specified and calling it accordingly.  This is ugly  
> code that we could have avoided if we'd stuck to abstract classes.   
> And the impact is not only where the Mapper is run, but also where  
> it is specfied (JobConf).  So instead of localizing the change to  
>, we have to add lots of runtime support and public API  
> methods in other classes.  Yuck.
> On the other hand, Hadoop's FileSystem is an abstract class.  It  
> has evolved considerably and applications have been able to upgrade  
> without pain.  Lucene's Directory has also evolved profitably  
> without breaking external Directory implementations.
> Interfaces look elegant, but looks can deceive.
> Doug
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message