mahout-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: [jira] Commented: (MAHOUT-206) Separate and clearly label different SparseVector implementations
Date Tue, 24 Nov 2009 20:31:28 GMT
Yes, I have lived with this pain for a long time in Lucene.  Personally, though, I think a lot
of the pain comes from a fairly strict backward-compatibility policy that, to me, isn't always
well founded given the release cycle Lucene usually operates under.  I've always wished there
were an @introducing annotation for interfaces, so that you could tell people what is coming
down the pike.

I also often feel the right answer is a combination of both.  New methods could be added on
a new interface that the abstract class then implements, so downstream implementors who extend
the abstract class inherit them.  People who don't extend the abstract class can choose to
implement the new interface if they see fit.
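
To make that concrete, here is a rough sketch of what I mean.  The names HasNorm, norm2(),
DenseVector and ConstantVector are made up for illustration, and the Vector/AbstractVector
here are trimmed-down stand-ins, not our actual classes:

```java
public class InterfacePlusAbstractSketch {

    // Stand-in for the already-published interface; it is left untouched.
    interface Vector {
        int size();
        double get(int index);
    }

    // Hypothetical new capability, introduced on its own interface
    // instead of being added to Vector.
    interface HasNorm {
        double norm2();
    }

    // The abstract base class picks up the new interface and supplies
    // a default implementation built only on the existing methods.
    abstract static class AbstractVector implements Vector, HasNorm {
        @Override
        public double norm2() {
            double sum = 0.0;
            for (int i = 0; i < size(); i++) {
                double v = get(i);
                sum += v * v;
            }
            return Math.sqrt(sum);
        }
    }

    // Extends the base class: inherits norm2() with no code change.
    static class DenseVector extends AbstractVector {
        private final double[] values;

        DenseVector(double... values) {
            this.values = values.clone();
        }

        @Override
        public int size() {
            return values.length;
        }

        @Override
        public double get(int index) {
            return values[index];
        }
    }

    // Implements Vector directly: still compiles, and can opt in to
    // HasNorm later if its author sees fit.
    static class ConstantVector implements Vector {
        private final int size;
        private final double value;

        ConstantVector(int size, double value) {
            this.size = size;
            this.value = value;
        }

        @Override
        public int size() {
            return size;
        }

        @Override
        public double get(int index) {
            return value;
        }
    }

    public static void main(String[] args) {
        Vector a = new DenseVector(3.0, 4.0);
        Vector b = new ConstantVector(2, 1.0);
        System.out.println(a instanceof HasNorm);   // true
        System.out.println(b instanceof HasNorm);   // false
        System.out.println(((HasNorm) a).norm2());  // 5.0
    }
}
```

The cost is that callers have to check for (or require) HasNorm rather than assuming every
Vector has it, but nobody's existing implementation breaks.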

For now, we don't have any backward-compatibility commitments.  I think we can decide on that
once we get to 0.9.


On Nov 24, 2009, at 3:21 PM, Jake Mannix wrote:

> Oof.
> 
> So you're arguing this as a temporary thing, until our interfaces stabilize?
> It makes unit testing much harder this way, but I guess I see the rationale.
> 
> If we do this, we need to leave a lot out of that base class - there may be
> some really big differences in implementation of these classes (for example:
> distributed / hdfs backed matrices vs locally memory-resident ones), so very
> very little should be assumed in the base impl.  I guess more can be done in
> the vector case, however.
> 
>  -jake
> 
> On Tue, Nov 24, 2009 at 12:08 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> 
>> Yes.  Interfaces are the problem that commons math have boxed themselves in
>> with.  The Hadoop crew (especially Doug C) are adamant about using as few
>> interfaces as possible except as mixin signals and only in cases where the
>> interface really is going to be very, very stable.
>> 
>> Our vector interfaces are definitely not going to be that stable for quite
>> a while.
>> 
>> On Tue, Nov 24, 2009 at 12:03 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>> 
>>> Well we do use AbstractVector.  Are you suggesting that we *not* have a
>>> Vector interface at all, and *only* have an abstract base class?
>>> Similarly for Matrix?
>>> 
>>> -jake
>>> 
>>> On Tue, Nov 24, 2009 at 11:57 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>>> 
>>>> We should use abstract classes almost everywhere instead of interfaces to
>>>> ease backward compatibility issues with user written extensions to Vectors
>>>> and Matrices.
>>>> 
>>>> On Tue, Nov 24, 2009 at 9:38 AM, Grant Ingersoll (JIRA) <jira@apache.org> wrote:
>>>> 
>>>>> It seems like there is still some commonality between the two
>>>>> implementations (size, cardinality, etc.) that I think it would be
>>>>> worthwhile to keep SparseVector as an abstract class which the other two
>>>>> extend.
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Ted Dunning, CTO
>>>> DeepDyve
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Ted Dunning, CTO
>> DeepDyve
>> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search

