commons-dev mailing list archives

From Benson Margulies
Subject Re: [math] getting changes included into commons-math (was Re: Home for the colt fork)
Date Wed, 09 Dec 2009 21:17:31 GMT
I can see how everyone ends up with a headache here.

As the person who threw the most recent rock into the lake, let me
re-present the situation as I got into it.

CERN Colt is a library with a mixture of 'category A' material and
'category B-or-worse' material. In other words, it is not an
attractive dependency for ASF code as a lump.

Part of the 'category A' code is a set of very pretty associative
containers for primitive types. My goal is to take this code and
modernize it to use Generics as appropriate -- and otherwise make it a
replacement for the LGPL Trove library.

The associative code uses some math code. I'm pretty sure that all of
the math code that it uses is also category A. In fact, I believe that
all of it is in the portion that Mahout forked. This includes some
things that aren't in commons-math at all.

So, we have some adoptable code in Colt that overlaps functionality in
-math, some that doesn't, and some that I want to use as the basis for
work in -primitives.

If the messages in this thread mean that the code already in -math is
in fact moving in a direction to support mahout, then it might be
acceptable for the additional, non-overlapping, Colt math code to end
up in -math, and then everyone ends up happy?

Another question: OK, -math is a stable, high-compatibility library.
Mahout, on the other hand, wants to be a fast-moving, somewhat fluid,
build of code (naturally including some math code) adapted for
map-reduce. So, how about a branch of math that could release
frequently with a new major version number? Obviously, from a work
standpoint, this would depend on some of the Mahouts achieving
committer status in commons.

On Wed, Dec 9, 2009 at 3:46 PM, Luc Maisonobe wrote:
> Jake Mannix wrote:
>> On Wed, Dec 9, 2009 at 11:09 AM, Benson Margulies wrote:
>>> This is interesting. We have a raft of mathematically qualified
>>> committers on Mahout, and this message asking for help on
>>> commons-math, and a raft of code marooned at mahout that wants to be
>>> in commons math. If I were one of those mathematically competent
>>> individuals, I'd be off attaching a patch or three to a JIRA or two
>> The commons-math linear APIs have been described as effectively locked
>> until 3.0, due to back-compat requirements.  This means that any code
>> contributed into c-math would live in parallel (no pun intended) to the
>> linear primitives which exist already in there.
>> Adopting something like MTJ or Colt in Mahout turned out to be easier,
>> because we are on release 0.2 (heading for 0.3 now), and have less
>> stringent back-compat requirements, so we are overhauling our linear
>> APIs (read: even user-facing interface changes) to take advantage of
>> useful parts of Colt, and are planning on using our Colt fork as the
>> underlying implementation.
>> Commons-math expressed that changing linear APIs is not something they
>> can do, given the maturity of their library, so where would Colt *go*
>> in c-math?  Its own submodule, having its own eigendecompositions and
>> SVD and so forth, running parallel to the current c-math impls?  Why?
>> Who would maintain it and write tests for it, and how do you explain to
>> end-users which they should use?
>>> On Wed, Dec 9, 2009 at 1:48 PM, Luc Maisonobe wrote:
>>>> Ted Dunning wrote:
>>>>> Actually, the reason that we have Colt in Mahout is it has proven
>>>>> impossible to get changes into commons math.  We really, really
>>>>> wanted to use commons math rather than have our own linear algebra
>>>>> package, but it just proved impossible and we didn't want to wait
>>>>> forever.
>>>> If you really, really want to use commons math and want changes to be
>>>> included, contribute them.
>> I have submitted patches for the following tickets: MATH-312 (and acceptance
>> of that patch blocks my patch for MATH-314), MATH-316 and MATH-317, none
>> of which appear to have made much progress.  All of my patches come
>> with unit tests for new functionality.
> I had these patches in my backlog and considered them accepted. I should
> have committed them before, sorry for that. I'll take care of them right now.
>> On the other hand, when I opened the discussion about extending the
>> functions package to enable composable functions (MATH-313), I got an
>> entirely hostile response, which only tempered as far as "+0" on adding
>> it after discussion.
> The discussion was not entirely hostile, as we did reach some intermediate
> consensus at some points. I understand your feelings after several
> patches that did not get committed fast enough.
> Please accept my apologies for this.
> Luc
>> In particular, my first step at making commons-math something Mahout could
>> standardize on for linear work was MATH-312, which I did submit a patch for,
>> and revised it many times after discussion about what is acceptable practice
>> in c-math.  Not yet applied, months later.  It's probably far out of date
>> now...
>> Similarly, when I tried to ask what the status was on decisions on whether
>> to adopt MTJ or Colt, the statement by Phil was basically that commons-math
>> would not adopt anything which had any external dependencies or
>> not-easily-human-readable java source (which ruled out MTJ because of
>> f2j-produced code), and which had to be fully tested and maintained prior
>> to adoption (which rules out Colt, which has no unit tests yet).
>> Ted and I weren't making "requests" for other people to do work, we were
>> wondering whether even offers to do some of the work would be accepted,
>> and for many of the questions/suggestions we had, it seems the desires
>> and requirements of the Mahout community were incompatible with those
>> of commons-math.
>>   -jake
>>>> I think the only change that was proposed and not done because of lack
>>>> of consensus was the inclusion of MTJ (and I don't consider the
>>>> discussion closed on that topic either, so it may still happen some
>>>> day). All the other changes that are desired are simply lacking someone
>>>> to do the work. There were proposals to extend the linear algebra API,
>>>> proposals to add more support for sparse matrices, proposals to get
>>>> partial decomposition ... But sparse contributions (pun intended).
>>>> I try to do what I can, but as you have probably seen, I have been rather
>>>> silent since the 2.0 release. For my part, I really, really need help. I
>>>> would like to fix the problems in the eigen decomposition and SVD but
>>>> need a good kick to get on it, and having only requests and no help is
>>>> not really motivating.
>>>> Luc
>>>>> If that problem were solved, then it would be great to depend on commons
>>>>> math.  If that problem isn't solved, then there is no way to depend on
>>>>> commons math.
>>>>> On Tue, Dec 8, 2009 at 6:19 AM, Benson Margulies wrote:
>>>>>> We can't possibly have a dependency on Mahout in the long term. Either
>>>>>> we all go shares on code in some other piece of commons, or we end
>>>>>> with two forks, which would be sad.
>>>>>> On Tue, Dec 8, 2009 at 8:33 AM, James Carman wrote:
>>>>>>> I wouldn't like to see a dependency on mahout code in a "commons"
>>>>>>> library.  That seems kind of backwards.  If Mahout wants to
>>>>>>> this stuff, we can move it into a library in commons (which is
>>>>>>> typically how stuff used to happen in Jakarta).
>>>>>>> On Tue, Dec 8, 2009 at 8:18 AM, Benson Margulies wrote:
>>>>>>>> Mahout has now forked part of the 'category A' portion of the
>>>>>>>> CERN Colt library. The Mahout fork is, of course, in the Mahout
>>>>>>>> tree under a Mahout Java package and Maven triple.
>>>>>>>> I want to use the collections classes from Mahout as the core to a
>>>>>>>> new set of commons-primitives classes that do the useful things
>>>>>>>> that GNU Trove does.
>>>>>>>> The classes I want to start from depend on the classes that are in
>>>>>>>> the Mahout fork.
>>>>>>>> As a temporary expedient, I can depend on them there. However, I
>>>>>>>> submit that it would be better if the mathematical code were in
>>>>>>>> commons-math. Was this option explored?
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail:
>>>>>>>> For additional commands, e-mail: