mahout-dev mailing list archives

From David Hall
Subject Re: SVM algo, code, etc.
Date Wed, 16 Dec 2009 06:32:23 GMT
On Fri, Dec 11, 2009 at 5:02 AM, Jake Mannix wrote:
> I really feel like I should respond to this, but seeing as I live on the
> west coast of the US, going to bed might be more advisable.
> On a very specific topic of SVMs, I can certainly look into this, but David,
> were you interested in helping bring this into Mahout and maintain it?
> You are often rather quiet on here, yet happened to jump in as this topic
> came up?

Yeah, the first semester of the PhD program has been far busier
than I imagined, and I've been overwhelmed. (Now it's finals week.)

Online optimization has kind of caught my eye of late, with Pegasos
being something I had been thinking about implementing. I would be
glad to get this up and running, though I'd like more to help curate a
patch. Pegasos or no.

-- David

>  -jake
> On Fri, Dec 11, 2009 at 4:40 AM, Sean Owen wrote:
>> This is a timely message, since I'm currently presuming to close some
>> old Mahout issues at the moment and it raises a related concern.
>> There are lots of old JIRA issues of the form:
>> 1) somebody submits a patch implementing part of something
>> 2) some comments happen, maybe
>> 3) nothing happens for a year
>> 4) I close it now
>> At an early stage, this is fine actually. 20 people contribute at the
>> start; 3 select themselves naturally as regular contributors. 20
>> patches go up; the 5 that are of use and interest naturally get picked
>> up and eventually committed. But going forward, this probably won't
>> do. Potential committers get discouraged and work goes wasted. (See
>> comments about Commons Math on this list for an example of the
>> fallout.)
>> I wonder what the obstacles are to avoiding this?
>> 1) Do we need to be clearer about what the project is and isn't about?
>> What the priorities are, what work is already on the table to be done?
>> This is why I am keen on cleaning up JIRA now; it's hard for even us
>> to understand what's in progress, what's important.
>> 2) Do we need some more official ownership or responsibility for
>> components? For example I am not sure who would manage changes to
>> Clustering stuff. I know it isn't me; I don't know about that part. So
>> what happens to an incoming patch to clustering? While too much
>> command-and-control isn't possible or desirable in open source, lack
>> of it is harmful too. I don't think the answer is "just let people
>> commit bits and bobs" since it makes the project appear to be a
>> workbench of half-finished jobs, which does a disservice to the
>> components that are polished.
>> I have no reason to believe this SVM patch, should it materialize,
>> would fall through the cracks in this way, but want to ask now how we
>> can just make sure. So, can we answer:
>> 1) Is SVM in scope for Mahout? (I am guessing so.)
>> 2) Who is nominally committing to shepherd the code into the code base
>> and fix bugs and answer questions? (Jake?)
>> I'm not really bothered about this particular patch, but the more
>> general question.
