mahout-dev mailing list archives

From Dmitriy Lyubimov <>
Subject Re: Helping out on spark efforts
Date Wed, 30 Apr 2014 18:40:05 GMT
On Wed, Apr 30, 2014 at 10:53 AM, Dmitriy Lyubimov <>wrote:

> +1.
> And the greatest benefit of the data frames work is standardization of feature
> extraction in Mahout, not necessarily any particular algorithms. This has
> historically been the thorniest issue, and nobody does it well today as it
> stands.

Correction: nobody does it well in open source and in a distributed way, that is.

> If we tackle feature prep techniques in an engine-agnostic way, this would be
> a truly unique differentiation factor for Mahout.
> On Wed, Apr 30, 2014 at 7:52 AM, Sebastian Schelter <>wrote:
>> I think you should concentrate on MAHOUT-1490, that is a highly important
>> task that will be the foundation for a lot of stuff to be built on top.
>> Let's focus on getting this thing right and then move on to other things.
>> --sebastian
>> On 04/30/2014 04:44 PM, Saikat Kanjilal wrote:
>>> Sebastian/Dmitriy, in looking through the current list of issues I didn't
>>> see other algorithms in Mahout that are talked about being ported to Spark.
>>> I was wondering if there's any interest/need in porting or writing things
>>> like LR/KMeans/SVM to use Spark. I'd like to help out in this area while
>>> working on 1490. Also, are we planning to port the distributed versions of
>>> Taste to use Spark as well at some point?
>>> Thanks in advance.
