mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Is Mahout obsolete now?
Date Mon, 19 Oct 2015 22:29:46 GMT
BTW this use of Mahout-Samsara on Spark for recs has really expanded. The Samsara part is what
I’m calling a Correlation Engine; it can be used to mix usage, content, and context to make recs.
I look back on 2 years ago as pretty much groping around for solutions. Things are much clearer
now (for me at least).

Check out some slides about the math, leading to the “whole enchilada” equation. Ted Dunning,
Sean Owen, and Sebastian Schelter get no small credit.
http://www.slideshare.net/pferrel/unified-recommender-39986309
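
For anyone who has not opened the slides, here is a rough Python sketch of the correlation idea: find items in a secondary action (say, views) that co-occur with items in the primary action (say, purchases) more often than chance would predict. It assumes the correlation test is Dunning's log-likelihood ratio (LLR). The function names and the toy data are hypothetical, made up for illustration; this is not the Mahout-Samsara API and not the code in the slides.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table:
    # k11 = users with both items, k12 = only the first item,
    # k21 = only the second item, k22 = neither
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding
    return max(0.0, 2.0 * (row + col - mat))


def invert(events):
    # Turn {user: {items}} into {item: {users}}
    by_item = {}
    for user, items in events.items():
        for item in items:
            by_item.setdefault(item, set()).add(user)
    return by_item


def cross_occurrence(primary, secondary):
    # Score every (primary item, secondary item) pair that co-occurs
    # for at least one user. Pairs that never co-occur are skipped,
    # because LLR alone would also flag strong anti-correlation.
    n_users = len(set(primary) | set(secondary))
    primary_users = invert(primary)
    secondary_users = invert(secondary)
    scores = {}
    for a, users_a in primary_users.items():
        for b, users_b in secondary_users.items():
            k11 = len(users_a & users_b)
            if k11 == 0:
                continue
            k12 = len(users_a) - k11
            k21 = len(users_b) - k11
            k22 = n_users - k11 - k12 - k21
            scores[(a, b)] = llr(k11, k12, k21, k22)
    return scores


# Hypothetical toy data: who bought what, and who viewed what
purchases = {"u1": {"phone"}, "u2": {"phone"}, "u3": {"laptop"}, "u4": {"laptop"}}
views = {"u1": {"case"}, "u2": {"case"}, "u3": {"bag"}, "u4": {"bag"}}
scores = cross_occurrence(purchases, views)
```

Passing the same event map as both arguments gives plain item-to-item co-occurrence; passing a different secondary action is what lets extra kinds of usage and context feed the same recommendations. A real system would also threshold the scores and keep only the top few correlators per item.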

I even have code running on the PredictionIO framework. This includes an SDK, an event store,
and realtime queries. Loosely speaking, a lambda architecture. Most of the whole enchilada is running
except the content part of the equation, which only works on metadata for now.
https://github.com/pferrel/scala-parallel-universal-recommendation

We even do custom versions at actionML.com


On Oct 19, 2015, at 6:42 AM, Sean Owen <srowen@gmail.com> wrote:

No, this is pretty wrong. Spark is not, in general, a real-time
anything. Spark Streaming is a near-real-time streaming framework, but
it is not something you can build models with. Spark MLlib / ML are
offline / batch. Not sure what you mean by Hadoop engine, but Spark
does not build on MapReduce, if that's what you mean.

The "classic" Mahout code (<= 0.9) is definitely deprecated. The "new"
Mahout is not. It has a fairly different new recommender system called
Samsara. It has Scala APIs. In fact, it uses Spark. I think you're
somehow talking about the "classic" Mahout code here only.

On Mon, Oct 19, 2015 at 2:38 PM, Fei Shan <shanfeishanfei@gmail.com> wrote:
> Spark is an in-memory, near-realtime Machine Learning framework, has
> Scala and Java interfaces
> Mahout is an offline Machine Learning framework, no scala apis
> 
> they both built on the HDFS and Hadoop engine
> 
> Spark has an ecosystem like Hadoop
> Mahout is part of the Hadoop ecosystem
> 
> Spark could beat Mahout on processing speed
> and concise programming APIs
> 
> for online data analysis, Spark is a better choice.
> for offline data analysis, both fit well.
> 
> 
> 
> On Mon, Oct 19, 2015 at 9:14 PM, Prasad Priyadarshana Fernando <
> bppf16@gmail.com> wrote:
> 
>> Hi,
>> 
>> If I have used Mahout for my recommendation application, should I migrate
>> to Spark MLlib technology? Is Mahout still supported and maintained?
>> 
>> Thanks
>> 
>> *Prasad Priyadarshana Fernando <http://www.linkedin.com/in/prasadfernando>*
>> Mobile: +1 330 283 5827
>> 

