mahout-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Cloudera announces Oryx
Date Tue, 12 Nov 2013 13:46:08 GMT
I think I'm the biggest single contributor to Mahout over time (? was
at one point), and so by extension Cloudera is. And this new project
is all open source. Surely that's maximally "walking the walk" in
these regards?

Mahout has served well for a long time as measured in Hadoop-years --
like 4+ years. It's still in usable life. I don't think the current
state of the code means it's feasible to truly evolve it towards
things like Hadoop 2, Spark, real-time. That is to say, there are
legitimate reasons to start forward from a new project with different
goals.

CDH5 still supports Mahout for sure. Oryx will work on any Hadoop (2)
distro. I hope there is no openness foul here.



On Tue, Nov 12, 2013 at 1:04 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> Sean writes:
>
>> We release Oryx today -- get some.
>> #cloudera <https://plus.google.com/s/%23cloudera>
>> #oryx <https://plus.google.com/s/%23oryx>
>> The Oryx open source project provides simple, real-time large-scale
>> machine learning infrastructure. It implements a few classes of algorithm
>> commonly used in business applications: collaborative filtering /
>> recommendation, classification / regression, and clustering. It can
>> continuously build models from a stream of data at large scale using Apache
>> Hadoop's MapReduce. It also serves queries of those models in real-time via
>> an HTTP REST API, and can update models approximately in response to new
>> data. Models are exchanged in PMML format.
>
>
>
> I personally find it a pity that Cloudera talks the open source talk, but
> doesn't walk the walk by contributing to, for example, Mahout.
>
> Their decision.
>
> Sean's decision as well, I guess.
