mahout-user mailing list archives

From Dmitriy Lyubimov <>
Subject Re: Mahout ML vs Spark Mlib vs Mahout-Spark integreation
Date Wed, 01 Feb 2017 23:32:24 GMT
Isabel, if I understand it correctly, you are asking whether it makes sense
to add end-to-end scenarios based on Samsara to the current codebase?

The answer is, absolutely. Yes, it does, both for rather isolated issues
(like computing clusters) and for end-to-end scenarios.

The only problem with end-to-end scenarios is that they are often difficult to
demonstrate with a batch-oriented computational system only. That's kind of
what was picked on with COO: they included all of data ingestion, computation
and real-time scoring queries.

But yes, there's, absolutely, tons of value in that. Not everything fits
quite nicely, and not everything fits end-2-end (just like with R), but
some fairly significant pieces do fit to be written on top.

> > > perspective? If so, would there be interest among the Mahout committers
> > > to help users publicly create docs/examples/modules to support these
> > > use cases?
> > >
> >
> > yes
> Where do we start? ;)

I would start with figuring out a problem I want to solve AND I have a budget
to do it AND I can legally contribute on behalf of the IP owner.

Then we can think about whether it is a good fit (Samsara is mostly limited to
tensor-based data only, just like the MapReduce DRM was/is). Some things may
not have a convenient algebraic formulation.

