hadoop-common-user mailing list archives

From Peyman Mohajerian <mohaj...@gmail.com>
Subject Re: The future of MapReduce
Date Wed, 16 Jul 2014 16:36:21 GMT
This statement is inaccurate. Not all machine learning involves iterative
computation, and not all datasets can fit in memory. I'm not an expert in
machine learning, but I know enough to know that talking about it in some
generic sense, from the standpoint of Spark vs. Mahout or R vs. Python, makes no
sense. Many machine learning algorithms involve creating models from
massive amounts of data, and in no context would it make sense to do it
Also, people do map/reduce in-memory; Shahab elaborated on that nicely later
in the same thread.

On Tue, Jul 1, 2014 at 2:17 PM, kartik saxena <kartik.sxn@gmail.com> wrote:

> Spark https://spark.apache.org/ is also getting a lot of attention with its
> in-memory computation and caching features. Performance-wise, it is being
> touted as better than Mahout because machine learning involves iterative
> computations, and Spark can cache these computations in memory for faster
> processing.
> On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>>   From your answer, it sounds like you need to be able to do both.
>>  *From:* Marco Shaw <marco.shaw@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>> *To:* user <user@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>  It depends...  It seems most are evolving from needing "lots of data
>> crunched", to "lots of data crunched right now".  Most are looking for
>> *real-time* fraud detection or recommendations, for example, which
>> MapReduce is not ideal for.
>> Marco
>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>>   “The Mahout community decided to move its codebase onto modern data
>>> processing systems that offer a richer programming model and more efficient
>>> execution than Hadoop MapReduce.”
>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>> future or are both technologies necessary?
>>> B.
