hadoop-mapreduce-user mailing list archives

From Peyman Mohajerian <mohaj...@gmail.com>
Subject Re: The future of MapReduce
Date Wed, 16 Jul 2014 16:36:21 GMT
This statement is inaccurate. Not all machine learning involves iterative
computation, and not all datasets can fit in memory. I'm not an expert in
machine learning, but I know enough to know that talking about it in some
generic sense, from the standpoint of Spark vs. Mahout or R vs. Python, makes
no sense. Many machine learning algorithms involve creating models from
massive amounts of data, and in no context would it make sense to do that
in memory.
Also, people do map/reduce in memory; Shahab elaborated on that nicely later
in the same thread.


On Tue, Jul 1, 2014 at 2:17 PM, kartik saxena <kartik.sxn@gmail.com> wrote:

> Spark https://spark.apache.org/ is also getting a lot of attention with its
> in-memory computations and caching features. Performance-wise, it is being
> touted as better than Mahout because machine learning involves iterative
> computations and Spark can cache these computations in memory for faster
> processing.
>
>
> On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   From your answer, it sounds like you need to be able to do both.
>>
>>  *From:* Marco Shaw <marco.shaw@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>> *To:* user <user@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>
>>  It depends...  It seems most are evolving from needing "lots of data
>> crunched" to "lots of data crunched right now".  Most are looking for
>> *real-time* fraud detection or recommendations, for example, for which
>> MapReduce is not ideal.
>>
>> Marco
>>
>>
>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   “The Mahout community decided to move its codebase onto modern data
>>> processing systems that offer a richer programming model and more efficient
>>> execution than Hadoop MapReduce.”
>>>
>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>> future or are both technologies necessary?
>>>
>>> B.
>>>
>>
>>
>
>
