hadoop-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: The future of MapReduce
Date Wed, 02 Jul 2014 20:12:06 GMT
My personal thoughts on this.

I approach this problem in a different way. Map/Reduce is not a framework
or a technology. It is a paradigm for distributed and parallel processing
that can be implemented in different frameworks and styles. Given that,
I don't think there is any harm in learning this paradigm, as it
does not bind you to any specific framework or tool. What I mean to say is
that it is a general concept. You can pick any implementation and explore.
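
To make the "general concept" point concrete, here is a minimal sketch of the
paradigm in plain Python, with no Hadoop involved. It is a toy word count that
runs in a single process. The function names (map_phase, shuffle_phase,
reduce_phase, word_count) are my own illustrative choices, not any framework's
API, and a real implementation would distribute each phase across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key; a framework does this between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order, in a single process, for illustration only."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big elephant"]))
```

The same three steps (emit key/value pairs, group by key, fold each group) are
what carries over from one implementation to another; only the plumbing around
them changes.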

Continuing on that, given the major breakthrough it was in the Big
Data world, even if it is eventually phased out, understanding it and looking
into it is still very beneficial. It can improve your understanding of this
relatively new field of Big Data and help build a solid
foundation. Given its seminal nature and scope, I would not consider it a
waste of time.

Lastly, and this is an even more personal way of looking at the problem: if one
approaches M/R as a generic concept rather than a tool, then it is not
that difficult or time-consuming to learn or understand.

On the point of whether Google is the *only* one dealing with seriously
large amounts of data, I would say not only that Facebook will catch up
pretty soon, but also that one should take a look at this interview with
Chris Mattmann from NASA :)

"...on Big Data Infrastructure for Scientific Data Processing"


On Tue, Jul 1, 2014 at 5:43 PM, snehil wakchaure <snehil.w@gmail.com> wrote:

> Heard about Google dataflow from last week
> On Jul 1, 2014 4:42 PM, "Marco Shaw" <marco.shaw@gmail.com> wrote:
>> Interesting timing:
>> http://java.dzone.com/articles/there-future-mapreduce
>> Google declared last week that "MapReduce was dead" more or less, but
>> there are very few that process data at Google's level.
>> Makes me wonder what Yahoo has for a tech mix these days...
>> On Tue, Jul 1, 2014 at 6:01 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>>   It was a declarative statement designed to elicit further explanation.
>>> If someone is brand new and trying to figure out how to eat the elephant
>>> as it were, you kind of want to boil things down to their essentials. If
>>> MapReduce isn’t going to be part of the ecosystem in the future, one does
>>> not want to spend hours learning how to write MapReduce jobs.
>>> B.
>>>  *From:* Marco Shaw <marco.shaw@gmail.com>
>>> *Sent:* Tuesday, July 01, 2014 3:50 PM
>>> *To:* user <user@hadoop.apache.org>
>>> *Subject:* Re: The future of MapReduce
>>>  Sorry, not sure if that's a question.
>>> Hadoop v1=HDFS+MapReduce
>>> Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
>>> optional to "get work done")
>>> v2 adds a better resourcing framework.  Now you can run Storm, Spark,
>>> MapReduce, etc. on Hadoop and mix-and-match jobs/tasks with whatever your
>>> requirements, which may actually be both batch "stuff" and/or real-time.
>>> Not sure if that clarifies things...  Just like you can evaluate all
>>> kinds of Apache ecosystem products to meet your needs, MapReduce is no
>>> longer the only kid on the block.
>>> On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
>>> adaryl.wakefield@hotmail.com> wrote:
>>>>   From your answer, it sounds like you need to be able to do both.
>>>>  *From:* Marco Shaw <marco.shaw@gmail.com>
>>>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>>>> *To:* user <user@hadoop.apache.org>
>>>> *Subject:* Re: The future of MapReduce
>>>>  It depends...  It seems most are evolving from needing "lots of data
>>>> crunched", to "lots of data crunched right now".  Most are looking for
>>>> *real-time* fraud detection or recommendations, for example, which
>>>> MapReduce is not ideal for.
>>>> Marco
>>>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>>>> adaryl.wakefield@hotmail.com> wrote:
>>>>>   “The Mahout community decided to move its codebase onto modern data
>>>>> processing systems that offer a richer programming model and more efficient
>>>>> execution than Hadoop MapReduce.”
>>>>> Does this mean that learning MapReduce is a waste of time? Is Storm
>>>>> the future or are both technologies necessary?
>>>>> B.
