hadoop-hdfs-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: Hadoop Real time help
Date Mon, 20 Aug 2012 07:37:19 GMT
The terms are
* ESP : http://en.wikipedia.org/wiki/Event_stream_processing
* CEP : http://en.wikipedia.org/wiki/Complex_event_processing

By the way, processing streams in real time tends toward being a pleonasm.

MapReduce follows a batch architecture: you accumulate data until a given
time, then process everything, and only at the end do you provide all the
results. Stream processing has by definition a 'smoother' throughput. Events
are processed one at a time, and potentially each processing could lead to a
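
To make the batch versus stream contrast concrete, here is a minimal sketch in
plain Python. It is not tied to Hadoop or to any ESP engine, and the function
names batch_total and stream_totals are made up for illustration only.

```python
from typing import Iterable, Iterator, List


def batch_total(events: Iterable[int]) -> int:
    """Batch style: keep all events first, then process everything at once."""
    stored: List[int] = list(events)  # nothing is emitted while data accumulates
    return sum(stored)                # a single result, available only at the end


def stream_totals(events: Iterable[int]) -> Iterator[int]:
    """Stream style: process each event on arrival and emit an updated result."""
    running = 0
    for event in events:
        running += event
        yield running                 # one output per input event


if __name__ == "__main__":
    readings = [3, 1, 4]
    print(batch_total(readings))          # 8
    print(list(stream_totals(readings)))  # [3, 4, 8]
```

The batch function returns nothing until it has seen every event, while the
generator hands back an up-to-date result after each one, which is roughly
where the latency difference comes from.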

I don't know of any complete overview of such tools.
Esper is well known in that space.
FlumeBase was an attempt to do something similar (as far as I can tell).
It shows how an ESP engine fits with log collection using a tool such as

Then you also have other solutions which will allow you to scale such as
A few people have already considered using Storm for scalability and Esper
to do the real computation.



On Sun, Aug 19, 2012 at 9:44 PM, Niels Basjes <niels@basj.es> wrote:

> Is there a "complete" overview of the tools that allow processing streams
> of data in realtime?
> Or even better; what are the terms to google for?
> --
> Best regards,
> Niels Basjes
> (Sent from mobile)
> On 19 Aug 2012 at 18:22, "Bertrand Dechoux" <dechouxb@gmail.com> wrote:
>> That's a good question. More and more people are talking about Hadoop Real
>> Time.
>> One key aspect of this question is whether we are talking about MapReduce
>> or not.
>> MapReduce greatly improves the response time of any data-intensive job
>> but it is still a batch framework with a noticeable latency.
>> There are multiple ways to improve the latency:
>> * ESP/CEP solutions (like Esper, FlumeBase, ...)
>> * Big Table clones (like HBase ...)
>> * YARN with a non MapReduce application
>> * ...
>> But it will really depend on the context and the definition of 'real
>> time'.
>> Regards
>> Bertrand
>> On Sun, Aug 19, 2012 at 5:44 PM, mahout user <mahoutuser@gmail.com> wrote:
>>> Hello folks,
>>>    I am new to Hadoop. I would like to know how the Hadoop framework
>>> can be useful for real-time services. Can anyone explain?
>>> Thanks.
>> --
>> Bertrand Dechoux

Bertrand Dechoux
