hadoop-common-user mailing list archives

From Vadim Zaliva <kroko...@gmail.com>
Subject Re: Using Hadoop for near real-time processing of log data
Date Wed, 25 Feb 2009 17:39:18 GMT
On Wed, Feb 25, 2009 at 05:59, Ryan LeCompte <lecompte@gmail.com> wrote:
> Hello all,
> Is anyone using Hadoop as more of a near/almost real-time processing
> of log data for their systems to aggregate stats, etc? I know that
> Hadoop has generally been good at off-line processing of large amounts
> of data, but I've wondered if anyone has tried using it for processing
> of near real-time log data as it appears in your systems with any
> success? My gut feeling is that Hadoop isn't suitable for this yet
> given redundancy issues around the JobTracker/NameNode, as well as the
> overhead of moving blocks around in HDFS. Thoughts?


Several people (myself included) have asked similar questions. You may want
to search the mailing list archives for previous discussions on the

In short, you are right: Hadoop is not perfectly suited for realtime

