flume-user mailing list archives

From Arvind Prabhakar <arv...@apache.org>
Subject Re: New production setup
Date Fri, 18 May 2012 15:54:02 GMT
Hi Simon,

The wiki page is dated, to say the least. At the moment there are many
active deployments of Flume NG that are in staging if not production. I
encourage you to look at the performance numbers that were recently
published on the wiki [1].

The use case you have described seems like something that Flume should be able to
handle very easily. I encourage you to look at the log4j appender,
Memory/File channels and the HDFS event sink. Of course you could plan on
using other components as well if this does not fit well with your
application.
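
As a rough sketch of how those components could be wired together (not a
tested setup: the agent name, component names, hostnames, port, and paths
below are all placeholders I have made up for illustration), the web
servers would ship events through the log4j appender to an Avro source,
and the agent would buffer them in a File channel before writing to HDFS:

```properties
# --- log4j.properties on each web server (hypothetical host and port) ---
log4j.rootLogger = INFO, flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = flume-agent.example.com
log4j.appender.flume.Port = 41414

# --- flume.conf on the agent ("agent", "avroSrc", etc. are made-up names) ---
agent.sources = avroSrc
agent.channels = fileCh
agent.sinks = hdfsSink

# Avro source that receives events sent by the log4j appender
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 41414
agent.sources.avroSrc.channels = fileCh

# File channel: events are persisted to local disk until the sink commits them
agent.channels.fileCh.type = file
agent.channels.fileCh.checkpointDir = /var/flume/checkpoint
agent.channels.fileCh.dataDirs = /var/flume/data

# HDFS sink: plain data stream files, rolled every 5 minutes (assumed value)
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = fileCh
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com:8020/flume/audit
agent.sinks.hdfsSink.hdfs.fileType = DataStream
agent.sinks.hdfsSink.hdfs.rollInterval = 300
agent.sinks.hdfsSink.hdfs.rollSize = 0
agent.sinks.hdfsSink.hdfs.rollCount = 0
```

Since you need all data to reach HDFS, the File channel is probably the
better fit here: the Memory channel is faster, but it loses any buffered
events if the agent process dies.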

[1]
https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements

Thanks,
Arvind Prabhakar

On Fri, May 18, 2012 at 4:58 AM, Simon Kelly <simongdkelly@gmail.com> wrote:

> Hi
>
> I'm interested in using Flume to store audit logs in HDFS which can then
> be queried with Hive. I see that the links on the Flume page point to Flume
> NG which says it's not ready for production use yet. Is that still the case?
>
> Our use case would likely look something like this:
>
>    - 15 servers running a Java web server and logging audit data (1-2K
>    per event, 20-90 events per second per server)
>    - Hadoop running on 5 machine cluster (4x2.4GHz processors, 8GB RAM,
>    8TB total storage)
>
> It's important that all data makes it into HDFS.
>
> I'd appreciate any comments on how to proceed with this.
>
> Best regards
> Simon Kelly
>
