flume-user mailing list archives

From Phil Scala <Phil.Sc...@globalrelay.net>
Subject RE: Flume-ng 1.6 reliable setup
Date Thu, 15 Oct 2015 17:59:39 GMT
Hi Simone

I wonder why you're seeing 90% CPU use when you use a file channel.  I would expect high
disk I/O instead.  As a counterpoint: on a single SSD-based server I have 4 spool dir sources,
each going to a separate file channel, and I do not see any notable CPU or disk I/O utilization.
I am pushing about 10 million events per day across all 4 sources, and it has been running
reliably for 2 years now.
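For reference, each of those source/channel pairs looks roughly like this (names and paths below are placeholders, not my actual config):

# sketch only - placeholder names/paths, sink definitions omitted
agent.sources = src1 src2 src3 src4
agent.channels = ch1 ch2 ch3 ch4

agent.sources.src1.type = spooldir
agent.sources.src1.spoolDir = /var/spool/flume/src1
agent.sources.src1.channels = ch1

agent.channels.ch1.type = file
agent.channels.ch1.checkpointDir = /data/flume/ch1/checkpoint
agent.channels.ch1.dataDirs = /data/flume/ch1/data

# ...and the same pattern repeated for src2/ch2, src3/ch3, src4/ch4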

I would always use a file channel; any memory channel runs the risk of data loss if the node
were to fail.  I would be more worried about the local node failing, seeing that a 3-node Kafka
cluster would have to lose 2 nodes before it lost quorum.

Not sure what your data source is; if you can add more Flume nodes, of course that would help.

Have you given it ample heap space? Maybe GCs are causing the high CPU.
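If not, it's worth bumping the heap in conf/flume-env.sh and enabling GC logging to confirm; something like the following (the sizes here are only examples):

# conf/flume-env.sh
export JAVA_OPTS="-Xms1g -Xmx2g -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/flume/gc.log"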


Phil


From: Simone Roselli [mailto:simoneroselli78@gmail.com]
Sent: Friday, October 09, 2015 12:33 AM
To: user@flume.apache.org
Subject: Flume-ng 1.6 reliable setup

Hi,

I'm currently planning to migrate from Flume 0.9 to Flume-ng 1.6, but I'm having trouble finding
a reliable setup for the latter.

My sink is a 3-node Kafka cluster. I must avoid losing events in case the main sink is down,
broken, or unreachable for a while.

In Flume 0.9, I use a memory channel with the store-on-failure feature, which starts writing
events to the local disk in case the target sink is not available.

In Flume-ng 1.6 the same behaviour would be accomplished by setting up a Spillable Memory
Channel, but the problem with this solution is stated at the end of the channel's description:
"This channel is currently experimental and not recommended for use in production."

In Flume-ng 1.6, it's also possible to set up a pool of failover sinks. So I was thinking of
configuring a File Roll sink as the secondary, to be used when the primary is down. However, once
the primary sink comes back online, the data placed on the secondary sink (local disk) won't be
automatically pushed to the primary one.
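What I had in mind was roughly the following (broker list, topic and paths are placeholders):

agent.sinks = kafka-sink fallback-sink
agent.sinkgroups = sg1
agent.sinkgroups.sg1.sinks = kafka-sink fallback-sink
agent.sinkgroups.sg1.processor.type = failover
agent.sinkgroups.sg1.processor.priority.kafka-sink = 10
agent.sinkgroups.sg1.processor.priority.fallback-sink = 5
agent.sinkgroups.sg1.processor.maxpenalty = 10000

agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafka-sink.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
agent.sinks.kafka-sink.topic = events
agent.sinks.kafka-sink.channel = ch1

agent.sinks.fallback-sink.type = file_roll
agent.sinks.fallback-sink.sink.directory = /data/flume/failover
agent.sinks.fallback-sink.channel = ch1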

Another option would be setting up a file channel: write each event to disk and then sink it.
Leaving aside the fact that I don't love the idea of continuously writing/deleting every single
event on an SSD, this setup takes 90% of the CPU. The exact same configuration with a memory
channel takes 3%.
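The file channel in that test is essentially the ch1 referenced above, defined like this (capacity values here are only examples):

agent.channels = ch1
agent.channels.ch1.type = file
agent.channels.ch1.checkpointDir = /data/flume/ch1/checkpoint
agent.channels.ch1.dataDirs = /data/flume/ch1/data
agent.channels.ch1.capacity = 1000000
agent.channels.ch1.transactionCapacity = 10000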

Any other solutions to evaluate?

Simone
