flume-user mailing list archives

From "Romero, Miguel" <miguel.romero-remi...@hp.com>
Subject Losing Events | SyslogTCP and Syslog Multi Port
Date Thu, 05 Mar 2015 09:42:49 GMT
Hi all,

I am testing with SyslogSourceTCP and MultiSyslogSourceTCP.
My problem is that the source loses syslog events, but whether it does depends on the tool that sends us the syslog messages.

• If I use a tool like QRadar, the source loses events (no exceptions and no trace in the log about the loss).
• If I use ad hoc software (a log4j appender, a cat | nc command, or …), no events are lost.

I have configured Flume in many different ways, but I always get the same result. Does the
syslog client have to be set up in a special way?
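
To narrow this down, it may help to send numbered, newline-terminated messages (much like the cat | nc case) and then check which sequence numbers reach the sink. The sketch below is only an illustration: the helper names, the "losstest" tag and the "seq=" marker are made up for this example, and it assumes the source treats a newline as the end of an event, which has not been confirmed in this thread.

```python
import socket


def build_message(seq, host="testhost", tag="losstest", pri=14):
    """Build one RFC 3164-style syslog line, terminated by a newline.

    The trailing newline is the assumed event delimiter; host, tag and
    the "seq=" marker are hypothetical values used only for this test.
    """
    line = "<{}>Mar  5 09:42:49 {} {}: seq={}\n".format(pri, host, tag, seq)
    return line.encode("ascii")


def send_messages(flume_host, port, count):
    """Send `count` numbered messages over a single TCP connection."""
    with socket.create_connection((flume_host, port)) as conn:
        for seq in range(count):
            conn.sendall(build_message(seq))


def missing_sequences(received_lines, count):
    """Return the sequence numbers in 0..count-1 that never arrived."""
    seen = set()
    for line in received_lines:
        _, sep, value = line.rpartition("seq=")
        value = value.strip()
        if sep and value.isdigit():
            seen.add(int(value))
    return sorted(set(range(count)) - seen)
```

Running send_messages against the source and feeding the sink output to missing_sequences shows whether this sender loses anything. If it does not, comparing a packet capture of the QRadar traffic with this one (for example, whether each message ends with a newline) would be a reasonable next step.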

Thanks.
https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements

From: Smaine Kahlouch [mailto:smaine.kahlouch@smartjog.com]
Sent: Thursday, March 5, 2015 10:11
To: user@flume.apache.org
Subject: Re: Syslog TCP performances issue with filechannel

Actually, the batchSize is configured at the sink level.
I didn't find this option for the file channel.

Furthermore, the source batchSize can't be configured because it is a syslog-ng tool, which
doesn't have this capability.
I tried with the "netcat" source and I see the same behaviour.

I guess you're right: for each event there's an fsync, which causes the heavy load on the disks.
However, I've read this page: https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements

And they obviously didn't have the same problem.

Regards,


--

Smaine Kahlouch - Engineer, Research & Engineering

Arkena | T: +33 1 5868 6196

27 Blvd Hippolyte Marquès, 94200 Ivry-sur-Seine, France

arkena.com

On 03/04/15 20:08, Hari Shreedharan wrote:
You should probably increase the batch size, since each batch causes an fsync which slows
things down.
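
As a rough illustration of why batch size matters here: if every committed batch costs one fsync, throughput is bounded by batch size times the number of fsyncs the disks can complete per second. The sketch below is back-of-the-envelope arithmetic only. The figure of 300 fsyncs per second is an assumption chosen to match the ~300 events/sec reported below with one event per commit, not a measured value, and the function name is hypothetical.

```python
def max_events_per_sec(batch_size, fsyncs_per_sec):
    """Upper bound on throughput when each committed batch costs one fsync."""
    return batch_size * fsyncs_per_sec


if __name__ == "__main__":
    # Assumed fsync rate (hypothetical): one event per commit at
    # ~300 events/sec implies roughly 300 fsyncs/sec.
    assumed_fsyncs_per_sec = 300
    for batch_size in (1, 100, 1000):
        bound = max_events_per_sec(batch_size, assumed_fsyncs_per_sec)
        print("batch size {:>4}: at most {} events/sec".format(batch_size, bound))
```

Under that assumption, a batch size of 100 raises the bound from 300 to 30,000 events/sec. The real figure will be lower once other costs are included.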

Thanks,
Hari


On Wed, Mar 4, 2015 at 6:28 AM, Smaine Kahlouch <smaine.kahlouch@smartjog.com> wrote:
Hi all,

I'm currently running benchmarks on Flume.
We're planning to use Flume with syslogtcp as the source and a filechannel in order to avoid
data loss.

The performance is quite good when a memorychannel is used:
~100,000 events/sec (event size = 600 bytes)

But as soon as I switch to the filechannel, the performance drops dramatically:
~300 events/sec

Besides this poor result, the behaviour is really strange because I see heavy disk usage
(on all the disks), near 100%.

I use a tool provided by syslog-ng to generate syslog logs: loggen (http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/loggen.1.html)

ex : loggen -i -I 3000000 --size 600 --active-connections 200 myflumehost 20515


Flume version : 1.5.2
Operating System : Centos 6

Please find my flume configuration enclosed. The filechannel is spread over 5 disks in order
to improve performance.


Could you please help me configure the syslogtcp source properly with the filechannel?

Regards,

--

Smaine Kahlouch - Engineer, Research & Engineering

Arkena | T: +33 1 5868 6196

27 Blvd Hippolyte Marquès, 94200 Ivry-sur-Seine, France

arkena.com
<flume.conf>




