flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Flume benchmarking with HTTP source & File channel
Date Sat, 14 Nov 2015 08:31:35 GMT
If that is with just a single server, 600 messages per second doesn't sound
bad to me.
Depending on the size of each message, the network could be the limiting
factor.

I would try the null sink together with an in-memory channel. If that
doesn't improve things, I would say you need more nodes to go beyond that.
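
As a rough sketch of that test, reusing the agent and source names from the
config quoted below (mem-channel1 and null-sink1 are names made up here for
illustration, and the capacities are simply copied from the file channel
settings, not tuned values):

```properties
# Test agent: HTTP source -> memory channel -> null sink.
# This takes both the disk (file channel) and Kafka out of the picture.
svcagent.sources = http-source
svcagent.sinks = null-sink1
svcagent.channels = mem-channel1

# Same HTTP source as before; the multiplexing selector is left out
# because there is only one channel to route to.
svcagent.sources.http-source.type = http
svcagent.sources.http-source.bind = 10.15.1.31
svcagent.sources.http-source.port = 5005
svcagent.sources.http-source.channels = mem-channel1

# In-memory channel (hypothetical name); events are lost on restart,
# so this is for benchmarking only.
svcagent.channels.mem-channel1.type = memory
svcagent.channels.mem-channel1.capacity = 50000
svcagent.channels.mem-channel1.transactionCapacity = 10000

# Null sink (hypothetical name) discards every event it takes.
svcagent.sinks.null-sink1.type = null
svcagent.sinks.null-sink1.channel = mem-channel1
```

If throughput stays around 600 requests/sec with this setup, the file channel
is probably not what is holding things back.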

Regards,
Gonzalo
On Nov 14, 2015 7:40 AM, "Hemanth Abbina" <HemanthA@eiqnetworks.com> wrote:

> Hi,
>
>
>
> We have been trying to validate and benchmark Flume performance for our
> production use.
>
>
>
> We have configured Flume with an HTTP source, a File channel and a Kafka sink.
>
> Hardware: 8 cores, 32 GB RAM, CentOS 6.5, 500 GB HDD.
>
> Flume configuration:
>
> svcagent.sources = http-source
> svcagent.sinks = kafka-sink1
> svcagent.channels = file-channel1
>
>
>
> # HTTP source to receive events on port 5005
> svcagent.sources.http-source.type = http
> svcagent.sources.http-source.channels = file-channel1
> svcagent.sources.http-source.port = 5005
> svcagent.sources.http-source.bind = 10.15.1.31
>
>
>
>
> svcagent.sources.http-source.selector.type = multiplexing
> svcagent.sources.http-source.selector.header = archival
> svcagent.sources.http-source.selector.mapping.true = file-channel1
> svcagent.sources.http-source.selector.default = file-channel1
> #svcagent.sources.http-source.handler = org.eiq.flume.JSONHandler.HTTPSourceJSONHandler
>
>
>
>
> svcagent.sinks.kafka-sink1.topic = flume-sink1
> svcagent.sinks.kafka-sink1.brokerList = 10.15.1.32:9092
> svcagent.sinks.kafka-sink1.channel = file-channel1
> svcagent.sinks.kafka-sink1.batchSize = 5000
>
>
>
>
> svcagent.channels.file-channel1.type = file
> svcagent.channels.file-channel1.checkpointDir=/etc/flume-kafka/checkpoint
> svcagent.channels.file-channel1.dataDirs=/etc/flume-kafka/data
> svcagent.channels.file-channel1.transactionCapacity=10000
> svcagent.channels.file-channel1.capacity=50000
> svcagent.channels.file-channel1.checkpointInterval=120000
> svcagent.channels.file-channel1.checkpointOnClose=true
> svcagent.channels.file-channel1.maxFileSize=536870912
> svcagent.channels.file-channel1.use-fast-replay=false
>
>
>
> When we streamed HTTP data from multiple clients (around 40 HTTP
> clients), we could reach a maximum of 600 requests/sec and nothing
> beyond that. We increased the Xmx setting of Flume to 4096.
>
>
>
> We have even tried a Null Sink (instead of the Kafka sink) and did not get
> much performance improvement. So we assume the bottleneck is the HTTP source
> and File channel.
>
>
>
> Could you please suggest any tuning to improve the performance of
> this setup?
>
>
>
>
> --regards
>
> Hemanth
>
