flume-user mailing list archives

From Raymond Ng <raymond...@gmail.com>
Subject performance on RecoverableMemoryChannel vs JdbcChannel
Date Thu, 12 Jul 2012 14:55:31 GMT
Hi

I'm investigating whether I can use Flume to stream syslog data in a
production environment, and which channel will give me both durability
and performance.

I've tested the memory channel and the performance is good: with a 1GB
JVM it achieves 9000 events/sec, using one agent with a syslog source
that forwards to a second agent with an HDFS sink.
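
For anyone wanting to reproduce a figure like this, here is a minimal sketch of a load generator. It is not the tool used for the numbers above. The function names and the simplified syslog-style line format are assumptions made for illustration; only the host and port (localhost:10902) come from the syslog source config below. It reports the rate at which the source accepts events over TCP, which only approximates end-to-end throughput.

```python
import socket
import time


def syslog_line(seq, host="loadgen", tag="test"):
    """Build one simplified syslog-style line (hypothetical format).

    <13> encodes facility "user" and severity "notice".
    """
    return "<13>{} {}: event {}\n".format(host, tag, seq).encode("ascii")


def events_per_second(count, elapsed_seconds):
    """Convert an event count and a wall-clock duration into a rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return count / elapsed_seconds


def send_events(host, port, count):
    """Send `count` lines over one TCP connection and return the send rate."""
    start = time.monotonic()
    with socket.create_connection((host, port)) as conn:
        for seq in range(count):
            conn.sendall(syslog_line(seq))
    return events_per_second(count, time.monotonic() - start)


if __name__ == "__main__":
    # Host and port match the SysLogSrc settings in the first agent config.
    rate = send_events("localhost", 10902, 100000)
    print("sent {:.0f} events/sec".format(rate))
```

Running the same script against each channel type in turn keeps the comparison consistent, since only the channel changes between runs.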

However, durability and recoverability also matter for a production
solution, and both the JDBC and RecoverableMemory channels seem to be
significantly slower (no more than 100 events/sec). Also, the
RecoverableMemory channel doesn't seem to resume streaming after the
agents are restarted.

Below are my agent configs. Could you advise how I can improve
performance for both the JDBC and RecoverableMemory channels? Is it
possible to configure them to achieve half the throughput of the memory
channel?

Agent with Syslog source

agent.sources = SysLogSrc
#agent.channels = MemChannel
#agent.channels = JdbcChannel
agent.channels = RecovMemChannel
agent.sinks = AvroSink

# SysLogSrc
agent.sources.SysLogSrc.type = syslogtcp
agent.sources.SysLogSrc.host = localhost
agent.sources.SysLogSrc.port = 10902
#agent.sources.SysLogSrc.channels = MemChannel
#agent.sources.SysLogSrc.channels = JdbcChannel
agent.sources.SysLogSrc.channels = RecovMemChannel
# MemChannel
agent.channels.MemChannel.type = memory
agent.channels.MemChannel.capacity = 1000000
agent.channels.MemChannel.transactionCapacity = 10000
agent.channels.MemChannel.keep-alive = 3
# JdbcChannel
agent.channels.JdbcChannel.type = jdbc
agent.channels.JdbcChannel.db.type = DERBY
agent.channels.JdbcChannel.driver.class = org.apache.derby.jdbc.EmbeddedDriver
agent.channels.JdbcChannel.create.schema = true
agent.channels.JdbcChannel.create.index = true
agent.channels.JdbcChannel.create.foreignkey = true
agent.channels.JdbcChannel.maximum.connections = 10
agent.channels.JdbcChannel.maximum.capacity = 0
agent.channels.JdbcChannel.sysprop.user.home = /flume/data
# RecovMemChannel
agent.channels.RecovMemChannel.type = org.apache.flume.channel.recoverable.memory.RecoverableMemoryChannel
agent.channels.RecovMemChannel.wal.dataDir = /flume/recoverable-memory-channel
agent.channels.RecovMemChannel.wal.rollSize = 104857600
agent.channels.RecovMemChannel.wal.minRetentionPeriod = 3600000
agent.channels.RecovMemChannel.wal.workerInterval = 5000
agent.channels.RecovMemChannel.wal.maxLogsSize = 1073741824
agent.channels.RecovMemChannel.capacity = 1000000
agent.channels.RecovMemChannel.transactionCapacity = 10000
agent.channels.RecovMemChannel.keep-alive = 3

# AvroSink
agent.sinks.AvroSink.type = avro
agent.sinks.AvroSink.hostname = 192.168.200.170
agent.sinks.AvroSink.port = 10900
agent.sinks.AvroSink.batch-size = 10000
#agent.sinks.AvroSink.channel = JdbcChannel
#agent.sinks.AvroSink.channel = MemChannel
agent.sinks.AvroSink.channel = RecovMemChannel


Agent with HDFS sink

agent.sources = AvroSrc
#agent.channels = MemChannel
#agent.channels = JdbcChannel
agent.channels = RecovMemChannel
agent.sinks = HdfsSink
# AvroSrc
agent.sources.AvroSrc.type = avro
agent.sources.AvroSrc.bind = 192.168.200.170
agent.sources.AvroSrc.port = 10900
agent.sources.AvroSrc.channels = RecovMemChannel
#agent.sources.AvroSrc.channels = JdbcChannel
#agent.sources.AvroSrc.channels = MemChannel
# MemChannel
agent.channels.MemChannel.type = memory
agent.channels.MemChannel.capacity = 1000000
agent.channels.MemChannel.transactionCapacity = 10000
agent.channels.MemChannel.keep-alive = 3
# JdbcChannel
agent.channels.JdbcChannel.type = jdbc
agent.channels.JdbcChannel.db.type = DERBY
agent.channels.JdbcChannel.driver.class = org.apache.derby.jdbc.EmbeddedDriver
agent.channels.JdbcChannel.create.schema = true
agent.channels.JdbcChannel.create.index = true
agent.channels.JdbcChannel.create.foreignkey = true
agent.channels.JdbcChannel.maximum.connections = 10
agent.channels.JdbcChannel.maximum.capacity = 0
agent.channels.JdbcChannel.sysprop.user.home = /flume/data
# RecovMemChannel
agent.channels.RecovMemChannel.type = org.apache.flume.channel.recoverable.memory.RecoverableMemoryChannel
agent.channels.RecovMemChannel.wal.dataDir = /flume/recoverable-memory-channel
agent.channels.RecovMemChannel.wal.rollSize = 104857600
agent.channels.RecovMemChannel.wal.minRetentionPeriod = 3600000
agent.channels.RecovMemChannel.wal.workerInterval = 5000
agent.channels.RecovMemChannel.wal.maxLogsSize = 1073741824
agent.channels.RecovMemChannel.capacity = 1000000
agent.channels.RecovMemChannel.transactionCapacity = 10000
agent.channels.RecovMemChannel.keep-alive = 3
# HdfsSink
agent.sinks.HdfsSink.type = hdfs
agent.sinks.HdfsSink.hdfs.path = hdfs://master:50070/data/flume
agent.sinks.HdfsSink.hdfs.filePrefix = data_%Y%m%d
#agent.sinks.HdfsSink.channel = MemChannel
#agent.sinks.HdfsSink.channel = JdbcChannel
agent.sinks.HdfsSink.channel = RecovMemChannel
agent.sinks.HdfsSink.hdfs.rollInterval = 300
agent.sinks.HdfsSink.hdfs.rollSize = 209715200
agent.sinks.HdfsSink.hdfs.rollCount = 0
agent.sinks.HdfsSink.hdfs.batchSize = 1000
agent.sinks.HdfsSink.hdfs.writeFormat = Text
agent.sinks.HdfsSink.hdfs.fileType = DataStream

-- 
Rgds
Ray
