flume-user mailing list archives

From lohit <lohit.vijayar...@gmail.com>
Subject HDFS Sink performance
Date Wed, 15 Jul 2015 16:10:55 GMT
Hello,

Does anyone have numbers they can share on HDFS sink performance? In our
testing, a single sink reading from a MemoryChannel and writing to HDFS
(CompressedStream) handles only about 35,000 events per second, with each
event about 1 KB in size. After compression this comes to a write stream of
roughly 10 MB/s to the HDFS file, which is pretty low. Our configuration
looks like this:

agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = /tmp/lohit
agent.sinks.hdfsSink.hdfs.codeC = lzo
agent.sinks.hdfsSink.hdfs.fileType = CompressedStream
agent.sinks.hdfsSink.hdfs.writeFormat = Writable
agent.sinks.hdfsSink.hdfs.rollInterval = 3600
agent.sinks.hdfsSink.hdfs.rollSize = 1073741824
agent.sinks.hdfsSink.hdfs.rollCount = 0
agent.sinks.hdfsSink.hdfs.batchSize = 10000
agent.sinks.hdfsSink.hdfs.txnEventMax = 10000

agent.channels.memoryChannel.type = memory

agent.channels.memoryChannel.capacity = 3000000
agent.channels.memoryChannel.transactionCapacity = 10000
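
As a rough sanity check on the figures above, the arithmetic can be sketched
as below. This is only a back-of-the-envelope calculation: it assumes each
event is exactly 1 KiB and that the quoted rates are steady, and the variable
names are made up for illustration.

```python
# Back-of-the-envelope check of the throughput figures quoted in the message.
# Assumptions (not measured): events are exactly 1 KiB, rates are steady.
EVENTS_PER_SEC = 35_000        # observed sink rate
EVENT_SIZE_BYTES = 1024        # "about 1K" per event, taken as 1 KiB
COMPRESSED_MB_PER_SEC = 10     # approximate observed HDFS write rate
BATCH_SIZE = 10_000            # hdfs.batchSize from the configuration
CHANNEL_CAPACITY = 3_000_000   # memoryChannel.capacity from the configuration

MIB = 1024 * 1024

# Uncompressed payload rate entering the sink.
raw_mb_per_sec = EVENTS_PER_SEC * EVENT_SIZE_BYTES / MIB

# Compression ratio implied by the raw rate and the observed write rate.
compression_ratio = raw_mb_per_sec / COMPRESSED_MB_PER_SEC

# How many full batches the sink takes from the channel each second.
batches_per_sec = EVENTS_PER_SEC / BATCH_SIZE

# Time for the sink to drain a completely full channel at this rate.
drain_seconds = CHANNEL_CAPACITY / EVENTS_PER_SEC

print(f"raw input rate:      {raw_mb_per_sec:.1f} MB/s")
print(f"implied compression: {compression_ratio:.1f}x")
print(f"batches per second:  {batches_per_sec:.1f}")
print(f"full-channel drain:  {drain_seconds:.0f} s")
```

Under these assumptions the sink is consuming roughly 34 MB/s of
uncompressed events, so the ~10 MB/s seen on HDFS corresponds to a
compression ratio of a little under 3.5x.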

-- 
Have a Nice Day!
Lohit
