flume-user mailing list archives

From Mike Zupan <mike.zu...@manage.com>
Subject Flume 1.4 High CPU
Date Wed, 15 Oct 2014 15:32:53 GMT
I’m seeing our Flume servers use very high amounts of CPU, and I’m wondering whether this
is a common issue with the file channel. I’m pretty new to Flume, so apologies if this isn’t
enough information to debug the issue.

Current top output looks like this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8509 root      20   0 22.0g 8.6g 675m S 1109.4 13.7   1682:45 java
 8251 root      20   0 21.9g 8.3g 647m S 1083.5 13.2   1476:27 java
 7593 root      20   0 12.4g 8.4g  18m S 1007.5 13.4   1866:18 java

As you can see, 3 of our 4 Flume servers are each using over 1000% CPU.
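
To put those figures in perspective, here is a minimal Python sketch that reads the three rows from the top output and converts %CPU into an approximate number of busy cores. It assumes top is running in its default mode, where 100% corresponds to one fully used core. The function names (`parse_top_rows`, `busy_cores`, `jstack_nid`) are hypothetical helpers written for this illustration and are not part of Flume or any tool. The last helper assumes a Linux HotSpot JVM, where the per-thread IDs shown by `top -H` match the hexadecimal `nid` values in a jstack thread dump.

```python
# Rows copied verbatim from the top output in this message.
TOP_ROWS = """\
 8509 root      20   0 22.0g 8.6g 675m S 1109.4 13.7   1682:45 java
 8251 root      20   0 21.9g 8.3g 647m S 1083.5 13.2   1476:27 java
 7593 root      20   0 12.4g 8.4g  18m S 1007.5 13.4   1866:18 java
"""


def parse_top_rows(text):
    """Return a list of (pid, cpu_percent) tuples from top process rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A process row has 12 columns and starts with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        rows.append((int(fields[0]), float(fields[8])))  # column 9 is %CPU
    return rows


def busy_cores(cpu_percent):
    """Assumption: top's default mode, where 100% equals one busy core."""
    return cpu_percent / 100.0


def jstack_nid(thread_id):
    """Format a decimal thread ID the way jstack prints it (nid=0x...)."""
    return "nid=0x{:x}".format(thread_id)


if __name__ == "__main__":
    for pid, cpu in parse_top_rows(TOP_ROWS):
        print("PID {}: about {:.1f} cores busy".format(pid, busy_cores(cpu)))
```

A possible next step, under the same assumptions, is to run `top -H -p <pid>` against one of these processes, pass the hottest thread IDs through `jstack_nid`, and search for the resulting strings in the output of `jstack <pid>` to see which Flume threads are consuming the CPU.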

Details:

OS: CentOS 6.5
Java: Oracle "1.7.0_45"

Flume: flume-1.4.0.2.1.1.0-385.el6.noarch

Our config for the server looks like this:

###############################################
# Agent configuration for transactional data
###############################################
nontx_host07_agent01.sources = avro
nontx_host07_agent01.channels = fc
nontx_host07_agent01.sinks = hdfs_sink_01 hdfs_sink_02 hdfs_sink_03 hdfs_sink_04

##################################################
# info is published to port 9991
##################################################
nontx_host07_agent01.sources.avro.type = avro
nontx_host07_agent01.sources.avro.bind = 0.0.0.0
nontx_host07_agent01.sources.avro.port = 9991
nontx_host07_agent01.sources.avro.threads = 100
nontx_host07_agent01.sources.avro.compression-type = deflate
nontx_host07_agent01.sources.avro.interceptors = ts id
nontx_host07_agent01.sources.avro.interceptors.ts.type = timestamp
nontx_host07_agent01.sources.avro.interceptors.ts.preserveExisting = false
nontx_host07_agent01.sources.avro.interceptors.id.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
nontx_host07_agent01.sources.avro.interceptors.id.preserveExisting = true


##################################################
# The Channels
##################################################
nontx_host07_agent01.channels.fc.type = file
nontx_host07_agent01.channels.fc.checkpointDir = /flume/channels/checkpoint/nontx_host07_agent01
nontx_host07_agent01.channels.fc.dataDirs = /flume/channels/data/nontx_host07_agent01
nontx_host07_agent01.channels.fc.capacity = 140000000
nontx_host07_agent01.channels.fc.transactionCapacity = 240000

##################################################
# Sinks
##################################################
nontx_host07_agent01.sinks.hdfs_sink_01.type = hdfs
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.path = hdfs://cluster01:8020/flume/%{log_type}
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.filePrefix = flume_nontx_host07_agent01_sink01_%Y%m%d%H
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUsePrefix = _
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUseSuffix = .tmp
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.fileType = CompressedStream
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.codeC = snappy
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollSize = 0
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollCount = 0
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollInterval = 300
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.idleTimeout = 30
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.timeZone = America/Los_Angeles
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.callTimeout = 30000
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.batchSize = 50000
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.round = true
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundUnit = minute
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundValue = 5
nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.threadsPoolSize = 2
nontx_host07_agent01.sinks.hdfs_sink_01.serializer = com.manage.flume.serialization.HeaderAndBodyJsonEventSerializer$Builder


--  
Mike Zupan

