flume-user mailing list archives

From Ashish <paliwalash...@gmail.com>
Subject Re: Flume 1.4 High CPU
Date Thu, 16 Oct 2014 08:37:19 GMT
I would start by finding which thread is consuming the most CPU. Its
stack trace should give you a good hint about where to look next.

Blogged about the process here
http://www.ashishpaliwal.com/blog/2011/08/finding-java-thread-consuming-high-cpu/
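
In short, and assuming Linux with the JDK's jstack available: run
"top -H -p <pid>" to see per-thread CPU, note the ID of the busiest thread,
take a thread dump with "jstack <pid>", and look for the entry whose nid
equals that thread ID in hex. Below is a rough Python sketch of the matching
step only. The function names and the sample dump are made up for
illustration; they are not taken from the blog post or from Flume.

```python
import re
from typing import Optional


def thread_id_to_nid(tid: int) -> str:
    """Convert a decimal thread ID (as shown by `top -H`) to the hex
    form that jstack prints after `nid=`."""
    return hex(tid)


def find_thread_stack(jstack_output: str, tid: int) -> Optional[str]:
    """Return the thread entry in a jstack dump whose nid matches tid,
    or None if no entry matches."""
    nid = thread_id_to_nid(tid)
    # Thread entries in a jstack dump are separated by blank lines.
    for entry in re.split(r"\n\s*\n", jstack_output):
        # \b keeps 0x214 from matching an entry with nid=0x214d.
        if re.search(r"\bnid=" + re.escape(nid) + r"\b", entry):
            return entry.strip()
    return None


# Hypothetical two-thread dump, for illustration only.
SAMPLE_DUMP = '''"worker-1" prio=10 tid=0x00007f0c2c0a1000 nid=0x214d runnable [0x00007f0c1a2e1000]
   java.lang.Thread.State: RUNNABLE
        at com.example.Hypothetical.work(Hypothetical.java:42)

"worker-2" prio=10 tid=0x00007f0c2c0a3000 nid=0x2152 waiting on condition [0x00007f0c1a1e0000]
   java.lang.Thread.State: WAITING (parking)
'''

if __name__ == "__main__":
    # Suppose `top -H` showed thread 8525 as the busiest one.
    print(find_thread_stack(SAMPLE_DUMP, 8525))
```

In practice you would save the real jstack output to a file and pass its
contents in place of SAMPLE_DUMP. Taking a few dumps some seconds apart helps
confirm that the same thread stays busy in the same code path.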

Hope it helps
ashish

On Wed, Oct 15, 2014 at 9:02 PM, Mike Zupan <mike.zupan@manage.com> wrote:

>  I’m seeing issues with flume server using very high amounts of CPU. Just
> wondering if this is a common issue with a file channel. I’m pretty new to
> flume so sorry if this isn’t enough to debug the issue.
>
> Current top looks like
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  8509 root      20   0 22.0g 8.6g 675m S 1109.4 13.7   1682:45 java
>  8251 root      20   0 21.9g 8.3g 647m S 1083.5 13.2   1476:27 java
>  7593 root      20   0 12.4g 8.4g  18m S 1007.5 13.4   1866:18 java
>
> As you can see we have 3 out of 4 flume servers using 1000% cpu.
>
> Details are
>
> OS: CentOS 6.5
> Java: Oracle "1.7.0_45"
> Flume: flume-1.4.0.2.1.1.0-385.el6.noarch
>
> Our config for the server looks like this
>
> ###############################################
> # Agent configuration for transactional data
> ###############################################
> nontx_host07_agent01.sources = avro
> nontx_host07_agent01.channels = fc
> nontx_host07_agent01.sinks = hdfs_sink_01 hdfs_sink_02 hdfs_sink_03 hdfs_sink_04
>
> ##################################################
> # info is published to port 9991
> ##################################################
> nontx_host07_agent01.sources.avro.type = avro
> nontx_host07_agent01.sources.avro.bind = 0.0.0.0
> nontx_host07_agent01.sources.avro.port = 9991
> nontx_host07_agent01.sources.avro.threads = 100
> nontx_host07_agent01.sources.avro.compression-type = deflate
> nontx_host07_agent01.sources.avro.interceptors = ts id
> nontx_host07_agent01.sources.avro.interceptors.ts.type = timestamp
> nontx_host07_agent01.sources.avro.interceptors.ts.preserveExisting = false
> nontx_host07_agent01.sources.avro.interceptors.id.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
> nontx_host07_agent01.sources.avro.interceptors.id.preserveExisting = true
>
>
> ##################################################
> # The Channels
> ##################################################
> nontx_host07_agent01.channels.fc.type = file
> nontx_host07_agent01.channels.fc.checkpointDir = /flume/channels/checkpoint/nontx_host07_agent01
> nontx_host07_agent01.channels.fc.dataDirs = /flume/channels/data/nontx_host07_agent01
> nontx_host07_agent01.channels.fc.capacity = 140000000
> nontx_host07_agent01.channels.fc.transactionCapacity = 240000
>
> ##################################################
> # Sinks
> ##################################################
> nontx_host07_agent01.sinks.hdfs_sink_01.type = hdfs
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.path = hdfs://cluster01:8020/flume/%{log_type}
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.filePrefix = flume_nontx_host07_agent01_sink01_%Y%m%d%H
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUsePrefix=_
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUseSuffix=.tmp
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.fileType = CompressedStream
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.codeC = snappy
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollSize = 0
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollCount = 0
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollInterval = 300
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.idleTimeout = 30
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.timeZone = America/Los_Angeles
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.callTimeout = 30000
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.batchSize = 50000
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.round = true
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundUnit = minute
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundValue = 5
> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.threadsPoolSize = 2
> nontx_host07_agent01.sinks.hdfs_sink_01.serializer = com.manage.flume.serialization.HeaderAndBodyJsonEventSerializer$Builder
>
> --
> Mike Zupan
>
>


-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
