flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Low throughput of FileChannel
Date Fri, 03 Aug 2012 03:27:45 GMT
Denny,  

Please file a JIRA and post your code changes if you would like to contribute them to Apache
Flume. One of us will be glad to review and commit them. This way, the work will benefit the
community in general, and it will also let us discuss the performance benefits of your changes.
 

Thanks,
Hari

--  
Hari Shreedharan


On Thursday, August 2, 2012 at 8:22 PM, Denny Ye wrote:

> hi Hari,  
>     Most of the channels in my production environment will be configured as FileChannel,
> so its throughput may limit our platform's performance. I'm also not sure whether anyone has
> already achieved better throughput. If anyone has results similar to mine, I'd like to post my
> code changes for discussion.
>  
> -Regards
> Denny Ye
>  
> 2012/8/3 Hari Shreedharan <hshreedharan@cloudera.com>
> > Denny,
> >  
> > I am not sure if anyone has actually benchmarked the FileChannel. What kind of performance
> > are you getting as of now? If you have a patch that can improve the performance a lot, please
> > feel free to submit it. We'd definitely like to get such a patch committed.
> >  
> > Thanks
> > Hari
> >  
> > --
> > Hari Shreedharan
> >  
> >  
> > On Thursday, August 2, 2012 at 8:02 PM, Denny Ye wrote:
> >  
> > > hi all,
> > >     I posted MemoryChannel performance numbers last week; that throughput is normal for
> > > most environments. By contrast, the FileChannel result is well below expectation with the
> > > same environment and parameters: roughly 5 MB/s.
> > >
> > >     I would specifically like to know your FileChannel throughput results. Am I doing
> > > something wrong? The result is hard to believe.
> > >
> > >    I have also done some tuning with several code changes, which increased the throughput
> > > to 30 MB/s. I think there are still many other factors that affect performance.
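> > >    For reference, here is a minimal FileChannel configuration sketch. The property names
> > > follow the Flume NG user guide; the channel name, directories, and capacities are only
> > > illustrative, not the values used in my tests:
> > >
> > > # hypothetical file channel, shown for comparison with the memory channels below
> > > agent.channels.fc1.type = file
> > > agent.channels.fc1.checkpointDir = /flume/filechannel/checkpoint
> > > agent.channels.fc1.dataDirs = /flume/filechannel/data
> > > agent.channels.fc1.capacity = 1000000
> > > agent.channels.fc1.transactionCapacity = 1000
> > > agent.channels.fc1.checkpointInterval = 30000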
> > >
> > >     Could anyone share your throughput results or feedback on tuning?
> > >
> > > -Regards
> > > Denny Ye
> > >
> > >
> > > ---------- Forwarded message ----------
> > > From: Denny Ye <dennyy99@gmail.com>
> > > Date: 2012/7/25
> > > Subject: Latest Flume test report and problem
> > > To: dev@flume.apache.org
> > >
> > >
> > > hi all,
> > >    Last week I tested Flume with the ScribeSource (https://issues.apache.org/jira/browse/FLUME-1382)
> > > and the HDFS sink. More detailed conditions and the deployment are listed below. Too many 'Full
> > > GC' pauses hurt the throughput, and a large number of events are promoted into the old
> > > generation. I have applied some tuning, with little effect.
> > >    Could someone give me feedback or tips on reducing the GC problem? I'd appreciate your
> > > attention.
> > >
> > > PS: Using Mike's report template at https://cwiki.apache.org/FLUME/flume-ng-performance-measurements.html
> > >
> > > Flume Performance Test 2012-07-25
> > > Overview
> > > The Flume agent was run on its own physical machine in a single JVM. A separate
> > > client machine generated load against the Flume box in List<LogEntry> format. Flume
> > > stored data onto a 4-node HDFS cluster configured on its own separate hardware. No virtual
> > > machines were used in this test.
> > > Hardware specs
> > > CPU: Intel Xeon L5640, 2 x six-core @ 2.27 GHz (12 physical cores)
> > > Memory: 16 GB
> > > OS: CentOS release 5.3 (Final)
> > > Flume configuration
> > > JAVA Version: 1.6.0_20 (Java HotSpot 64-Bit Server VM)
> > > JAVA OPTS: -Xms1024m -Xmx4096m -XX:PermSize=256m -XX:NewRatio=1 -XX:SurvivorRatio=5 -XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=31 -XX:PretenureSizeThreshold=4096
> > > Num. agents: 1
> > > Num. parallel flows: 5
> > > Source: ScribeSource
> > > Channel: MemoryChannel
> > > Sink: HDFSEventSink
> > > Selector: RandomSelector
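> > > (Note: the selector configured below, io.flume.RandomSelector, is not part of stock Apache
> > > Flume; it appears to be a custom selector that spreads incoming events randomly across the
> > > five parallel channel/sink flows.)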
> > > Config-file
> > > # list sources, channels, sinks for the agent
> > > agent.sources = seqGenSrc
> > > agent.channels = mc1 mc2 mc3 mc4 mc5
> > > agent.sinks = hdfsSin1 hdfsSin2 hdfsSin3 hdfsSin4 hdfsSin5
> > >
> > > # define sources
> > > agent.sources.seqGenSrc.type = org.apache.flume.source.scribe.ScribeSource
> > > agent.sources.seqGenSrc.selector.type = io.flume.RandomSelector
> > >
> > > # define sinks
> > > agent.sinks.hdfsSin1.type = hdfs
> > > agent.sinks.hdfsSin1.hdfs.path = /flume_test/data1/
> > > agent.sinks.hdfsSin1.hdfs.rollInterval = 300
> > > agent.sinks.hdfsSin1.hdfs.rollSize = 0
> > > agent.sinks.hdfsSin1.hdfs.rollCount = 1000000
> > > agent.sinks.hdfsSin1.hdfs.batchSize = 10000
> > > agent.sinks.hdfsSin1.hdfs.fileType = DataStream
> > > agent.sinks.hdfsSin1.hdfs.txnEventMax = 1000
> > > # ... define sink #2 #3 #4 #5 ...
> > >
> > > # define channels
> > > agent.channels.mc1.type = memory
> > > agent.channels.mc1.capacity = 1000000
> > > agent.channels.mc1.transactionCapacity = 1000
> > > # ... define channel #2 #3 #4 #5 ...
> > >
> > > # specify the channel each sink and source should use
> > > agent.sources.seqGenSrc.channels = mc1 mc2 mc3 mc4 mc5
> > > agent.sinks.hdfsSin1.channel = mc1
> > > # ... specify sink #2 #3 #4 #5 ...
> > > Hadoop configuration
> > > The HDFS sinks were connected to a 4-node Hadoop cluster running CDH3u1. Each HDFS sink
> > > wrote data into a different path.
> > > Visualization of test setup
> > > https://lh3.googleusercontent.com/dGumq1pu1Wr3Bj8WJmRHOoLWmUlGqxC4wW7_XCNO9R1wuh15LRXaKKxGoccpjBXtgqcdSVW-vtg
> > > There are 10 Scribe clients, and each client sends 20 million LogEntry objects to the
> > > ScribeSource.
> > > Data description
> > > List<LogEntry> entries containing a string category and a ByteArray body. The ByteArray
> > > body size is 500 bytes.
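> > > In total that is roughly 10 clients x 20,000,000 entries x 500 bytes = 100 GB of message
> > > bodies, not counting the category strings or protocol overhead.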
> > > Results
> > > Throughput:
> > > Average:       Source: 46.4 MB/s, Sink: 45.2 MB/s
> > > Maximum:    Source: 67.1 MB/s, Sink: 88.3 MB/s
> > >
> > > CPU:       Average: 196%, Maximum: 440%
> > >
> > > GC:         Young GC: 1636 times,      Full GC: 384 times
> > >
> > > No data loss.
> > > Heap and GC
> > > By analyzing the JVM heap, we found many LogEntry objects in the old generation. We have
> > > tried some optimizations, but the results are not satisfactory. We will continue to
> > > investigate this limitation.
> > >
> > > FullGC Log examples:
> > > [Full GC [PSYoungGen: 1497984K->0K(1797568K)] [PSOldGen: 1720643K->1693741K(2097152K)] 3218627K->1693741K(3894720K) [PSPermGen: 14566K->14566K(262144K)], 5.0027700 secs] [Times: user=5.01 sys=0.00, real=5.00 secs]
> > > [Full GC [PSYoungGen: 1497960K->0K(1797568K)] [PSOldGen: 1693805K->1752540K(2097152K)] 3191765K->1752540K(3894720K) [PSPermGen: 14571K->14571K(262144K)], 5.0732570 secs] [Times: user=5.07 sys=0.00, real=5.07 secs]
> > > [Full GC [PSYoungGen: 1497984K->0K(1797568K)] [PSOldGen: 1752540K->1642553K(2097152K)] 3250524K->1642553K(3894720K) [PSPermGen: 14572K->14568K(262144K)], 5.0710730 secs] [Times: user=5.07 sys=0.01, real=5.08 secs]
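> > >
> > > From these samples: the 2 GB old generation (half of the 4 GB maximum heap, per
> > > -XX:NewRatio=1) stays roughly 80% occupied even after a full collection, and if the
> > > ~5 second pauses are representative, the 384 full GCs alone account for roughly
> > > 384 x 5 s = 32 minutes of stop-the-world time.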
> > >
> > >
> > > -Regards
> > > Denny Ye
> > >
> > >
> >  
>  

