flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Flume 1.3.0 - NFS + File Channel Performance
Date Tue, 18 Dec 2012 17:04:15 GMT
Yep. The disk space checks require an NFS call for each write, and that slows things down a lot.
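
To illustrate why a per-write check hurts on NFS, here is a minimal sketch of a space check that caches its result for a short interval instead of asking the filesystem on every write. This is not the actual File Channel code. The class `CachedSpaceCheck`, its constructor parameters, and the one-second interval are hypothetical, and it assumes the check boils down to something like `File.getUsableSpace()`.

```java
import java.io.File;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Flume: caches the usable-space figure so
// that the expensive filesystem query runs at most once per refresh interval.
final class CachedSpaceCheck {
    private final LongSupplier usableSpace;   // expensive call, e.g. dir::getUsableSpace
    private final LongSupplier clockMillis;   // injectable clock, useful for testing
    private final long refreshIntervalMillis;
    private final long minimumRequiredBytes;

    private long cachedBytes;
    private long lastRefreshMillis;
    private boolean initialized;
    private int refreshCount;

    CachedSpaceCheck(LongSupplier usableSpace, LongSupplier clockMillis,
                     long refreshIntervalMillis, long minimumRequiredBytes) {
        this.usableSpace = usableSpace;
        this.clockMillis = clockMillis;
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.minimumRequiredBytes = minimumRequiredBytes;
    }

    // Called once per write. Only touches the filesystem when the cached
    // value is older than the refresh interval.
    synchronized boolean hasEnoughSpace() {
        long now = clockMillis.getAsLong();
        if (!initialized || now - lastRefreshMillis >= refreshIntervalMillis) {
            cachedBytes = usableSpace.getAsLong();
            lastRefreshMillis = now;
            initialized = true;
            refreshCount++;
        }
        return cachedBytes >= minimumRequiredBytes;
    }

    // Number of times the underlying filesystem query was actually made.
    synchronized int refreshCount() {
        return refreshCount;
    }

    public static void main(String[] args) {
        // Directory to check; defaults to the current directory.
        File dir = new File(args.length > 0 ? args[0] : ".");
        CachedSpaceCheck check = new CachedSpaceCheck(
                dir::getUsableSpace, System::currentTimeMillis, 1000L, 1L);
        int writes = 10000;
        for (int i = 0; i < writes; i++) {
            check.hasEnoughSpace(); // stands in for the check done before each write
        }
        System.out.println(writes + " simulated writes, "
                + check.refreshCount() + " filesystem queries");
    }
}
```

The trade-off is that the cached figure can be stale for up to one interval, so the threshold would need some headroom.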

--  
Hari Shreedharan


On Tuesday, December 18, 2012 at 8:43 AM, Brock Noland wrote:

> We'd need those thread dumps to help confirm, but I bet that FLUME-1609
> results in an NFS call on each operation on the channel.
>  
> If that is true, that would explain why it works well on local disk.
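
One rough way to test this hypothesis without thread dumps is to time the filesystem query on both a local directory and the NFS share. The sketch below assumes the per-operation cost comes from a call like `File.getUsableSpace()`; the class name `UsableSpaceLatency` and the call count are made up for illustration, and the directories are whatever paths you pass on the command line.

```java
import java.io.File;

// Hypothetical micro-measurement, not part of Flume: reports the average
// time of File.getUsableSpace() for each directory given as an argument.
final class UsableSpaceLatency {

    // Returns the average latency in microseconds over the given number of calls.
    static double averageMicrosPerCall(File dir, int calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        long sink = 0; // accumulate results so the loop is not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += dir.getUsableSpace();
        }
        long elapsedNanos = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never taken; keeps 'sink' live
        }
        return elapsedNanos / 1000.0 / calls;
    }

    public static void main(String[] args) {
        // Pass one local directory and one NFS-mounted directory to compare.
        for (String path : args) {
            double micros = averageMicrosPerCall(new File(path), 1000);
            System.out.printf("%s: %.1f us per getUsableSpace() call%n", path, micros);
        }
    }
}
```

If the NFS figure is orders of magnitude larger than the local one, that would be consistent with the slowdown appearing only on NFS, although it does not by itself prove where the File Channel spends its time.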
>  
> Brock
>  
> On Tue, Dec 18, 2012 at 10:17 AM, Brock Noland <brock@cloudera.com> wrote:
> > Hi,
> >  
> > Hmm, yes in general performance is not going to be great over NFS, but
> > there haven't been any FC changes that stick out here.
> >  
> > Could you take 10 thread dumps of the agent running the file channel
> > and 10 thread dumps of the agent sending data to the agent with the
> > file channel? (You can send them to me directly, since the list
> > won't take attachments.)
> >  
> > Are there any patterns, like it works for 40 seconds then times out
> > and then works for 39 seconds, etc?
> >  
> > Brock
> >  
> > On Tue, Dec 18, 2012 at 10:07 AM, Rakos, Rudolf
> > <Rudolf.Rakos@morganstanley.com> wrote:
> > > Hi,
> > >  
> > >  
> > >  
> > > We’ve run into a strange problem regarding NFS and File Channel performance
> > > while evaluating the new version of Flume.
> > >  
> > > We had no issues with the previous version (1.2.0).
> > >  
> > >  
> > >  
> > > Our configuration looks like this:
> > >  
> > > · Node1:
> > > (Avro RPC Clients ->) Avro Source and Custom Sources -> File Channel -> Avro Sink (-> Node2)
> > >  
> > > · Node2:
> > > (Node1s ->) Avro Source -> File Channel -> Custom Sink
> > >  
> > >  
> > >  
> > > Both the checkpoint and the data directories of the File Channels are on NFS
> > > shares. We use the same share for checkpoint and data directories, but
> > > different shares for each Node. Unfortunately it is not an option for us to
> > > use local directories.
> > >  
> > > The events are about 1 KB each, and the batch sizes are the following:
> > >  
> > > · Avro RPC Clients: 1000
> > >  
> > > · Custom Sources: 2000
> > >  
> > > · Avro Sink: 5000
> > >  
> > > · Custom Sink: 10000
> > >  
> > >  
> > >  
> > > We are experiencing very slow File Channel performance compared to the
> > > previous version, and a high number of timeouts (almost always) in the Avro
> > > RPC Clients and the Avro Sink.
> > >  
> > > Something like this:
> > >  
> > > 2012-12-18 15:43:31,828 [SinkRunner-PollingRunner-ExceptionCatchingSinkProcessor] WARN org.apache.flume.sink.AvroSink - Failed to send event batch
> > > org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Failed to send batch
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ***
> > >     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147) [flume-ng-core-1.3.0.jar:1.3.0]
> > >     at java.lang.Thread.run(Thread.java:662) [na:1.6.0_31]
> > > Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: ***, port: *** }: Handshake timed out after 20000ms
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ... 5 common frames omitted
> > > Caused by: java.util.concurrent.TimeoutException: null
> > >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228) ~[na:1.6.0_31]
> > >     at java.util.concurrent.FutureTask.get(FutureTask.java:91) ~[na:1.6.0_31]
> > >     at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278) ~[flume-ng-sdk-1.3.0.jar:1.3.0]
> > >     ... 6 common frames omitted
> > >  
> > > (I had to remove some details, sorry for that.)
> > >  
> > >  
> > >  
> > > We managed to narrow down the root cause of the issue to the File Channel,
> > > because:
> > >  
> > > · Everything works fine if we switch to the Memory Channel or to the
> > > Old File Channel (1.2.0).
> > >  
> > > · Everything works fine if we use local directories.
> > >  
> > > We’ve tested this on multiple different PCs (both Windows and Linux).
> > >  
> > >  
> > >  
> > > I spent the day debugging and profiling, but I could not find anything worth
> > > mentioning (no excessive CPU usage, no threads waiting for long periods,
> > > etc.). The only visible problem is that take and put operations on the File
> > > Channel take far longer than with the previous version.
> > >  
> > >  
> > >  
> > >  
> > >  
> > > Could someone please try the File Channel on an NFS share?
> > >  
> > > Does anyone have similar issues?
> > >  
> > >  
> > >  
> > > Thank you for your help.
> > >  
> > >  
> > >  
> > > Regards,
> > >  
> > > Rudolf
> > >  
> > >  
> > >  
> > > Rudolf Rakos
> > > Morgan Stanley | ISG Technology
> > > Lechner Odon fasor 8 | Floor 06
> > > Budapest, 1095
> > > Phone: +36 1 881-4011
> > > Rudolf.Rakos@morganstanley.com
> > >  
> > >  
> > >  
> >  
> >  
> >  
> >  
> > --
> > Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
> >  
>  
>  
>  
>  
>  
>  


