flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: NettyAvroRPCClient hangs Client JVM
Date Fri, 04 Oct 2013 21:44:04 GMT
The memory channel uses a LinkedBlockingDeque internally for its buffer, so each insert and
delete creates and removes a node, which produces some garbage for the GC to collect. When the
write rate is extremely high, this can become visible. Can you try the File Channel, preferably
with several disks? There should be much less GC there.
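
For reference, a File Channel spread over several disks might be configured roughly like this.
This is only a sketch: the agent name (agent1), the channel name (ch1), and the directory paths
are made up for illustration, and the capacity values simply mirror the numbers mentioned in
this thread rather than being tuned recommendations.

```properties
# Hypothetical agent and channel names; substitute your own.
agent1.channels = ch1
agent1.channels.ch1.type = file

# Checkpoint directory (illustrative path), ideally on its own disk.
agent1.channels.ch1.checkpointDir = /disk1/flume/checkpoint

# Comma-separated data directories (illustrative paths), ideally one per physical disk.
agent1.channels.ch1.dataDirs = /disk2/flume/data,/disk3/flume/data

# Values taken from the setup described in this thread, not recommendations.
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 10000
```

Each existing sink would then point at the file channel in place of the memory channel.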


Thanks,
Hari


On Friday, October 4, 2013 at 1:51 PM, Bhaskar V. Karambelkar wrote:

> Hi Hari,
> 
> Yes, I am using the memory channel. The boxes have 16GB of RAM each, and we're running with an 8GB heap on each.
> Each memory channel's capacity is 1 million (4 sinks, so 4 million in total), and the transaction size is 10K per sink.
> 
> Batch size is also set to 10K. We've played with these values, but the issue persists.
> 
> Can you elaborate on what these issues are and what exactly takes place?
> A step-by-step deconstruction of the problem would really help me understand what's going on.
> 
> thanks
> Bhaskar
> 
> 
> 
> 
> On Fri, Oct 4, 2013 at 11:57 AM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:
> > Are you using the Memory Channel on the agents? We do know that there might be some issues when the memory channel is used with a fairly large heap. We are working to resolve it.
> > 
> > 
> > Thanks,
> > Hari
> > 
> > 
> > On Friday, October 4, 2013 at 5:34 AM, Bhaskar V. Karambelkar wrote:
> > 
> > > We have a client JVM process which uses the Flume client SDK (NettyAvroRPCClient) to push events to a Flume source, which ultimately lands them in HDFS.
> > > 
> > > In production we're still on Flume 1.3, and one thing we find consistently is that under heavy load the client JVM hangs. We've narrowed it down to the Flume client SDK.
> > > 
> > > I suspect that a long GC pause in the Flume agent causes disconnects in the Avro client, which can lead to the client JVM hangs.
> > > 
> > > We're getting events at a rate of about 25,000/sec, distributed across 8 clients, which in turn forward them to 24 Flume sources (6 boxes with 4 sources each). Each source writes to HDFS (i.e. 24 HDFS sinks as well).
> > > 
> > > I tried switching the Flume agents' GC to G1, which sort of helped: earlier the client JVM hangs were about 5 mins apart, now they're about 10 mins apart, so there is some progress.
> > > 
> > > The question is how to completely eliminate these hangs. The hang is so bad that I can't even get the JVM to do a thread dump, so there is no possible way for me to investigate what caused the JVM to hang.
> > > 
> > > Could upgrading to 1.4 and using the Thrift source help?
> > 
> 

