flume-user mailing list archives

From Kamal Bahadur <mailtoka...@gmail.com>
Subject Re: Cassandra Sink using Hector
Date Tue, 18 Oct 2011 04:58:53 GMT
Hi Dani,

Thanks for the reply. I am using E2E reliability mode. If I spawn a new thread
for each append call, I am not sure the acks will be handled properly. I
might lose an event if the child thread ends up in an exception. Do you have
any suggestions for my use case? With the current setup, I am able to write only
500 events per second. The expected event rate is over 2000 per second. I
tried increasing the number of collectors and it seems to help. Is this my
only option?

Thanks,
Kamal
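
A minimal sketch of one way to address the concern above about losing events
when a child thread fails: hand each write to a fixed thread pool, keep the
returned futures, and wait on all of them before anything is acknowledged, so
that a failure in a worker thread resurfaces in the calling thread. Everything
here is hypothetical and for illustration only: `EventWriter` stands in for
the Hector write call, and `ParallelAppender`, `append`, `flush`, and `close`
are invented names, not part of the Flume or Hector APIs. It also assumes that
`append` and `flush` are called from a single thread, and that the sink has
some point before the ack is sent where `flush` can be called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the code that writes one event to Cassandra. */
interface EventWriter {
    void write(byte[] event) throws Exception;
}

/** Hypothetical helper; not part of Flume or Hector. */
class ParallelAppender {
    private final ExecutorService pool;
    private final EventWriter writer;
    // Futures for writes that have been queued but not yet confirmed.
    private final List<Future<?>> pending = new ArrayList<>();

    ParallelAppender(int threads, EventWriter writer) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.writer = writer;
    }

    /** Queues one write on the pool and returns without waiting for it. */
    void append(final byte[] event) {
        pending.add(pool.submit(() -> {
            writer.write(event);
            return null;
        }));
    }

    /**
     * Waits for every queued write. If any of them threw, the first failure
     * is rethrown here, wrapped in an ExecutionException, so the caller can
     * withhold the ack and retry.
     */
    void flush() throws Exception {
        Exception first = null;
        for (Future<?> f : pending) {
            try {
                f.get();
            } catch (ExecutionException e) {
                if (first == null) {
                    first = e;
                }
            }
        }
        pending.clear();
        if (first != null) {
            throw first;
        }
    }

    /** Stops accepting new work; already queued writes still run. */
    void close() {
        pool.shutdown();
    }
}
```

With a pool size close to the number of pooled connections, several writes can
be in flight at once, while a failed write still prevents the batch from being
treated as delivered. Whether this fits E2E mode depends on where the sink can
safely block before the ack goes out, which this sketch does not settle.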

On Mon, Oct 17, 2011 at 4:42 PM, Dani Rayan <dani.rayan@gmail.com> wrote:

> Hey Kamal,
>
> You are correct. The append method would not spawn new threads by itself.
> However, you can still override it.
>
>
> On Mon, Oct 17, 2011 at 1:58 PM, Kamal Bahadur <mailtokamal@gmail.com> wrote:
>
>> Hi,
>>
>> I have written a sink for writing data into Cassandra using the Hector API. It
>> looks like Hector does a great job of connection pooling and load balancing.
>> As soon as I start the collector, I can see 16 connections being established
>> between the collector and Cassandra. I am not sure if Flume is taking advantage
>> of those connections in the pool. I am assuming that the collector's append
>> method is not multi-threaded and therefore only one connection is being used
>> at any point in time. Can someone confirm this?
>>
>> Thanks,
>> Kamal
>>
>
>
>
> --
> -Dani Abel Rayan
>
