flume-user mailing list archives

From alo alt <wget.n...@gmail.com>
Subject Re: Throughput issue/load balancing
Date Tue, 03 Jul 2012 05:47:32 GMT
Hi,

No; if one collector dies, the other will be used. You can configure crossover sinks like this:

Agent1 => Collector2,Collector1
Agent2 => Collector1, Collector2

If one collector dies, the other has to handle its load as well.
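As a rough sketch of that wiring (the hostnames and port 35853 are placeholders, and I'm assuming the explicit agentE2EChain sink here rather than autoE2EChain), the agent configs could look like:

Agent1: thriftSource(12345) | agentE2EChain("collector2:35853", "collector1:35853");
Agent2: thriftSource(12345) | agentE2EChain("collector1:35853", "collector2:35853");

agentE2EChain tries the collectors in the order listed and fails over to the next one, so reversing the order on half of the agents spreads the normal load across both collectors while keeping a failover path for each agent. With autoE2EChain the master picks the chain for you, which can leave all agents preferring the same collector.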

best,
Alex


On Jul 3, 2012, at 5:18 AM, Camp, Roy wrote:

> Flume 0.9.4-cdh3u3
> 
> We have a couple dozen agents connecting to two collectors, and the majority of events appear
> to be flowing into one collector. I am trying to load archived data from two of the agents
> (two different hosts), but they hit a throughput limit. When investigating, it appears that
> all the events are flowing to only one collector. Shouldn't Flume be utilizing the second
> collector for increased throughput?
> 
> My config:
> All agents: thriftSource(12345) | autoE2EChain;
> Both collectors: autoCollectorSource | [ackChecker cassandraBasin("analyticsks", "127.0.0.1:9160"),
> < lazyOpen stubbornAppend thriftSink("127.0.0.1",30313) ? diskFailover insistentOpen lazyOpen
> stubbornAppend thriftSink("127.0.0.1",30313) >];
> 
> Agent logs show them opening a connection to both collectors upon startup.
> 
> Thanks,
> 
> Roy


--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF

