flume-user mailing list archives

From Andrew Ehrlich <and...@aehrlich.com>
Subject Re: Collecting thousands of sources
Date Thu, 04 Sep 2014 15:41:43 GMT
One way to avoid managing so many sources would be to put an aggregation point between the
data generators and the Flume sources. For example, maybe the data generators could put
events into one or more message queues, and Flume could then consume from there?
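
To make the fan-in idea concrete, here is a minimal sketch, assuming an in-process
queue.Queue as a stand-in for a real message broker. The names generator, consume, and
gen0..gen4 are hypothetical and are not part of Flume; in a real deployment the consumer
side would be a single Flume source reading from the broker.

```python
import queue
import threading


def generator(name, out, count):
    """Hypothetical data generator: publishes `count` events to the shared queue."""
    for i in range(count):
        out.put(f"{name}:event{i}")


def consume(inbox, expected):
    """Single consumer draining the queue, standing in for one Flume source."""
    return [inbox.get() for _ in range(expected)]


# Five generators write to one queue; one consumer reads everything.
events = queue.Queue()
workers = [
    threading.Thread(target=generator, args=(f"gen{n}", events, 3))
    for n in range(5)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

collected = consume(events, 15)
```

The point of the design is that the consumer only needs to know about the queue, so
adding or removing generators does not require touching the consumer's configuration.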

Andrew

---- On Thu, 04 Sep 2014 08:29:04 -0700 JuanFra Rodriguez Cardoso <juanfra.rodriguez.cardoso@gmail.com>
wrote ----


Hi all:

Considering an environment with thousands of sources, what are the best practices for managing
the agent configuration (flume.conf)? Is it recommended to create a multi-layer topology where
each agent takes control of a subset of sources?
 
In that case, a configuration management server (such as Puppet) would be responsible for
editing flume.conf, setting the 'agent.sources' parameter to list source1 through source3000
(assuming we have 3000 source machines).
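
As a rough illustration of what such generated configuration could look like, here is a
minimal sketch that splits the 3000 source names into subsets and renders one
'agent.sources' line per subset. The helper names and the figure of 100 sources per agent
are assumptions for illustration only; a real setup would render these lines from a
Puppet template or similar.

```python
def partition(names, per_agent):
    """Split the source names into consecutive chunks of at most `per_agent`."""
    return [names[i:i + per_agent] for i in range(0, len(names), per_agent)]


def render_sources_line(agent_name, names):
    """Render the space-separated source list in flume.conf property syntax."""
    return f"{agent_name}.sources = {' '.join(names)}"


# source1 .. source3000, as in the question above.
source_names = [f"source{i}" for i in range(1, 3001)]

# 100 sources per agent is a hypothetical figure, not a recommendation.
chunks = partition(source_names, 100)

# One 'agent.sources' line per first-tier agent's flume.conf.
lines = [render_sources_line("agent", chunk) for chunk in chunks]
```

Each entry in lines would go into the flume.conf of a different first-tier agent, so no
single file has to list all 3000 sources.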

Are my thoughts in line with how such large-scale data ingest scenarios are usually handled?
 
Thanks a lot!
---
JuanFra



 
 


