flume-user mailing list archives

From Haidang N <haidang...@hotmail.com>
Subject Reading Flume spoolDir in parallel
Date Tue, 16 Sep 2014 18:26:12 GMT
Since I'm not allowed to install Flume on the production servers, I download the logs, drop them into a Flume spoolDir, and have a sink consume from the channel and write to Cassandra. Everything works fine.
However, there are a lot of log files in the spoolDir, and the current setup processes only one file at a time, so it's taking a while. I'd like to process many files concurrently. One way I thought of is to keep using spoolDir but distribute the files across 5-10 different directories and define multiple sources/channels/sinks, but that feels clumsy. Is there a better way to achieve this?
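For reference, the multi-directory workaround I have in mind would look roughly like this (agent name, paths, and component names are just placeholders; the Cassandra sink is our own custom sink, so I've left its type out):

```properties
# Sketch only: two spooling-directory sources feeding one shared channel,
# so a single sink still drains everything and writes to Cassandra.
agent1.sources = spool1 spool2
agent1.channels = ch1
agent1.sinks = cassandraSink

agent1.sources.spool1.type = spooldir
agent1.sources.spool1.spoolDir = /data/logs/spool1
agent1.sources.spool1.channels = ch1

agent1.sources.spool2.type = spooldir
agent1.sources.spool2.spoolDir = /data/logs/spool2
agent1.sources.spool2.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 10000

# agent1.sinks.cassandraSink.type = <our custom Cassandra sink class>
agent1.sinks.cassandraSink.channel = ch1
```

With 5-10 directories this section gets repeated per source, which is the part that feels clumsy to me.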
