flume-user mailing list archives

From Karthikeyan Muthukumarasamy <mkarthik.h...@gmail.com>
Subject Flume latency issue
Date Thu, 27 Sep 2012 13:55:37 GMT
In my project, various applications and 3PPs write their logs to
separate log files.
There are two limitations with this:
- the structure of the log messages is different in each log file
- the log messages are in different files, so I can't get a single
time-sorted display of all log messages, which is important when debugging

As a solution to this problem, I intend to:
- use separate Flume sources to tail the various log files in the system
- have an interceptor for each type of Flume source to convert all log
messages to a common structure
- have all Flume sinks write to an Avro port on localhost
- have a separate Flume source read from that Avro port on localhost
- use fan-out logic to post the data from that source to
multiple channels
- connect each channel to a separate sink, such as a JMX sink, HBase sink, etc.
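
To make the intended topology concrete, here is a rough sketch of how it might look as a Flume configuration. This is only an illustration under assumptions: the agent names (tailer, collector), the log file path, port 41414, the interceptor class com.example.AppLogInterceptor, and the HBase table and column family names are all placeholders, and the logger sink merely stands in for the JMX sink.

```properties
# Tier 1: tails one log file and forwards events to the local Avro port.
# One such source (with its own interceptor) would exist per log file type.
tailer.sources = appLog
tailer.channels = mem
tailer.sinks = toCollector

tailer.sources.appLog.type = exec
tailer.sources.appLog.command = tail -F /var/log/app/app.log
tailer.sources.appLog.channels = mem
# Hypothetical custom interceptor that rewrites events into the common structure
tailer.sources.appLog.interceptors = normalize
tailer.sources.appLog.interceptors.normalize.type = com.example.AppLogInterceptor$Builder

tailer.channels.mem.type = memory

tailer.sinks.toCollector.type = avro
tailer.sinks.toCollector.hostname = localhost
tailer.sinks.toCollector.port = 41414
tailer.sinks.toCollector.channel = mem

# Tier 2: consolidated Avro source that fans out to one channel per sink.
collector.sources = fromTailers
collector.channels = toHBase toOther
collector.sinks = hbaseSink otherSink

collector.sources.fromTailers.type = avro
collector.sources.fromTailers.bind = 127.0.0.1
collector.sources.fromTailers.port = 41414
collector.sources.fromTailers.channels = toHBase toOther
# The replicating selector copies every event to all listed channels
collector.sources.fromTailers.selector.type = replicating

collector.channels.toHBase.type = memory
collector.channels.toOther.type = memory

collector.sinks.hbaseSink.type = hbase
collector.sinks.hbaseSink.table = logs
collector.sinks.hbaseSink.columnFamily = msg
collector.sinks.hbaseSink.channel = toHBase

# Placeholder for the JMX sink mentioned above
collector.sinks.otherSink.type = logger
collector.sinks.otherSink.channel = toOther
```

Both agents are shown in one file here for brevity; each would be started separately under its own agent name.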

First of all, is this an acceptable way to use Flume, and is there
anything I specifically need to take care of?

I also notice that the consolidated Avro source, which reads from the Avro
port, receives data from each upstream source only in blocks, with a latency
of around 20 seconds. Is it possible to reduce this latency so that the
consolidated Avro source receives events as soon as they are written to
their log files?

Thanks in advance!
