chukwa-user mailing list archives

From Jonathan Mervine <>
Subject Trying to determine if Chukwa is what I need
Date Fri, 27 Jun 2014 20:38:24 GMT
Hey, I came across Chukwa from a blog post, and it looks like there is a real effort in
collecting data from multiple sources and pumping it into HDFS.

I was looking at this pdf from the wiki

And the chart in the middle seems to imply that two of the agents you can have are one that takes
in streaming data and one that is associated with Log4J and works with log files in particular.

I'm pretty new to Hadoop, so I'm trying to learn a lot about it in a short time, but what I'm
looking for is some kind of system that will monitor a directory somewhere for files being
placed there. I don't know what kinds of files they could be: CSVs, PSVs, DOCs, TXTs, and
many others. A later stage would be formatting, parsing, and analyzing, but for now I just want
to be able to detect when a file is placed there. After a file has been detected, it should
be sent on its way to be placed into HDFS. This should be a completely autonomous and
automatic process (or as much as possible).
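
To make the requirement concrete, here is a rough sketch of the behaviour described above. It is
not Chukwa and not any particular project's API, just a plain polling loop using the Python
standard library. The function names (find_new_files, put_into_hdfs, watch), the polling interval,
and the choice to shell out to the stock `hdfs dfs -put` command are all assumptions made for
illustration.

```python
import os
import subprocess
import time


def find_new_files(directory, seen):
    """Return regular files in `directory` not yet in `seen`, recording them in `seen`."""
    new_files = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and anything already handed off.
        if os.path.isfile(path) and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


def put_into_hdfs(local_path, hdfs_dir):
    """Copy one local file into HDFS (assumes the `hdfs` CLI is on PATH)."""
    subprocess.run(["hdfs", "dfs", "-put", local_path, hdfs_dir], check=True)


def watch(directory, hdfs_dir, interval_seconds=5.0, upload=put_into_hdfs):
    """Poll `directory` forever, uploading each newly detected file once."""
    seen = set()
    while True:
        for path in find_new_files(directory, seen):
            upload(path, hdfs_dir)
        time.sleep(interval_seconds)
```

Something like watch("/data/incoming", "/landing") would then run unattended. One gap in this
sketch is that a file could be picked up while it is still being written; a real setup would
probably need writers to move finished files into the directory, or a check that the size has
stopped changing.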

Is this something Chukwa can help me with? If not, do you know of any system that might do
what I want? I've read a little about Oozie, Falcon, Flume, Scribe, and a couple of other
projects, but I don't think I've found what I'm looking for. Also, any information you could
provide to help me on my way or clear up any misunderstanding I may have would be great!

