hadoop-user mailing list archives

From Kaalu Singh <kaalusingh1...@gmail.com>
Subject Question about Flume
Date Wed, 22 Jan 2014 22:52:01 GMT

I have the following use case:

I have data files being generated frequently on a certain machine, X. The
only way I can bring them into my Hadoop cluster is to SFTP to X at regular
intervals, fetch the files, and land them in HDFS.
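
To make the use case concrete, here is a rough sketch of the selection step
each poll would need, whether it ends up inside a Flume source or a
standalone application. Everything in it is a hypothetical illustration:
RemoteFile, select_new_files, and the 60-second "settle" window are names
and values assumed for the example, not part of Flume or any SFTP library.
It assumes the SFTP listing gives a name, size, and modification time for
each file, and that a file not modified recently is finished being written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteFile:
    """One entry from an SFTP directory listing on machine X (hypothetical)."""
    name: str
    size: int
    mtime: float  # modification time, seconds since the epoch


def select_new_files(listing, already_fetched, now, min_age_seconds=60.0):
    """Pick the files from one poll that look safe to fetch.

    A file is selected if it has not been fetched before and has not been
    modified for at least min_age_seconds (assumed to mean the writer is
    done with it). Results are ordered oldest first.
    """
    ready = [
        f for f in listing
        if f.name not in already_fetched
        and now - f.mtime >= min_age_seconds
    ]
    return sorted(ready, key=lambda f: (f.mtime, f.name))
```

The actual SFTP download and the write into HDFS are left out on purpose;
the set of already-fetched names would have to be persisted between polls
so that a restart does not re-ingest old files.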

I am new to Hadoop and to Flume. I read up on Flume and it seems like the
framework is appropriate for something like this, although I did not see an
available 'source' that does exactly what I am looking for. The lack of a
'source' plugin is not a deal breaker for me, as I can write one, but first
I want to make sure this is the right way to go. So, my questions are:

1. What are the pros/cons of using Flume for this use case?
2. Does anybody know of a source plugin that does what I am looking for?
3. Does anybody think I should not use Flume and instead write my own
application to handle this use case?

