hadoop-common-user mailing list archives

From Kaalu Singh <kaalusingh1...@gmail.com>
Subject Re: Question about Flume
Date Wed, 22 Jan 2014 23:20:52 GMT
The closest built-in functionality to my use case is the "Spooling
Directory Source". I also like the idea of building this in a
higher-level language like Java, for reasons such as extensibility,
and I don't like the idea of relying on scripts.

However, I am soliciting opinions and can be swayed to change my mind.

Thanks for your response, Dhaval. I appreciate it.


On Wed, Jan 22, 2014 at 2:58 PM, Dhaval Shah <prince_mithibai@yahoo.co.in> wrote:

> Flume is useful for online log aggregation in a streaming format. Your use
> case seems more like a batch job, where you just need to grab the files
> and put them in HDFS at regular intervals. That can be achieved much more
> easily with a bash script run from cron.
> Regards,
> Dhaval
> ________________________________
> From: Kaalu Singh <kaalusingh1234@gmail.com>
> To: user@hadoop.apache.org
> Sent: Wednesday, 22 January 2014 5:52 PM
> Subject: Question about Flume
> Hi,
> I have the following use case:
> I have data files getting generated frequently on a certain machine, X.
> The only way I can bring them into my Hadoop cluster is by fetching them
> over SFTP at regular intervals and landing them in HDFS.
> I am new to Hadoop and to Flume. I read up about Flume and it seems like
> this framework is appropriate for something like this although I did not
> see an available 'source' that can do exactly what I am looking for.
> Unavailability of a 'source' plugin is not a deal breaker for me as I can
> write one but first I want to make sure this is the right way to go. So, my
> questions are:
> 1. What are the pros/cons of using Flume for this use case?
> 2. Does anybody know of a source plugin that does what I am looking for?
> 3. Does anybody think I should not use Flume and instead write my own
> application to achieve this use case?
> Thanks
> KS
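
For readers weighing the two options in this thread, the batch approach Dhaval suggests could be sketched roughly as below. This is only an illustration under stated assumptions, not anything posted by the participants: it assumes the files have already been fetched over SFTP into a local landing directory, that the `hdfs` command-line client is on the PATH, and that a `.COMPLETED` rename (borrowed from how the Spooling Directory Source marks finished files) is an acceptable way to avoid uploading a file twice. The function names and paths are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: files that were already uploaded carry this suffix.
DONE_SUFFIX = ".COMPLETED"


def pending_files(landing_dir: Path) -> list[Path]:
    """Return regular files in landing_dir not yet uploaded, sorted by name."""
    return sorted(
        p for p in landing_dir.iterdir()
        if p.is_file() and not p.name.endswith(DONE_SUFFIX)
    )


def put_command(local_file: Path, hdfs_dir: str) -> list[str]:
    """Build the `hdfs dfs -put` invocation for a single file."""
    target = f"{hdfs_dir.rstrip('/')}/{local_file.name}"
    return ["hdfs", "dfs", "-put", str(local_file), target]


def upload_all(landing_dir: Path, hdfs_dir: str, run=subprocess.run) -> list[Path]:
    """Upload each pending file, then rename it so the next run skips it.

    `run` is injectable so the logic can be exercised without a cluster.
    """
    uploaded = []
    for f in pending_files(landing_dir):
        run(put_command(f, hdfs_dir), check=True)  # raises if the put fails
        f.rename(f.with_name(f.name + DONE_SUFFIX))
        uploaded.append(f)
    return uploaded
```

Scheduled from cron, a call such as `upload_all(Path("/data/landing"), "/user/ks/incoming")` (both paths made up here) would cover the "put it in HDFS at regular intervals" part. It does not address retries, partially written files, or the SFTP fetch itself, which are the areas where a Flume source or a custom application might earn its keep.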
