hadoop-user mailing list archives

From Mohammad Tariq <donta...@gmail.com>
Subject Re: Copy files from remote folder to HDFS
Date Fri, 25 Jan 2013 04:56:31 GMT
Hello Panshul,

      You might find Apache Flume <http://flume.apache.org/> useful for this.
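
As a rough illustration of what that could look like, here is a minimal sketch of a Flume agent configuration that watches a local directory and writes whatever arrives there to HDFS. The agent name (agent1), the component names, the spool directory, and the NameNode address are all hypothetical placeholders, and it assumes Flume 1.3.0 or later, where the spooling directory source is available. Note that this source reads from a directory local to the machine running the agent, expects files to be complete and unchanged once they are dropped in, and by default turns each line of a file into one event, so it fits best when each JSON document sits on a single line.

```properties
# Hypothetical agent "agent1": one source, one channel, one sink.
agent1.sources = spool
agent1.channels = ch1
agent1.sinks = hdfsSink

# Spooling directory source: picks up files placed in spoolDir.
# The path below is a placeholder.
agent1.sources.spool.type = spooldir
agent1.sources.spool.spoolDir = /data/incoming/json
agent1.sources.spool.channels = ch1

# File channel: buffers events on disk between source and sink.
agent1.channels.ch1.type = file

# HDFS sink: the NameNode host, port, and target path are placeholders.
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.channel = ch1
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode.example.com:8020/data/json
# Write plain text instead of the default SequenceFile.
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.hdfs.writeFormat = Text
# Roll output files by size only (about 128 MB), so that many small
# inputs are combined into fewer, larger HDFS files.
agent1.sinks.hdfsSink.hdfs.rollInterval = 0
agent1.sinks.hdfsSink.hdfs.rollCount = 0
agent1.sinks.hdfsSink.hdfs.rollSize = 134217728
```

Assuming the file is saved as json-to-hdfs.conf (again, a made-up name), the agent could be started with something like: flume-ng agent --conf conf --conf-file json-to-hdfs.conf --name agent1. Rolling by size rather than by time or event count is one way to avoid creating a very large number of tiny files in HDFS, which matters when the inputs are only a few KB each.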

Warm Regards,

On Fri, Jan 25, 2013 at 6:39 AM, Panshul Whisper <ouchwhisper@gmail.com> wrote:

> Hello,
> I am trying to copy JSON files from a remote folder (a folder on
> my local system, a Cloudfiles folder, or a folder on an S3 server) to the
> HDFS of a cluster running at a remote location.
> The job-submitting application is based on Spring Hadoop.
> Can someone please suggest, or point me in the right direction for, the best
> option to achieve the above task:
> 1. Use Spring Integration data pipelines to poll the folders for files and
> copy them to HDFS as they arrive in the source folder. I have tried
> to implement the solution from the Spring Data book, but it does not run,
> and I have no idea what is wrong because it does not generate any logs.
> 2. Use some other script-based method to transfer the files.
> The main requirement: I need to transfer files from a remote folder to HDFS
> every day at a fixed time for processing in the Hadoop cluster. These files
> are collected from various sources in the remote folders.
> Please suggest an efficient approach. I have been searching and have found a
> lot of approaches, but I am unable to decide which will work best, as this
> transfer needs to be as fast as possible.
> The files to be transferred total almost 10 GB of JSON, with no file larger
> than 6 KB.
> Thanking You,
> --
>  Regards,
> Ouch Whisper
> 010101010101
