hadoop-user mailing list archives

From Mohan Radhakrishnan <radhakrishnan.mo...@gmail.com>
Subject Managed File Transfer
Date Mon, 07 Jul 2014 14:02:08 GMT
We used a commercial file transfer and scheduler tool in clustered mode.
This was a traditional active-active cluster that supported multiple
protocols, such as FTPS.

Now I am interested in evaluating a distributed way of crawling FTP
sites and downloading files using Hadoop. Since we have to process
thousands of files, I thought Hadoop jobs could do it.

Are Hadoop jobs used for this type of file transfer?

There is also a requirement for a scheduler. What does the forum
recommend?

