hadoop-hdfs-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Downloading data directly into HDFS
Date Thu, 29 Nov 2012 13:44:37 GMT
Not really the best tool. "Fuse"? (I forget the name.)

You do have other options. I saw one group take an open source FTP server and extend
it to write to HDFS. YMMV; however, the code to open a file on HDFS and write to it is pretty
trivial and straightforward. Not sure why Cloudera or Hortonworks haven't added this to their
management tools and then contributed it back to Apache. (Of course, that would assume that the
underlying open source FTP server isn't GPL'd and that its license is compatible with the Apache license.)
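
As a rough illustration of how little code the streaming approach needs, here is a minimal Python sketch of the same idea as the quoted curl pipeline: read a remote URL and feed the bytes straight into `hadoop fs -put - <path>` without staging them on local disk. This is not the FTP-server extension mentioned above, and it goes through the `hadoop` CLI rather than the Java FileSystem API. The function names `stream_to_command` and `download_to_hdfs` are hypothetical, the sketch assumes the `hadoop` command is on the PATH, and it leaves out the funzip decompression step.

```python
import shutil
import subprocess
import urllib.request


def stream_to_command(source, command, chunk_size=1 << 20):
    """Copy a readable binary stream into the stdin of `command`.

    Returns the exit code of the command. `command` is an argument list,
    for example ["hadoop", "fs", "-put", "-", "/path/filename"].
    """
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        # Copy in fixed-size chunks so the whole payload never sits in memory.
        shutil.copyfileobj(source, proc.stdin, chunk_size)
        proc.stdin.close()  # signal end of input to the command
    except BrokenPipeError:
        # The command exited before reading everything; its exit code
        # (returned below) reports the failure.
        pass
    return proc.wait()


def download_to_hdfs(url, hdfs_path):
    """Stream `url` into `hdfs_path` (hypothetical helper, assumes `hadoop` on PATH)."""
    with urllib.request.urlopen(url) as response:
        code = stream_to_command(response, ["hadoop", "fs", "-put", "-", hdfs_path])
    if code != 0:
        raise RuntimeError("hadoop fs -put exited with status %d" % code)
```

Calling `download_to_hdfs(remote_url, "/path/filename")` would then play the role of the shell pipeline; keeping the copy loop separate from the `hadoop` invocation makes it easy to swap in a different sink command.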

MapR's FS is more POSIX compliant and more stable as an NFS-mountable file system. With respect
to Fuse... not so much, unless it's been upgraded and better maintained.

It would be nice if the committers to Apache were to look back and rethink some of the features
of HDFS... (Just saying.)  

It's been a while since I followed Cleversafe's work. They may offer an alternative, though again
it's not core Hadoop.



On Nov 28, 2012, at 8:22 PM, Manoj Babu <manoj444@gmail.com> wrote:

> You can take a look at this: http://wiki.apache.org/hadoop/MountableHDFS
> Cheers!
> Manoj.
> On Thu, Nov 29, 2012 at 1:33 AM, Uri Laserson <laserson@cloudera.com> wrote:
> What is the best way to download data directly into HDFS from some remote source?
> I used this command, which works:
> curl <remote_url> | funzip | hadoop fs -put - /path/filename
> Is this the recommended way to go?
> Uri
> -- 
> Uri Laserson, PhD
> Data Scientist, Cloudera
> Twitter/GitHub: @laserson
> +1 617 910 0447
> laserson@cloudera.com
