hadoop-hdfs-user mailing list archives

From Wilm Schumacher <wilm.schumac...@gmail.com>
Subject Re: Copying files to hadoop.
Date Wed, 17 Dec 2014 22:56:07 GMT
On 17.12.2014 at 23:29, Anil Jagtap wrote:
> Dear All,
>
> I'm pretty new to Hadoop technology and the Linux environment, hence
> I'm struggling to find solutions even for the basic stuff.
>
> For now, the Hortonworks Sandbox is working fine for me and I managed to
> connect to it through SSH.
>
> Now I have some csv files in my Mac OS folders which I want to copy
> onto Hadoop. As far as I know, I can copy those files to Linux first
> and then put them into Hadoop. But is there a way to copy them to
> Hadoop directly from the Mac OS folder with just one command?
Yes, there is:

cat /path/to/your/local/file.csv | ssh hadoopuser@namenode
"/remote/server/path/to/hadoop fs -put - /hadoop/folder/name/file.csv"

As you wrote that you are also new to Linux/Unix, here is what the
one-liner at the top does:

* cat => concatenate the given files (here only one) and print them to
standard output

* pipe | => write the standard output of the left-hand side to the
standard input of the right-hand side

* ssh => read from its own standard input and forward it to the standard
input of the command running on the remote server, which is the hadoop
fs -put command; the "-" argument tells it to read from stdin (a tiny
demo of this mechanism, without Hadoop, follows right after this list)
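
If you want to see that stdin forwarding in isolation, here is a small
demo that does not involve Hadoop at all (same placeholder hostname and
path as above):

cat /path/to/your/local/file.csv | ssh hadoopuser@namenode "wc -c"

This prints the size of your local file in bytes, but the counting happens
on the remote machine: the bytes travel through the pipe and the ssh
connection exactly as in the hadoop command above.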

Thus you are actually piping the content of the file through three
programs. That is a bit of a hack, and in my opinion there is no reason
to do it if your file is small enough to fit on the remote server anyway.
It's like asking "is it possible to reach my destination using only left
turns?". Well ... it's possible, but not always a good idea ;).
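
For completeness, the plain two-step route you already had in mind would
look roughly like this (again with the placeholder paths from above;
/tmp/file.csv on the sandbox is just an example landing spot):

scp /path/to/your/local/file.csv hadoopuser@namenode:/tmp/file.csv
ssh hadoopuser@namenode \
  "/remote/server/path/to/hadoop fs -put /tmp/file.csv /hadoop/folder/name/file.csv"

This needs enough free space on the remote box for the temporary copy,
which is exactly the case in which I would not bother with the pipe trick.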

Best

Wilm
