hadoop-mapreduce-user mailing list archives

From Mapred Learn <mapred.le...@gmail.com>
Subject Query regarding internal/working of hadoop fs -copyFromLocal and fs.write()
Date Tue, 31 May 2011 23:56:53 GMT
Hi guys,
I asked this question earlier but did not get any response, so I am posting
it again. I hope somebody can point me to the right description:

When you run hadoop fs -copyFromLocal or call fs.write() through the API
(where the FileSystem fs is HDFS), does the client write to the local
filesystem first before writing to HDFS?

I read that it writes to the local filesystem until the block size is
reached and only then writes to HDFS.
Wouldn't the HDFS client choke on local filesystem writes if multiple such
fs -copyFromLocal commands are running at once? I thought that at least with
fs.write(), if you provide a byte array, it should not write to the local
filesystem?

In some places I read that the HDFS client and datanode communicate through
RPC/sockets. In that case, do they also write to the local filesystem, or is
the data just held in a buffer in memory and written directly to HDFS?
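
To make the in-memory alternative concrete, here is a minimal sketch of the
behaviour I am asking about: bytes accumulate in a fixed-size buffer in memory
and are pushed to a socket-like sink whenever the buffer fills, with no local
file involved. This is a hypothetical illustration, not Hadoop code: the class
name PacketBufferedStream, the packetsSent counter, and the packet size are
all my own assumptions, and whether the real HDFS client works this way is
exactly what I would like to confirm.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration only; this is NOT the Hadoop client implementation.
class PacketBufferedStream extends OutputStream {
    private final OutputStream sink; // stands in for a socket to a datanode
    private final byte[] packet;     // in-memory buffer, no local file is used
    private int used = 0;            // bytes currently held in the buffer
    int packetsSent = 0;             // number of buffers pushed to the sink

    PacketBufferedStream(OutputStream sink, int packetSize) {
        if (packetSize <= 0) {
            throw new IllegalArgumentException("packetSize must be positive");
        }
        this.sink = sink;
        this.packet = new byte[packetSize];
    }

    @Override
    public void write(int b) throws IOException {
        packet[used++] = (byte) b;
        if (used == packet.length) {
            sendPacket();
        }
    }

    @Override
    public void write(byte[] data, int off, int len) throws IOException {
        // Copy the caller's byte array into the buffer, sending each full packet.
        while (len > 0) {
            int n = Math.min(len, packet.length - used);
            System.arraycopy(data, off, packet, used, n);
            used += n;
            off += n;
            len -= n;
            if (used == packet.length) {
                sendPacket();
            }
        }
    }

    // Push whatever is buffered to the sink and start a new packet.
    private void sendPacket() throws IOException {
        sink.write(packet, 0, used);
        used = 0;
        packetsSent++;
    }

    @Override
    public void flush() throws IOException {
        if (used > 0) {
            sendPacket();
        }
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }
}
```

In this sketch, writing a 10-byte array with a 4-byte packet size sends two
full packets immediately and holds the last 2 bytes in memory until flush()
or close() is called.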

Could somebody point me to some doc/code where I can find out how fs
-copyFromLocal and fs.write() work? Do they write to the local filesystem
until the block size is reached and then write to HDFS, or write directly to

Thanks in advance,
