hadoop-mapreduce-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Query regarding internal/working of hadoop fs -copyFromLocal and fs.write()
Date Wed, 01 Jun 2011 00:05:34 GMT
They write directly to HDFS; there is no additional buffering on the
client's local file system.
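
A minimal sketch of that data path, using only the JDK so it runs without a
cluster. The class name DirectStreamCopy and the copy() helper are
hypothetical, and the ByteArrayOutputStream is only a stand-in for the stream
you would get from FileSystem.create(Path) in real Hadoop code. The point is
that bytes pass through a small in-memory buffer straight to the output
stream, and no temporary file is created on the client:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration, not Hadoop code: models a client that streams
// data to its destination through an in-memory buffer only.
final class DirectStreamCopy {

    // Copies everything from 'in' to 'out' using a fixed-size in-memory
    // buffer and returns the number of bytes copied. Nothing is staged on
    // the local file system.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // hand the chunk to the destination stream
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins: in real code the source would be a local FileInputStream
        // and the sink would be the stream returned by FileSystem.create(Path).
        InputStream source = new ByteArrayInputStream(new byte[10_000]);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(source, sink, 4096);
        System.out.println("copied " + copied + " bytes");
    }
}
```

With a real HDFS client you would pass the FSDataOutputStream from
FileSystem.create(Path) as 'out'; the loop itself stays the same.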


On Tue, May 31, 2011 at 7:56 PM, Mapred Learn <mapred.learn@gmail.com> wrote:
> Hi guys,
> I asked this question earlier but did not get a response, so I am posting
> it again. I hope somebody can point me to the right description.
> When you run hadoop fs -copyFromLocal or use the API to call fs.write() (when
> the FileSystem fs is HDFS), does it write to the local filesystem first,
> before writing to HDFS?
> I read that it writes to the local filesystem until the block size is
> reached and then writes to HDFS.
> Wouldn't the HDFS client choke on local filesystem writes if multiple
> fs -copyFromLocal commands are running? I thought that at least with
> fs.write(), if you provide a byte array, it should not write to the local
> filesystem.
> In some places I read that the HDFS client and the datanode communicate
> through RPC/sockets. Do they also write to the local filesystem in this
> case, or is it just an in-memory buffer that is written directly to HDFS?
> Could somebody point me to some doc/code that explains how fs
> -copyFromLocal and fs.write() work? Do they write to the local filesystem
> until the block size is reached and then write to HDFS, or do they write
> directly to HDFS?
> Thanks in advance,
> -JJ

Joseph Echeverria
Cloudera, Inc.
