hadoop-hdfs-user mailing list archives

From rab ra <rab...@gmail.com>
Subject Re: HDFS data transfer is faster than SCP based transfer?
Date Sat, 25 Jan 2014 03:44:13 GMT
It is not a single file but a lot of small files. The files are stored in HDFS,
and the map tasks copy the files they require from HDFS. One map process runs
on one node only. Each file is about 16 MB.
On 24 Jan 2014 23:49, "Vinod Kumar Vavilapalli" <vinodkv@hortonworks.com> wrote:

> Is it a single file? Lots of files? How big are the files? Is the copy on
> a single node or are you running some kind of a MapReduce program?
> +Vinod
> Hortonworks Inc.
> http://hortonworks.com/
> On Fri, Jan 24, 2014 at 7:21 AM, rab ra <rabmdu@gmail.com> wrote:
>> Hi
>> Can anyone please answer my query?
>> -Rab
>> ---------- Forwarded message ----------
>> From: "rab ra" <rabmdu@gmail.com>
>> Date: 24 Jan 2014 10:55
>> Subject: HDFS data transfer is faster than SCP based transfer?
>> To: <user@hadoop.apache.org>
>> Hello
>> I have a use case that requires transferring input files from remote
>> storage over the SCP protocol (using the JSch jar). To optimize this use
>> case, I pre-loaded all my input files into HDFS and modified my application
>> so that it copies the required files from HDFS. So, when the tasktrackers
>> run, each copies the input files it requires from HDFS to its local
>> directory. All my tasktrackers are also datanodes. I can see that my use
>> case now runs faster. The only modification in my application is that files
>> are copied from HDFS instead of being transferred using SCP. Also, my use
>> case involves parallel operations (run on the tasktrackers), and they do a
>> lot of file transfers. All of these transfers are now replaced with HDFS
>> copies.
>> Can anyone tell me whether HDFS transfer is really faster, as I observed?
>> Is it because it uses TCP/IP? Can anyone give me plausible reasons that
>> explain the decrease in time?
>> with thanks and regards
>> rab
