hadoop-common-user mailing list archives

From Amareshwari Sriramadasu <amar...@yahoo-inc.com>
Subject Re: Hadoop Streaming -file option
Date Wed, 25 Feb 2009 05:21:03 GMT
Arun C Murthy wrote:
>
> On Feb 23, 2009, at 2:01 AM, Bing TANG wrote:
>
>> Hi, everyone,
>> Could someone explain how the "-file" option works in Hadoop
>> Streaming? I want to ship a big file to the slaves, so how does it work?
>>
>> Does Hadoop use "SCP" to copy? How does Hadoop handle the -file option?
>>
>
> No, -file just copies the file from the local filesystem to HDFS, and 
> the DistributedCache copies it to the local filesystem of the node on 
> which the map/reduce task runs.
>
The -file option does not use the DistributedCache yet; HADOOP-2622 is still
open for that. Instead, -file ships the files along with the streaming
jar (it unpacks the jar, copies the files in, and packs the jar again). You
can use -files, -libjars and -archives to copy files to the distributed
cache.
-Amareshwari
> Arun
>
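
The repacking step described in the reply (unpack the jar, copy the files in, pack the jar again) can be sketched roughly as below. This is only an illustration in Python of the general idea, not Hadoop's actual implementation: the function name repack_jar_with_files and all file names are hypothetical, and it relies only on the fact that a jar is a zip archive.

```python
import os
import zipfile


def repack_jar_with_files(jar_path, extra_files, out_path):
    """Write a copy of jar_path to out_path with extra_files added at the jar root.

    Hypothetical helper for illustration only; it is not part of Hadoop.
    """
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as dst:
        # Copy every existing entry of the original jar into the new jar.
        for info in src.infolist():
            dst.writestr(info, src.read(info.filename))
        # Add each shipped file under its base name, next to the jar's own entries.
        for path in extra_files:
            dst.write(path, arcname=os.path.basename(path))
    return out_path
```

Because the shipped files travel inside the jar in this scheme, a big file makes the repacked jar correspondingly bigger.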

