hadoop-common-user mailing list archives

From Amareshwari Sriramadasu <amar...@yahoo-inc.com>
Subject Re: Using addCacheArchive
Date Fri, 26 Jun 2009 03:34:29 GMT
Hi Akhil,

DistributedCache.addCacheArchive takes a path on HDFS; from your code, it looks like you are
passing a local path.
Also, if you want to create a symlink, you should pass the URI as hdfs://<path>#<linkname>,
besides calling DistributedCache.createSymlink(conf).
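A minimal sketch of the fix described above. The namenode address and the HDFS path `/user/akhil1988/Config.zip` are assumptions for illustration; the essential part is the `#Config` fragment, which names the symlink Hadoop creates in the task's working directory. The DistributedCache calls are shown as comments so the snippet stays self-contained without a running cluster:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CacheArchiveExample {
    public static void main(String[] args) throws URISyntaxException {
        // Full HDFS URI (hypothetical namenode and path), with a fragment
        // naming the symlink to be created in the task's working directory.
        URI archive = new URI("hdfs://namenode:9000/user/akhil1988/Config.zip#Config");

        System.out.println(archive.getScheme());   // hdfs
        System.out.println(archive.getPath());     // /user/akhil1988/Config.zip
        System.out.println(archive.getFragment()); // Config

        // In the job driver (sketch, requires the Hadoop libraries):
        //   DistributedCache.addCacheArchive(archive, conf);
        //   DistributedCache.createSymlink(conf);
        // A map task can then read the unzipped contents through the symlink:
        //   new FileInputStream("Config/file1.config");
    }
}
```

With both calls in place, the archive is unzipped on each slave node and the `Config` symlink in the task's working directory points at the unzipped directory, so relative paths like `Config/file1.config` resolve.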


akhil1988 wrote:
> Please ask any questions if I am not clear above about the problem I am
> facing.
> Thanks,
> Akhil
> akhil1988 wrote:
>> Hi All!
>> I want a directory to be present in the local working directory of the
>> task for which I am using the following statements: 
>> DistributedCache.addCacheArchive(new URI("/home/akhil1988/Config.zip"),
>> conf);
>> DistributedCache.createSymlink(conf);
>> Here Config is a directory which I have zipped and put at the given
>> location in HDFS.
>> I have zipped the directory because the API doc of DistributedCache
>> (http://hadoop.apache.org/core/docs/r0.20.0/api/index.html) says that the
>> archive files are unzipped in the local cache directory:
>> DistributedCache can be used to distribute simple, read-only data/text
>> files and/or more complex types such as archives, jars etc. Archives (zip,
>> tar and tgz/tar.gz files) are un-archived at the slave nodes.
>> So, from my understanding of the API docs, I expect that the Config.zip
>> file will be unzipped to a Config directory, and since I have symlinked
>> them, I can access the directory as follows from my map function:
>> FileInputStream fin = new FileInputStream("Config/file1.config");
>> But I get the FileNotFoundException on the execution of this statement.
>> Please let me know where I am going wrong.
>> Thanks,
>> Akhil
