hadoop-common-user mailing list archives

From Diego Ceccarelli <diego.ceccare...@gmail.com>
Subject Re: problem using getLocalCacheArchives in DistributeCache
Date Thu, 19 May 2011 10:50:33 GMT
Dear all,

I finally solved the DistributedCache issue by using a symlink.
Before launching the job, I added:

//activate symlink
DistributedCache.createSymlink(jobConf);
URI archiveUri = new URI(hdfsArchivePath+"#symbolicName");
DistributedCache.addCacheArchive(archiveUri, jobConf);
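
To make the role of the "#symbolicName" suffix explicit: it is the fragment of the URI, and the path before it still points at the archive. The sketch below uses only java.net.URI, no Hadoop classes. The class name CacheUriDemo and its helper methods are hypothetical, and the archive path is the one from my earlier message.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CacheUriDemo {
    /** Returns the URI fragment (the text after '#'), used here as the symlink name. */
    public static String linkName(URI cacheUri) {
        return cacheUri.getFragment();
    }

    /** Returns the path of the archive itself, without the fragment. */
    public static String archivePath(URI cacheUri) {
        return cacheUri.getPath();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Example path taken from the earlier message; adjust to your own layout.
        String hdfsArchivePath = "/user/myuser/myproject/mapfile.zip";
        URI archiveUri = new URI(hdfsArchivePath + "#symbolicName");

        System.out.println(archivePath(archiveUri));
        System.out.println(linkName(archiveUri));
    }
}
```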


Then, inside the job, I used:

URL resource = jobConf.getResource("#symbolicName");

Now "resource" holds the path of the local directory where the
archive has been decompressed.
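
As a rough illustration of what can be done once that directory is known, the sketch below lists its contents with plain java.io.File. It does not use getResource: it assumes instead that the symlink shows up under its symbolic name in the task's working directory, which you should verify on your own cluster. The class LocalArchiveDemo and the method listEntries are hypothetical names.

```java
import java.io.File;
import java.util.Arrays;

public class LocalArchiveDemo {
    /**
     * Lists the entries of the unpacked archive, sorted by name.
     * Assumes the symlink named linkName exists inside workingDir.
     */
    public static String[] listEntries(File workingDir, String linkName) {
        File dir = new File(workingDir, linkName);
        if (!dir.isDirectory()) {
            throw new IllegalStateException("Not a directory: " + dir.getPath());
        }
        String[] names = dir.list();
        if (names == null) {
            throw new IllegalStateException("Cannot list: " + dir.getPath());
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // "." stands for the task's working directory (an assumption, see above).
        File workingDir = new File(".");
        if (new File(workingDir, "symbolicName").isDirectory()) {
            for (String name : listEntries(workingDir, "symbolicName")) {
                System.out.println(name);
            }
        } else {
            System.out.println("symbolicName not found in " + workingDir.getAbsolutePath());
        }
    }
}
```

For a zipped MapFile I would expect the listing to show the MapFile's own files, possibly under a top-level folder depending on how the zip was built.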
Hope it helps.

Best,
Diego

On Mon, May 16, 2011 at 11:00 PM, Diego Ceccarelli
<diego.ceccarelli@gmail.com> wrote:
> Hi all,
> I'm trying to distribute a MapFile to the local nodes using Hadoop's
> DistributedCache. As The Definitive Guide suggests, since a MapFile is a
> collection of files with a defined directory structure, I zipped it and
> copied it to HDFS:
>
> bin/hadoop fs -copyFromLocal mapfile.zip /user/myuser/myproject/
>
> and I tried to use the DistributedCache to send a copy of the mapfile
> to each node (as explained in [1]). So I set
>
> DistributedCache.addCacheArchive(new
> Path("/user/myuser/myproject/mapfile.zip").toUri(), jobConf);
>
> and then in the reduce step I put:
>
> Path[] files = DistributedCache.getLocalCacheArchives(conf);
>
> This returns the path of the zipped file on the local node, whereas,
> according to [1], I expected to find the extracted archive:
>
> "DistributedCache can be used to distribute simple, read-only
> data/text files and/or more complex types such as archives, jars etc.
> Archives (zip, tar and tgz/tar.gz files) are un-archived at the slave
> nodes."
>
> I also tried to unzip the file myself, but I never find the files at
> the expected path.
> Does anyone know where my mistake is? Could anyone show me some code
> to locally access a file within an archive?
>
> Thanks in advance!
> Diego
>
>
> [1] http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/filecache/DistributedCache.html
>



-- 
Computers are useless. They can only give you answers.
(Pablo Picasso)
_______________
Diego Ceccarelli
High Performance Computing Laboratory
Information Science and Technologies Institute (ISTI)
Italian National Research Council (CNR)
Via Moruzzi, 1
56124 - Pisa - Italy

Phone: +39 050 315 3055
Fax: +39 050 315 2040
________________________________________
