hadoop-user mailing list archives

From Barak Yaish <barak.ya...@gmail.com>
Subject FileNotFoundException when getting files from DistributedCache
Date Thu, 22 Nov 2012 20:34:18 GMT
Hi,

I have a two-node cluster (v1.04): a master and a slave. On the master, in
Tool.run(), we add two files to the DistributedCache using addCacheFile().
Both files exist in HDFS. In Mapper.setup() we try to read those files from
the cache using FSDataInputStream fs = FileSystem.get(
context.getConfiguration() ).open( path ). The problem is that for one of
the files a FileNotFoundException is thrown, although the file exists on the
slave node:

attempt_201211211227_0020_m_000000_2: java.io.FileNotFoundException: File does not exist: /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv

ls -l on the slave:

[hduser@slave ~]$ ll /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
-rwxr-xr-x 1 hduser hadoop 42701 Nov 22 10:18
/somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
[hduser@slave ~]$
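
The same check could also be made from inside the task instead of from a shell, for example by calling something like the helper below from Mapper.setup() and logging the result. This is only a sketch: the class name LocalCacheFileCheck and the method describe() are made up for illustration and are not part of Hadoop. It uses plain java.io, so it only looks at the local disk of the node the task runs on, whereas FileSystem.get( conf ) returns whichever filesystem the configuration names as the default.

```java
import java.io.File;

// Hypothetical helper (not a Hadoop class): reports what the LOCAL disk
// says about a path, independent of the configured default filesystem.
class LocalCacheFileCheck {

    // Returns a one-line description of the path as seen by java.io.File.
    static String describe(String localPath) {
        File f = new File(localPath);
        if (!f.exists()) {
            return "missing on local disk: " + localPath;
        }
        return "present on local disk: " + localPath
                + " (readable=" + f.canRead() + ", bytes=" + f.length() + ")";
    }

    public static void main(String[] args) {
        // Default is the path copied from the stack trace above;
        // pass another path as the first argument to check that instead.
        String path = args.length > 0
                ? args[0]
                : "/somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/"
                  + "-7769715304990780/master/tmp/analytics/1.csv";
        System.out.println(describe(path));
    }
}
```

If this reports the file as present while the FileSystem call still throws, that would suggest the two are not looking at the same filesystem, rather than the file missing from the node.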

My questions are:

   1. Shouldn't all files exist on all nodes?
   2. What should be done to fix that?

Thanks.
