hadoop-mapreduce-user mailing list archives

From: Amareshwari Sri Ramadasu <amar...@yahoo-inc.com>
Subject: Re: Running into problems with DistributedCache
Date: Fri, 16 Apr 2010 05:20:02 GMT
Looking at the following error:
2010-04-15 17:26:09,746 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201004151709_0001_m_000001_3:
java.io.IOException: Cannot open filename /srv/hadoop/data/hadoop/mapred/local/taskTracker/archive/hadoop-eventlog01.socialmedia.com/user/knuttycombe/socialmedia.mr_tool.serfile/c363f0f6-28ac-4365-ba93-fec6e5188741.ser/c363f0f6-28ac-4365-ba93-fec6e5188741.ser
 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1497)
 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1488)
 at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:376)
 at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
 at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
 at socialmedia.common.hadoop.DistCacheResources$$anonfun$init$2$$anonfun$apply$3.apply(DistCacheResources.scala:54)
 at socialmedia.common.hadoop.DistCacheResources$$anonfun$init$2$$anonfun$apply$3.apply(DistCacheResources.scala:54)
 at socialmedia.common.util.Util$.using(Util.scala:20)
 at socialmedia.common.hadoop.DistCacheResources$$anonfun$init$2.apply(DistCacheResources.scala:53)
 at socialmedia.common.hadoop.DistCacheResources$$anonfun$init$2.apply(DistCacheResources.scala:52)
 at scala.Option.map(Option.scala:70)
 at socialmedia.common.hadoop.DistCacheResources$class.init(DistCacheResources.scala:51)
 at socialmedia.somegra.reporting.SeriesMetricsMapper.init(HDFSMetricsQuery.scala:185)
 at socialmedia.somegra.reporting.SeriesMetricsMapper.setup(HDFSMetricsQuery.scala:192)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.Child.main(Child.java:170)

Your map task is trying to read the file from DFS, but this file will be on the local FileSystem.

Your code below should create the FileSystem using FileSystem.getLocal(conf), or you can use
java.io.File directly to access the file; a sketch of the first option follows the snippet.
      resPath => using(resPath.getFileSystem(conf)) {
        fs => using(fs.open(resPath)) {
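A minimal sketch of that fix (openCached is a hypothetical helper; it assumes resPath is the
localized path, e.g. one of the paths returned by DistributedCache.getLocalCacheFiles(conf)):

      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.{FileSystem, Path}

      // Open a localized cache file through the local filesystem
      // instead of HDFS. resPath is assumed to be a localized path,
      // e.g. from DistributedCache.getLocalCacheFiles(conf).
      def openCached(conf: Configuration, resPath: Path) =
        FileSystem.getLocal(conf).open(resPath)

The stream it returns can then be wrapped in your using block exactly as before.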
Hope this helps you.

Thanks
Amareshwari

On 4/15/10 11:36 PM, "Kris Nuttycombe" <kris.nuttycombe@gmail.com> wrote:

Hi, all,

I'm having problems with my Mapper instances accessing the
DistributedCache. A bit of background:

I'm running on a single-node cluster, just trying to get my first
map/reduce job working. Both the job tracker and the primary namenode
run on the same host. In the client I can successfully add a file to
the distributed cache, but when my Mapper instance attempts to read
the file it fails, even though the path it fails on exists on the
system where the job is running.
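For reference, the client-side registration is just the standard pattern, roughly the
following (the placeholder URI stands in for the real path, which is in the paste below):

      import java.net.URI
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.filecache.DistributedCache

      // Register the HDFS file with the cache before submitting the job.
      val conf = new Configuration()
      DistributedCache.addCacheFile(new URI("/user/knuttycombe/<file>.ser"), conf)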

Here is a paste detailing the code where the error occurs, related
log output from the node where the job runs, and filesystem
information from that node:

http://paste.pocoo.org/show/202242/

The failure appears to originate from these lines in DFSClient.java:

      LocatedBlocks newInfo = callGetBlockLocations(namenode, src, 0,
                                                    prefetchSize);
      if (newInfo == null) {
        throw new IOException("Cannot open filename " + src);
      }

I've tried to trace back through the code to figure out why
newInfo might be null, but I quickly got lost. Can someone please help
me figure out why it can't find this file?

Thank you,

Kris

