hadoop-mapreduce-user mailing list archives

From Amareshwari Sri Ramadasu <amar...@yahoo-inc.com>
Subject Re: distributed cache question
Date Wed, 05 May 2010 04:36:15 GMT
Hi Mark,

You need to pass the complete URL of the file on DFS to DistributedCache.addCacheFile.
For usage, please see http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#DistributedCache
and http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/filecache/DistributedCache.html
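
As a minimal sketch of what "complete" means here, the check below uses only java.net.URI to show the difference between a bare path and a URL that carries a scheme and authority. The host, port, and file path are hypothetical placeholders, not values from this thread; substitute whatever your cluster actually uses.

```java
import java.net.URI;

public class CacheUriCheck {
    // True when the location carries a scheme (for example hdfs://...),
    // which is what java.net.URI calls an "absolute" URI.
    static boolean hasScheme(String location) {
        return URI.create(location).isAbsolute();
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only.
        String bare = "/user/example/lookup.txt";
        String full = "hdfs://namenode.example.com:8020/user/example/lookup.txt";

        System.out.println(bare + " has scheme: " + hasScheme(bare));
        System.out.println(full + " has scheme: " + hasScheme(full));
    }
}
```

The second form is the kind of value that can be wrapped in a URI and handed to DistributedCache.addCacheFile.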

Thanks
Amareshwari

On 5/5/10 4:22 AM, "Mark Tozzi" <mark.tozzi@gmail.com> wrote:

Hi all,

I've been tinkering with Hadoop for some time, but am new to the
mailing list.  Please forgive me if this has already been asked and
answered.  I am attempting to use the Distributed Cache to allow my
MapReduce job to access some lookup files.  I have the following code
to add the files to the distributed cache (showing only a single file
for brevity):

tmpPath = new Path(cl.getOptionValue("lookup_file"));
conf.set("lookupfileName", tmpPath.getName());
DistributedCache.addCacheFile(tmpPath.toUri(), conf);
System.out.println("added " + tmpPath.toUri().toString() + " as " + tmpPath.getName());

and the following code in the Mapper.setup method to access these files:

Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
for (Path file : localFiles) {
    if (file.getName().equals(conf.get("lookupfileName"))) {
        parser.registerResource("bad_uas", new FileReader(new File(file.toUri())));
    }
    // further checks for other files in cache
}

This generates the exception "java.lang.IllegalArgumentException:
URI is not absolute" when I attempt to instantiate the File object.
The registerResource method currently accepts a Reader instance, from
which it pulls its information.  That method is under my control, and
I can change it to take a more appropriate input if one exists.

I have tried a few variations on this specific method, and all seem to
come back to the "URI is not absolute" error.  What is the piece I am
missing here?
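
The exception itself can be reproduced without Hadoop. The sketch below is an illustration using only java.io.File and java.net.URI, not code from this thread. It shows that the File(URI) constructor rejects a URI with no scheme, and that building the File from the path string is one possible way around it. The class name, method names, and file paths are hypothetical. With a Hadoop Path, the analogous step would presumably be new File(file.toString()), assuming the cached path refers to the local filesystem.

```java
import java.io.File;
import java.net.URI;

public class LocalFileFromUri {
    // Attempts new File(URI); returns the exception message on failure,
    // or null if the constructor accepted the URI.
    static String tryFileFromUri(URI uri) {
        try {
            new File(uri);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    // Possible workaround: ignore the (missing) scheme and use the path string.
    static File fileFromPath(URI uri) {
        return new File(uri.getPath());
    }

    public static void main(String[] args) {
        // Hypothetical local cache location, for illustration only.
        URI noScheme = URI.create("/tmp/cache/bad_uas.txt");
        URI withScheme = URI.create("file:/tmp/cache/bad_uas.txt");

        System.out.println("no scheme:   " + tryFileFromUri(noScheme));
        System.out.println("with scheme: " + tryFileFromUri(withScheme));
        System.out.println("from path:   " + fileFromPath(noScheme).getName());
    }
}
```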

Thanks,

--Mark Tozzi

