hadoop-common-user mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: Distributed cache - are files unique per job?
Date Tue, 29 Sep 2009 14:30:55 GMT
When you use the command-line -archives option, a directory named
"archives" is created in HDFS under the per-job submission area to store
the archives.
So there should be no collisions, as long as no other JobTracker is using
the same system directory path (conf.get("mapred.system.dir",
"/tmp/hadoop/mapred/system")) in your HDFS.

On Tue, Sep 29, 2009 at 2:55 AM, Erik Forsberg <forsberg@opera.com> wrote:

> Hi!
> If I distribute files using the Distributed Cache (-archives option),
> are they guaranteed to be unique per job, or is there a risk that if I
> distribute a file named A with job 1, job 2 which also distributes a
> file named A will read job 1's file?
> I think they are unique per job, just want to verify that.
> Thanks,
> \EF
> --
> Erik Forsberg <forsberg@opera.com>
> Developer, Opera Software - http://www.opera.com/

Pro Hadoop, a book to guide you from beginner to hadoop mastery,
www.prohadoopbook.com a community for Hadoop Professionals
