hadoop-common-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Distributed cache - are files unique per job?
Date Tue, 29 Sep 2009 17:10:36 GMT



On 9/29/09 2:55 AM, "Erik Forsberg" <forsberg@opera.com> wrote:
> If I distribute files using the Distributed Cache (-archives option),
> are they guaranteed to be unique per job, or is there a risk that if I
> distribute a file named A with job 1, job 2 which also distributes a
> file named A will read job 1's file?

From my understanding, at one point in time there was a 'shortcut' in the
system that did exactly what you fear.  If the same cache file name was
specified by multiple jobs, they'd get the same file as it was assumed they
were the same file.  I *think* this has been fixed though.

[Needless to say, for automated jobs that push security keys through a cache
file, this is bad.]
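
If you would rather not depend on that fix being in your version, one defensive option is to make sure two different files never share a cache name in the first place. The following is only a sketch of that idea, not anything Hadoop provides: the helper name unique_cache_name and the 12-character digest prefix are assumptions made for illustration. It derives the name from the file's contents, so job 1's A and job 2's A only end up with the same name if they really are the same bytes.

```python
import hashlib
from pathlib import Path


def unique_cache_name(path: str, chunk_size: int = 1 << 20) -> str:
    """Return '<stem>-<sha256 prefix><suffix>' for the file at `path`.

    Hypothetical helper: files with different contents get different
    names, identical files get the same name.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    p = Path(path)
    # Only the last extension is kept as the suffix (A.zip -> A-<hash>.zip).
    return f"{p.stem}-{digest.hexdigest()[:12]}{p.suffix}"
```

You would then copy or upload the archive under the returned name before handing it to -archives. This avoids accidental name collisions between jobs; it does not by itself protect a key file from anyone who can read the cache directory.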

