hadoop-common-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: operation of DistributedCache following manual deletion of cached files?
Date Fri, 23 Sep 2011 13:35:01 GMT
Meng Mao,

The way the distributed cache is currently written, it does not verify the integrity of the
cache files at all after they are downloaded.  It just assumes that if they were downloaded
once, they are still there and in the proper shape.  It might be good to file a JIRA to add
some sort of check.  Also, the distributed cache records the time stamp of the original
file, in case you delete the file and then use a different version.  So if you want to force
a download again, you can copy the original file, delete the original, and then move the
copy back to the original path, so that the file there gets a new time stamp.
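
The behavior described above can be sketched with a toy model. This is an
illustration only, not Hadoop's actual DistributedCache code: the class
ToyCacheTracker, its methods, and the path /lookup/table.txt are all
hypothetical. It assumes the tracker remembers only the source path and its
time stamp, and never checks whether the local copy still exists.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: this is NOT Hadoop's DistributedCache implementation.
class ToyCacheTracker {
    // Source path -> time stamp recorded when the file was last localized.
    private final Map<String, Long> localizedTimestamps = new HashMap<>();
    // Paths whose local copy currently exists on this node's disk.
    private final Set<String> filesOnDisk = new HashSet<>();

    /** Returns true if this call downloaded the file, false if it trusted its record. */
    boolean localize(String sourcePath, long sourceTimestamp) {
        Long recorded = localizedTimestamps.get(sourcePath);
        if (recorded != null && recorded == sourceTimestamp) {
            // Same path and time stamp: assume the local copy is intact.
            // Note that filesOnDisk is never consulted here.
            return false;
        }
        // Unknown path or changed time stamp: download and record it.
        localizedTimestamps.put(sourcePath, sourceTimestamp);
        filesOnDisk.add(sourcePath);
        return true;
    }

    /** Simulates someone deleting the local copy behind the tracker's back. */
    void deleteLocalCopy(String sourcePath) {
        filesOnDisk.remove(sourcePath);
    }

    boolean isOnDisk(String sourcePath) {
        return filesOnDisk.contains(sourcePath);
    }

    public static void main(String[] args) {
        ToyCacheTracker tracker = new ToyCacheTracker();
        String path = "/lookup/table.txt"; // hypothetical source path

        System.out.println(tracker.localize(path, 100L)); // first job: downloads
        tracker.deleteLocalCopy(path);                     // manual cleanup removes the copy
        System.out.println(tracker.localize(path, 100L)); // same time stamp: download skipped
        System.out.println(tracker.isOnDisk(path));       // local copy is still missing
        System.out.println(tracker.localize(path, 200L)); // new time stamp: downloads again
        System.out.println(tracker.isOnDisk(path));       // local copy is back
    }
}
```

In this model, a manually deleted local copy is never restored while the
source time stamp stays the same. Giving the source file a new time stamp (the
copy, delete, and move sequence) is what makes the next job download it again.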

--Bobby Evans

On 9/23/11 1:57 AM, "Meng Mao" <mengmao@gmail.com> wrote:

We use the DistributedCache class to distribute a few lookup files for our
jobs. We have been aggressively deleting failed task attempts' leftover data,
and our script accidentally deleted the path to our distributed cache
files.

Our task attempt leftover data was here [per node]:
/hadoop/hadoop-metadata/cache/mapred/local/
and our distributed cache path was:
/hadoop/hadoop-metadata/cache/mapred/local/taskTracker/archive/<nameNode>
We deleted this path by accident.

Does this latter path look normal? I'm not that familiar with
DistributedCache but I'm up right now investigating the issue so I thought
I'd ask.

After that deletion, the first 2 jobs to run (which use the addCacheFile
method to distribute their files) didn't seem to push the files out to the
cache path, except on one node. Is this expected behavior? Shouldn't
addCacheFile check to see if the files are missing, and if so, repopulate
them as needed?

I'm trying to get a handle on whether it's safe to delete the distributed
cache path when the grid is quiet and no jobs are running. That is, I want to
know whether addCacheFile is designed to be robust against the files it's
caching not being present at each job start.

