hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1568) TrackerDistributedCacheManager should do deleteLocalPath asynchronously
Date Fri, 23 Apr 2010 18:56:51 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860350#action_12860350 ]

Scott Chen commented on MAPREDUCE-1568:

Hey Amareshwari,

deleteCache first takes the global lock over all caches and puts the entries that have a zero
reference count into toBeDeleted (this was done by you guys in MAPREDUCE-1098). The asynchronous
deletion starts from there.

When the deletion condition is valid, only one task gets the global lock, and by the time it
releases the global lock the deletion condition is no longer valid. So there cannot
be two threads deleting the same set of caches at the same moment.

  private void deleteCache(Configuration conf) throws IOException {
    Collection<CacheStatus> toBeDeleted = new LinkedList<CacheStatus>();
    synchronized (cachedArchives) {  // Global lock of all caches
      // Find CacheStatus entries with a refcount of zero and put them into toBeDeleted
      ...
    }
    // Do the deletion asynchronously, after releasing the global lock
    ...
  }
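
To make the argument above concrete, here is a rough, self-contained sketch of the same pattern. It is not the code from the patch: the class AsyncCacheCleanup, its methods, the String-to-refcount map, and the single-thread executor are all assumptions made for illustration, and the "delete" just records the path instead of touching the file system. The point it shows is that entries are removed from the shared map while the global lock is held, so a second caller finds nothing left to delete.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; class, field and method names are illustrative only.
class AsyncCacheCleanup {
  // Cache path -> reference count; the map object doubles as the global lock.
  private final Map<String, Integer> cachedArchives = new LinkedHashMap<String, Integer>();
  // Records what the background worker "deleted" (stands in for a real file delete).
  private final List<String> deleted = new ArrayList<String>();
  private final ExecutorService deleter = Executors.newSingleThreadExecutor();

  void add(String path, int refcount) {
    synchronized (cachedArchives) {
      cachedArchives.put(path, refcount);
    }
  }

  void deleteCache() {
    final List<String> toBeDeleted = new ArrayList<String>();
    synchronized (cachedArchives) {  // global lock of all caches
      Iterator<Map.Entry<String, Integer>> it = cachedArchives.entrySet().iterator();
      while (it.hasNext()) {
        Map.Entry<String, Integer> entry = it.next();
        if (entry.getValue() == 0) {
          toBeDeleted.add(entry.getKey());
          it.remove();  // after this, no other caller can pick the same entry
        }
      }
    }
    if (toBeDeleted.isEmpty()) {
      return;
    }
    // The slow part runs in the background, outside the global lock.
    deleter.execute(() -> {
      synchronized (deleted) {
        deleted.addAll(toBeDeleted);
      }
    });
  }

  // Stops accepting work and waits for queued deletions to finish.
  boolean shutdownAndWait() {
    deleter.shutdown();
    try {
      return deleter.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  List<String> deletedPaths() {
    synchronized (deleted) {
      return new ArrayList<String>(deleted);
    }
  }
}
```

The caller of deleteCache() returns as soon as the work is handed to the executor, which is the behaviour this issue asks for.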

A separate cleanup thread is another option. I think that would work fine as well, but it
would require more changes. I think the good thing about the current patch is that it is simple
and safe.
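
For comparison, a separate cleanup thread could look roughly like the sketch below. Nothing here comes from the patch or from the existing code: CleanupThreadSketch, the queue of path strings, and the stop sentinel are hypothetical, and the "delete" again only records the path. Tasks would only enqueue paths, and one dedicated thread would do all the deleting.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the alternative; all names are illustrative only.
class CleanupThreadSketch {
  // Sentinel compared by identity, used to tell the cleaner to exit.
  private static final String STOP = new String("stop");

  private final BlockingQueue<String> pending = new LinkedBlockingQueue<String>();
  private final List<String> deleted = new CopyOnWriteArrayList<String>();
  private final Thread cleaner = new Thread(this::drain, "cache-cleanup");

  CleanupThreadSketch start() {
    cleaner.setDaemon(true);
    cleaner.start();
    return this;
  }

  // Runs on the cleanup thread: take paths one by one and "delete" them.
  private void drain() {
    try {
      while (true) {
        String path = pending.take();
        if (path == STOP) {
          return;
        }
        deleted.add(path);  // placeholder for the real recursive delete
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Called by tasks; returns immediately.
  void scheduleDelete(String path) {
    pending.add(path);
  }

  // Lets queued deletions finish, then stops the cleaner.
  List<String> stopAndGetDeleted() {
    pending.add(STOP);
    try {
      cleaner.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return deleted;
  }
}
```

This needs thread lifecycle handling (start, stop, shutdown ordering) that the current patch avoids, which is what I mean by more changes.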

> TrackerDistributedCacheManager should do deleteLocalPath asynchronously
> -----------------------------------------------------------------------
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1568.txt
> TrackerDistributedCacheManager.deleteCache() has been improved:
> MAPREDUCE-1302 makes TrackerDistributedCacheManager rename the caches in the main thread
> and then delete them in the background.
> MAPREDUCE-1098 avoids global locking while doing the renaming (renaming lots of directories
> can also take a long time).
> But the deleteLocalCache is still in the main thread of TaskRunner.run(), so it will
> still slow down the task that triggers the deletion (originally this blocked all tasks,
> but that was fixed by MAPREDUCE-1098). Other tasks do not wait for the deletion. The task that
> triggers the deletion should not wait for it either. TrackerDistributedCacheManager should
> do deleteLocalPath() asynchronously.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
