hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1568) TrackerDistributedCacheManager should clean up cache in a background thread
Date Tue, 27 Apr 2010 10:23:34 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861350#action_12861350 ]

Amareshwari Sriramadasu commented on MAPREDUCE-1568:

Functionally looks good. Some comments on the patch:
* Shall we rename the class BaseDir to something like BaseDirManager?
* cacheStatus.baseDir is accessed under the global lock in BaseDir.checkAndCleanup(), but
everywhere else it is accessed under the CacheStatus lock. I think this does no harm. Can you
double-check this and add a comment?
* Though not related to this issue, can you add a comment for CacheStatus.refCount saying that
it should always be accessed under the global cachedArchives lock?
* The catch(InterruptedException) block in CleanupThread can return instead of ignoring the
exception.
* I think the cleanup thread should catch all Exceptions (not just IOException) because the
thread should never crash, similar to CleanupQueue.
* The test case can call baseDir.checkAndCleanup() inline to avoid timing issues. What do you
think?
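
To make the last three points concrete, here is a minimal sketch of how the loop could look. It is not the code from the patch: BaseDirSketch, CleanupThreadSketch, the runs counter, the failFirst flag and the periodMs parameter are all hypothetical names invented for illustration. Only the method name checkAndCleanup() and the idea of a cleanup thread come from the review.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the patch's BaseDir; only the method name is from the review. */
class BaseDirSketch {
    /** Counts cleanup passes so a test can observe progress. */
    final AtomicInteger runs = new AtomicInteger();
    private final boolean failFirst;

    BaseDirSketch(boolean failFirst) {
        this.failFirst = failFirst;
    }

    /** One cleanup pass. It can be called inline from a test, with no thread involved. */
    void checkAndCleanup() throws IOException {
        int n = runs.incrementAndGet();
        if (failFirst && n == 1) {
            // Simulates an unexpected unchecked failure (not an IOException).
            throw new IllegalStateException("simulated unexpected failure");
        }
        // A real pass would delete unreferenced cache directories here.
    }
}

/** Hypothetical background thread that runs a cleanup pass periodically. */
class CleanupThreadSketch extends Thread {
    private final BaseDirSketch baseDir;
    private final long periodMs;

    CleanupThreadSketch(BaseDirSketch baseDir, long periodMs) {
        this.baseDir = baseDir;
        this.periodMs = periodMs;
        setDaemon(true);
    }

    @Override
    public void run() {
        while (true) {
            try {
                Thread.sleep(periodMs);
                baseDir.checkAndCleanup();
            } catch (InterruptedException e) {
                // Stop the thread instead of swallowing the interrupt.
                return;
            } catch (Exception e) {
                // Catch everything, not just IOException, so one bad pass
                // does not kill the thread; report it and keep looping.
                System.err.println("Cleanup pass failed: " + e);
            }
        }
    }
}
```

With this shape a test does not need to start the thread or sleep at all: it can construct the object and call checkAndCleanup() directly, then assert on the resulting state.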

> TrackerDistributedCacheManager should clean up cache in a background thread
> ---------------------------------------------------------------------------
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1568-v2.1.txt, MAPREDUCE-1568-v2.txt, MAPREDUCE-1568.txt
> Right now the TrackerDistributedCacheManager does the cleanup through the following call chain:
> {code}
> TaskRunner.run() -> 
> TrackerDistributedCacheManager.setup() ->
> TrackerDistributedCacheManager.getLocalCache() -> 
> TrackerDistributedCacheManager.deleteCache()
> {code}
> The deletion of the cache files can take a long time, so it should not be done by a task.
> We suggest a separate thread that checks and cleans up the cache files.
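
One way to picture the proposed split, as a rough sketch only: the task path just takes and releases references, and a background cleaner later collects whatever is unreferenced. CacheRegistrySketch, Status, acquire, release and collectUnreferenced are hypothetical names, not the TrackerDistributedCacheManager API. The one idea borrowed from the discussion is that refCount is only touched while holding a single global lock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical registry of localized cache entries, keyed by local path. */
class CacheRegistrySketch {
    /** Hypothetical per-entry state. */
    static final class Status {
        int refCount; // only read or written while holding the lock on 'entries'
    }

    private final Map<String, Status> entries = new HashMap<>();

    /** Task path: take a reference. Cheap, no file system work. */
    void acquire(String path) {
        synchronized (entries) {
            entries.computeIfAbsent(path, p -> new Status()).refCount++;
        }
    }

    /** Task path: drop a reference. Nothing is deleted here. */
    void release(String path) {
        synchronized (entries) {
            Status s = entries.get(path);
            if (s != null && s.refCount > 0) {
                s.refCount--;
            }
        }
    }

    /**
     * Cleaner path: unregister unreferenced entries under the lock and return
     * their paths, so the slow file deletion can happen outside the lock.
     */
    List<String> collectUnreferenced() {
        List<String> toDelete = new ArrayList<>();
        synchronized (entries) {
            Iterator<Map.Entry<String, Status>> it = entries.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Status> e = it.next();
                if (e.getValue().refCount == 0) {
                    toDelete.add(e.getKey());
                    it.remove();
                }
            }
        }
        return toDelete;
    }
}
```

The design choice here is that a task never waits on a delete: it only holds the lock long enough to adjust a counter, and the cleaner deletes the returned paths on its own schedule.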

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.