hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1568) TrackerDistributedCacheManager should clean up cache in a background thread
Date Wed, 28 Apr 2010 12:37:35 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861789#action_12861789 ]

Amareshwari Sriramadasu commented on MAPREDUCE-1568:

bq. I also notice this in the code. It is guarded by cachedArchives everywhere in the code. But why locking the individual cachestatus is not enough?

This was the decision made in MAPREDUCE-1098, when we separated the global lock from the
individual cacheStatus lock to avoid a race between deleteCache and getLocalCache.
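
As a rough illustration of that split (a sketch only, not the actual TrackerDistributedCacheManager
code; the class LockSplitSketch, the Entry type, and the method names acquire, release and
deleteUnused are all hypothetical), the global lock on the map can cover lookup and removal,
while the per-entry lock covers the reference count:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: names and fields are illustrative, not Hadoop's.
class LockSplitSketch {
    // Stand-in for a per-cache status object.
    static class Entry {
        int refCount; // guarded by this Entry's own monitor
    }

    // Guarded by its own monitor (the "global" lock).
    private final Map<String, Entry> cachedArchives = new HashMap<>();

    // Analogue of getLocalCache: look up or create the entry under the
    // global lock, then bump its reference count under the entry lock.
    Entry acquire(String key) {
        synchronized (cachedArchives) {
            Entry e = cachedArchives.computeIfAbsent(key, k -> new Entry());
            synchronized (e) {
                e.refCount++;
            }
            return e;
        }
    }

    // Releasing a reference only needs the entry lock.
    void release(Entry e) {
        synchronized (e) {
            e.refCount--;
        }
    }

    // Analogue of deleteCache: remove only entries nobody references.
    // Returns the number of entries removed.
    int deleteUnused() {
        int removed = 0;
        synchronized (cachedArchives) {
            Iterator<Entry> it = cachedArchives.values().iterator();
            while (it.hasNext()) {
                Entry e = it.next();
                synchronized (e) {
                    if (e.refCount == 0) {
                        it.remove();
                        removed++;
                    }
                }
            }
        }
        return removed;
    }

    int size() {
        synchronized (cachedArchives) {
            return cachedArchives.size();
        }
    }
}
```

In this sketch, the entry lock alone would not be enough: without the global lock, deleteUnused
could drop an entry from the map between the lookup and the refCount increment in acquire.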

Looked at the patch. I just have one comment:
In the test case, checkCacheDeletion waits for 5 minutes, which seems very high. It might result
in test timeouts, because checkCacheDeletion is called twice in the test. I would say a 30-second
wait is good enough for failing the test, because the thread sleep time is set to 100 milliseconds.
To be strict, even a 200 ms wait would be fine.
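
A bounded wait of this kind could look roughly like the helper below. This is a minimal sketch,
not the checkCacheDeletion method from the patch; the class WaitSketch, the method waitFor, and
its parameters are hypothetical names chosen for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper; not the checkCacheDeletion from the patch.
class WaitSketch {
    // Polls the condition every pollMillis until it holds or until
    // timeoutMillis has elapsed. Returns true if the condition held.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the caller fails the test
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }
        return true;
    }
}
```

With a cleanup thread that sleeps 100 milliseconds between passes, a call such as
waitFor(cacheIsDeleted, 30_000, 100) returns as soon as the deletion is observed and only
spends the full 30 seconds when the test is going to fail anyway (cacheIsDeleted is again a
hypothetical condition).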

Other changes in the patch look very good.

> TrackerDistributedCacheManager should clean up cache in a background thread
> ---------------------------------------------------------------------------
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1568-v2.1.txt, MAPREDUCE-1568-v2.txt, MAPREDUCE-1568-v3.txt,
> Right now the TrackerDistributedCacheManager does the cleanup through the following call chain:
> {code}
> TaskRunner.run() -> 
> TrackerDistributedCacheManager.setup() ->
> TrackerDistributedCacheManager.getLocalCache() -> 
> TrackerDistributedCacheManager.deleteCache()
> {code}
> The deletion of the cache files can take a long time, and it should not be done by a task.
> We suggest a separate thread that checks and cleans up the cache files.
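
The suggestion in the description can be sketched as a daemon thread that wakes up periodically
and runs one cleanup pass. This is an illustrative sketch under assumptions, not the class added
by the patch: CleanupThreadSketch, cleanupPass, and sleepMillis are hypothetical names, and the
actual deletion logic is left to the Runnable passed in.

```java
// Hypothetical sketch of a periodic cleanup thread; not the patch's class.
class CleanupThreadSketch {
    private final Thread thread;

    // cleanupPass: one scan-and-delete pass (assumed to be supplied by the caller).
    // sleepMillis: pause between passes.
    CleanupThreadSketch(Runnable cleanupPass, long sleepMillis) {
        thread = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(sleepMillis);
                    cleanupPass.run();
                } catch (InterruptedException e) {
                    return; // stop was requested while sleeping
                } catch (RuntimeException e) {
                    // Keep the thread alive after a failed pass;
                    // a real implementation would log the error here.
                }
            }
        }, "cache-cleanup-sketch");
        thread.setDaemon(true); // do not keep the JVM alive for cleanup
    }

    void start() {
        thread.start();
    }

    // Interrupts the thread and waits for it to exit.
    void stop() {
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape, the task path (TaskRunner.run() and the calls below it) no longer pays for
the deletion itself; it only needs to leave unreferenced entries for the next pass to find.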
