hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1568) TrackerDistributedCacheManager should clean up cache in a background thread
Date Tue, 27 Apr 2010 21:37:34 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Scott Chen updated MAPREDUCE-1568:

    Attachment: MAPREDUCE-1568-v3.txt

Made the change based on Amareshwari's suggestions.

Shall we rename the class BaseDir to something like BaseDirManager?
That's a very good name. I have changed it.

cacheStatus.baseDir is accessed under the global lock in the BaseDir.checkAndCleanup() method;
everywhere else it is accessed under the CacheStatus lock. I think there is no harm in that. Can you also
check this and add a comment?
I have made baseDir immutable.

Though not related to this issue, can you add a comment for CacheStatus.refCount saying it
should always be accessed under the global cachedArchives lock?
I have added comments on all fields in CacheStatus to explain how to access them.
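
The locking conventions discussed above can be illustrated with a minimal sketch. This is not the code from the attached patch: the class name CacheStatusSketch and the helper methods acquire, release and refCount are hypothetical, and only the field names baseDir, refCount and cachedArchives come from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the patch itself: shows one way to document
// and follow per-field locking rules.
class CacheStatusSketch {
    static final class CacheStatus {
        // Immutable after construction, so it can be read without holding any lock.
        final String baseDir;
        // Must only be read or written while holding the global cachedArchives lock.
        int refCount;

        CacheStatus(String baseDir) {
            this.baseDir = baseDir;
        }
    }

    // Global map; its monitor plays the role of the global lock in this sketch.
    static final Map<String, CacheStatus> cachedArchives = new HashMap<>();

    static CacheStatus acquire(String key, String baseDir) {
        synchronized (cachedArchives) {
            CacheStatus status =
                cachedArchives.computeIfAbsent(key, k -> new CacheStatus(baseDir));
            status.refCount++;
            return status;
        }
    }

    static void release(String key) {
        synchronized (cachedArchives) {
            CacheStatus status = cachedArchives.get(key);
            if (status != null && status.refCount > 0) {
                status.refCount--;
            }
        }
    }

    static int refCount(String key) {
        synchronized (cachedArchives) {
            CacheStatus status = cachedArchives.get(key);
            return status == null ? 0 : status.refCount;
        }
    }

    public static void main(String[] args) {
        CacheStatus status = acquire("archive-1", "/tmp/cache-a");
        acquire("archive-1", "/tmp/cache-a");
        release("archive-1");
        System.out.println(status.baseDir + " refCount=" + refCount("archive-1"));
    }
}
```

Because baseDir is final, it does not matter which lock a reader happens to hold, which is the concern raised about checkAndCleanup().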

catch(InterruptedException) block in CleanupThread can do a return instead of ignoring the
I think the cleanup thread should catch all Exceptions (not just IOException) because the thread should
never crash, similar to CleanupQueue.
Now it catches all exceptions, logs them, and keeps running.
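
A minimal sketch of the loop behavior described above, assuming a periodic cleanup pass. It is not the code from the attached patch: the CleanupTask interface, the intervalMs parameter and the failures counter are hypothetical names introduced for illustration. Returning on InterruptedException follows the reviewer's suggestion.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the patch itself.
class CleanupThreadSketch {
    /** Stand-in for the real cleanup work (an assumption for this sketch). */
    interface CleanupTask {
        void checkAndCleanup() throws Exception;
    }

    static class CleanupThread extends Thread {
        private final CleanupTask task;
        private final long intervalMs;
        final AtomicInteger failures = new AtomicInteger();

        CleanupThread(CleanupTask task, long intervalMs) {
            this.task = task;
            this.intervalMs = intervalMs;
            setDaemon(true);
            setName("cache-cleanup-sketch");
        }

        @Override
        public void run() {
            while (true) {
                try {
                    Thread.sleep(intervalMs);
                    task.checkAndCleanup();
                } catch (InterruptedException e) {
                    // Interruption is treated as the shutdown signal: leave the loop.
                    return;
                } catch (Exception e) {
                    // Any other failure is logged and counted; the loop keeps running.
                    failures.incrementAndGet();
                    System.err.println("cleanup failed: " + e);
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        CleanupThread t = new CleanupThread(() -> {
            if (calls.incrementAndGet() == 1) {
                throw new IllegalStateException("simulated failure");
            }
        }, 10);
        t.start();
        Thread.sleep(200);
        t.interrupt();
        t.join();
        System.out.println("passes=" + calls.get() + " failures=" + t.failures.get());
    }
}
```

The InterruptedException clause comes first so that an interrupt ends the thread instead of being swallowed by the general handler.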

The test case can call baseDir.checkAndCleanup() inline to avoid timing issues. What do you think?
I feel that it is better to test the whole CleanupThread. That way we can make sure the thread
actually does the right thing.
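
One way to exercise a real background thread in a test while limiting timing sensitivity is to wait on an explicit completion signal with a timeout instead of sleeping for a fixed period. The sketch below only illustrates that idea and is not the test from the attached patch; the class and method names are hypothetical, and the "cleanup" is a single file deletion.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the patch's test.
class CleanupThreadTestSketch {
    /**
     * Deletes the file on a background thread and waits for it to finish.
     * Returns true only if the thread signalled completion within the
     * timeout and the file is gone.
     */
    static boolean deletedInBackground(Path file, long timeoutMs)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        Thread cleaner = new Thread(() -> {
            try {
                Files.deleteIfExists(file);
            } catch (IOException e) {
                System.err.println("delete failed: " + e);
            } finally {
                // Signal the waiting test whether or not the delete succeeded.
                done.countDown();
            }
        }, "cleanup-under-test");
        cleaner.setDaemon(true);
        cleaner.start();
        return done.await(timeoutMs, TimeUnit.MILLISECONDS) && !Files.exists(file);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("cache-sketch", ".tmp");
        System.out.println("deleted=" + deletedInBackground(file, 5000));
    }
}
```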

> TrackerDistributedCacheManager should clean up cache in a background thread
> ---------------------------------------------------------------------------
>                 Key: MAPREDUCE-1568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1568
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1568-v2.1.txt, MAPREDUCE-1568-v2.txt, MAPREDUCE-1568-v3.txt,
> Right now the TrackerDistributedCacheManager does the cleanup through the following call chain:
> {code}
> TaskRunner.run() -> 
> TrackerDistributedCacheManager.setup() ->
> TrackerDistributedCacheManager.getLocalCache() -> 
> TrackerDistributedCacheManager.deleteCache()
> {code}
> The deletion of the cache files can take a long time and it should not be done by a task.
> We suggest that there should be a separate thread that checks and cleans up the cache files.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
