hadoop-mapreduce-issues mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1302) TrackerDistributedCacheManager can delete file asynchronously
Date Tue, 29 Dec 2009 20:39:29 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795143#action_12795143 ]

Zheng Shao commented on MAPREDUCE-1302:
---------------------------------------

The code didn't create a single task for toBeDeleted. Instead, it went through the toBeDeleted
directory and created one task for each entry.
The reasons for that are:
1. This allows the contents of toBeDeleted to be deleted in parallel.
2. A single list call per volume shouldn't take too long.
3. If we wanted to create a single task for toBeDeleted, we would need to rename it to something
else, recreate toBeDeleted, and then move the old one into the new toBeDeleted as a
subdirectory. This would introduce additional intermediate states that may be hard to recover
from.
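
To illustrate the per-entry approach described above, here is a rough sketch. It is not the code from the attached patches and does not use AsyncDiskService; it assumes a plain java.util.concurrent.ExecutorService as the worker pool, and the names ToBeDeletedSweep, scheduleDeletes and deleteRecursively are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch only; not the MAPREDUCE-1302 patch. */
final class ToBeDeletedSweep {

    /** Recursively deletes one entry (a file or a directory tree). */
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order so that children are removed before their parents.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /**
     * Lists toBeDeleted once and submits one deletion task per entry.
     * The toBeDeleted directory itself is never renamed or removed,
     * so no extra intermediate state is introduced.
     */
    static List<Future<?>> scheduleDeletes(Path toBeDeleted, ExecutorService pool)
            throws IOException {
        List<Future<?>> tasks = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(toBeDeleted)) {
            for (Path entry : entries) {
                // One task per entry, so entries can be deleted in parallel.
                tasks.add(pool.submit(() -> {
                    deleteRecursively(entry);
                    return null;
                }));
            }
        }
        return tasks;
    }
}
```

With a pool of several threads, the entries are removed concurrently, while the listing itself is a single call per directory. A caller that needs to know when the cleanup has finished can wait on the returned futures.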


> TrackerDistributedCacheManager can delete file asynchronously
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-1302
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1302
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.20.2, 0.21.0, 0.22.0
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: MAPREDUCE-1302.0.patch, MAPREDUCE-1302.1.patch, MAPREDUCE-1302.2.patch
>
>
> With the help of AsyncDiskService from MAPREDUCE-1213, we should be able to delete files
> from distributed cache asynchronously.
> That will help make task initialization faster, because task initialization calls the
> code that localizes files into the cache and may delete some other files.
> The deletion can slow down the task initialization speed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

