hadoop-mapreduce-issues mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1213) TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
Date Tue, 15 Dec 2009 20:06:18 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated MAPREDUCE-1213:
----------------------------------

    Attachment: MAPREDUCE-1213.4.patch

Changed the function name to moveAndDeleteFromEachVolume.

The name AsyncDelete could be misleading: it suggests that users might still see the files when
the function returns. This code actually moves the files first, so only the deletion itself is deferred.
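
The move-then-delete idea can be sketched as follows. This is a minimal illustration and not the code from the attached patch: the class name, the moveAndDelete helper, the trash directory, and the single-threaded cleaner are all hypothetical, and the sketch handles one volume only, assuming the trash directory lives on the same volume so that the rename is cheap.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of "move first, delete later"; not the MAPREDUCE-1213 patch.
class MoveAndDeleteSketch {
    // Single daemon thread that performs the slow recursive deletes.
    private static final ExecutorService CLEANER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "trash-cleaner");
        t.setDaemon(true);
        return t;
    });

    // Renames dir into trashRoot (assumed to be on the same volume), then
    // schedules the recursive delete. When this method returns, dir no longer
    // exists at its original path, even though its contents may still be on disk.
    static Future<?> moveAndDelete(Path dir, Path trashRoot) throws IOException {
        Files.createDirectories(trashRoot);
        Path target = trashRoot.resolve(dir.getFileName() + "." + System.nanoTime());
        Files.move(dir, target, StandardCopyOption.ATOMIC_MOVE);
        return CLEANER.submit(() -> {
            deleteRecursively(target);
            return null;
        });
    }

    // Depth-first delete: files first, then each directory once it is empty.
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException e) throws IOException {
                if (e != null) {
                    throw e;
                }
                Files.delete(d);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("sketch");
        Path cache = Files.createDirectories(base.resolve("distcache"));
        Files.write(cache.resolve("part-0"), new byte[] {1, 2, 3});
        Future<?> pending = moveAndDelete(cache, base.resolve("trash"));
        System.out.println("visible after return: " + Files.exists(cache));
        pending.get(); // wait for the background delete to finish
    }
}
```

The caller only pays for one rename per directory, which is why startup no longer has to wait for the recursive delete. The returned Future is there so a caller (or a test) can wait for the cleanup if it needs to.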


> TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1213
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1213
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: dhruba borthakur
>            Assignee: Zheng Shao
>         Attachments: MAPREDUCE-1213.1.patch, MAPREDUCE-1213.2.patch, MAPREDUCE-1213.3.patch, MAPREDUCE-1213.4.patch
>
>
> We are seeing that when we restart a TaskTracker, it tries to recursively delete all the files
> in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This means that
> the TaskTracker cannot join the cluster for an extended period of time (up to 2 hours for us).
> The problem is acute when the distributed cache contains a few thousand files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

