hadoop-mapreduce-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1213) TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
Date Fri, 13 Nov 2009 11:47:40 GMT
TaskTrackers restart is very slow because it deletes distributed cache directory synchronously
----------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-1213
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1213
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.1
            Reporter: dhruba borthakur


We are seeing that when we restart a TaskTracker, it tries to recursively delete all the files
in the distributed cache. It invokes FileUtil.fullyDelete(), which is very slow. This
means that the TaskTracker cannot join the cluster for an extended period of time (up to 2
hours for us). The problem is acute when the distributed cache holds a few thousand files.
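
One possible way to take the delete off the startup path is sketched below. This is
only an illustration, not a patch for this issue and not TaskTracker code: it uses
plain java.nio rather than Hadoop's FileUtil, and the class name AsyncDirectoryCleaner,
the method clearAsync, and the ".toBeDeleted." suffix are all hypothetical. The idea is
to rename the old cache directory aside (a cheap operation on the same filesystem),
recreate it empty so startup can continue, and delete the renamed copy on a
background thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: illustrates rename-then-delete-in-background.
public class AsyncDirectoryCleaner {

    // Single daemon thread so a pending delete never blocks JVM exit.
    private final ExecutorService deleter = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "async-dir-cleaner");
        t.setDaemon(true);
        return t;
    });

    /**
     * Moves dir aside, recreates it empty, and schedules the recursive
     * delete of the moved copy. Returns a Future that completes when the
     * background delete has finished.
     */
    public Future<?> clearAsync(Path dir) {
        try {
            // Assumed naming scheme for the to-be-deleted sibling directory.
            Path trash = dir.resolveSibling(
                    dir.getFileName() + ".toBeDeleted." + System.nanoTime());
            // A rename within one filesystem does not touch the files inside.
            Files.move(dir, trash, StandardCopyOption.ATOMIC_MOVE);
            Files.createDirectories(dir);
            return deleter.submit(() -> deleteRecursively(trash));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Depth-first delete: files first, then each directory once it is empty.
    static void deleteRecursively(Path root) {
        try {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path d, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc;
                    }
                    Files.delete(d);
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void shutdown() {
        deleter.shutdown();
    }
}
```

With this approach the caller gets an empty cache directory back as soon as the
rename completes, and the slow recursive delete no longer delays joining the
cluster. A real fix would also need to clean up leftover renamed directories
from a previous run that was interrupted before its delete finished.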

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

