hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3546) TaskTracker re-initialization gets stuck in cleaning up
Date Tue, 17 Jun 2008 04:35:46 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-3546:
--------------------------------

    Status: Open  (was: Patch Available)

There is a race condition in the cleanup thread that can prevent it from ever exiting: if
the main thread sends the interrupt just as the cleanup thread is about to call
tasksToCleanup.take(), the interrupt is lost and the cleanup thread stays blocked in take()
forever. Although we could handle the problem by introducing additional synchronization,
I'd suggest that we remove the join on the threads and instead run them as daemons.
I am nervous about adding a lot of synchronization code just to handle the case where files
are left over on TaskTracker exit. Since the TaskTracker does the cleanup at startup anyway,
we should be OK.
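
To illustrate the suggestion above, here is a minimal sketch of a cleanup thread that runs as a daemon and is never joined. It is not taken from patch-3546.txt: the class name, the String element type, and startCleanupThread() are hypothetical, and only the tasksToCleanup queue and the take() call come from the comment.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; not the actual TaskTracker code.
class DaemonCleanupSketch {
    // Stand-in for the TaskTracker's tasksToCleanup queue (element type assumed).
    static final BlockingQueue<String> tasksToCleanup = new LinkedBlockingQueue<>();

    // Starts the cleanup loop on a daemon thread and returns the thread.
    static Thread startCleanupThread() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    // Blocks until there is something to clean up.
                    String dir = tasksToCleanup.take();
                    System.out.println("cleaning up " + dir);
                }
            } catch (InterruptedException e) {
                // Interrupted while waiting: leave the loop and let the thread end.
            }
        }, "taskCleanup");
        // A daemon thread does not keep the JVM alive, so the caller
        // does not need to join it on shutdown or re-initialization.
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        startCleanupThread();
        tasksToCleanup.put("/tmp/example-task-dir"); // hypothetical path
        Thread.sleep(100); // give the daemon a moment to process the entry
        // No join(): main returns and the JVM exits even though the
        // cleanup thread is still blocked in take().
    }
}
```

The trade-off is the one already described: any directory still queued when the process exits is left on disk, and the sketch relies on the startup cleanup to remove it.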

> TaskTracker re-initialization gets stuck in cleaning up
> -------------------------------------------------------
>
>                 Key: HADOOP-3546
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3546
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: patch-3546.txt
>
>
> If the TaskTracker receives a reinit action, it gets stuck joining the task cleanup thread.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

