hadoop-common-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5285) JobTracker hangs for long periods of time
Date Mon, 23 Feb 2009 15:18:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675920#action_12675920 ]

Hudson commented on HADOOP-5285:

Integrated in Hadoop-trunk #763 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/763/])
    Adding a file that I missed in my earlier commit.
    Fixes the following issues:
    (1) obtainTaskCleanupTask checks whether the job is inited before trying to lock the JobInProgress.
    (2) Moves the CleanupQueue class out of the TaskTracker and makes it a generic class that the JobTracker also uses for deleting paths on the job's output fs.
    (3) Moves the references to completedJobStore outside the block where the JobTracker is locked.
    Contributed by Devaraj Das.
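
The first fix can be illustrated with a minimal sketch. It is not the code from the attached patch: SketchJob and LockSketch are hypothetical stand-ins for JobInProgress and its callers, the latches merely simulate a slow DFS operation, and the volatile flag is an assumption about how the inited state might be read without taking the lock.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for JobInProgress; not the actual Hadoop class. */
class SketchJob {
    // volatile: callers can read the flag without holding the job lock
    private volatile boolean inited = false;

    /** Holds the job lock for the whole (possibly slow) initialization. */
    synchronized void initTasks(CountDownLatch started, CountDownLatch dfsDone)
            throws InterruptedException {
        started.countDown();  // signal that the lock is now held
        dfsDone.await();      // stands in for slow DFS operations
        inited = true;
    }

    /** Returns null at once for an uninitialized job instead of blocking. */
    String obtainTaskCleanupTask() {
        if (!inited) {
            return null;      // checked before any attempt to take the lock
        }
        synchronized (this) {
            return "cleanup-task";
        }
    }
}

class LockSketch {
    /** Returns the results seen during and after initialization. */
    static String[] demo() {
        SketchJob job = new SketchJob();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch dfsDone = new CountDownLatch(1);
        Thread init = new Thread(() -> {
            try {
                job.initTasks(started, dfsDone);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            init.start();
            started.await();                              // init thread holds the lock
            String during = job.obtainTaskCleanupTask();  // does not block
            dfsDone.countDown();                          // let initialization finish
            init.join();
            String after = job.obtainTaskCleanupTask();
            return new String[] { during, after };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("during init: " + r[0] + ", after init: " + r[1]);
    }
}
```

Without the early check, the first call would block on the job lock until initTasks returned, which is the kind of wait the stack trace in this issue shows.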

> JobTracker hangs for long periods of time
> -----------------------------------------
>                 Key: HADOOP-5285
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5285
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Vinod K V
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.20.0, 0.21.0
>         Attachments: 5285.1.patch, 5285.patch, trace.txt
> On one of the larger clusters, with 2000 nodes, the JT hung quite often, sometimes for
> on the order of 10-15 minutes and once for one and a half hours(!). The stack trace shows
> that JobInProgress.obtainTaskCleanupTask() is waiting for the lock on the JobInProgress object,
> which JobInProgress.initTasks() holds for a long time while waiting for DFS operations.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
