hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast
Date Tue, 11 Mar 2008 16:48:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577509#action_12577509 ]

Devaraj Das commented on HADOOP-2119:

bq. The only problem is that of the reducer-scheduling from the JT. The maps finish so fast
that the map load is always low and the reducers always start after the maps are done. Simple
tricks of increasing the number of task completion events, jetty threads etc might help but
won't provide a scalable solution. So it seems that tweaking the load logic in the JT, i.e. getNewTaskForTaskTracker(),
is the only way. 

The load logic appears to be there by design and is present in the existing codebase as well.
Since the maps are very small and complete very quickly (even before the next scheduled tasktracker
heartbeat), each tasktracker always reports countMapTasks() = 0 and is therefore always handed
another map task. Increasing the number of task completion events or the number of Jetty threads
will not help here, since the reducers are not even launched. If we decide to tweak the load logic,
it should be done as a separate Jira IMO.
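
To illustrate the effect described above, here is a minimal toy model. It is not the actual getNewTaskForTaskTracker() code, whose real load computation is more involved; the class name, the method names, the slot capacities, and the map-first rule are all assumptions made for illustration. It only shows that if a tracker reports zero running maps at every heartbeat, a rule that prefers maps keeps handing out maps, and reduces start only once no maps are pending.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a map-first assignment rule; hypothetical, not JobTracker code. */
public class LoadLogicSketch {

    /**
     * Hypothetical per-heartbeat decision: hand out a map while the tracker
     * looks under-loaded on maps, otherwise consider a reduce.
     */
    static String assign(int runningMaps, int mapCapacity, int pendingMaps,
                         int runningReduces, int reduceCapacity, int pendingReduces) {
        if (runningMaps < mapCapacity && pendingMaps > 0) {
            return "MAP";
        }
        if (runningReduces < reduceCapacity && pendingReduces > 0) {
            return "REDUCE";
        }
        return "NONE";
    }

    /** Simulates one tracker whose maps always finish before the next heartbeat. */
    static List<String> simulate(int pendingMaps, int pendingReduces) {
        List<String> assignments = new ArrayList<>();
        int runningReduces = 0;
        while (pendingMaps > 0 || pendingReduces > 0) {
            // The previously assigned map has already completed, so the
            // tracker reports zero running maps on every heartbeat.
            int reportedRunningMaps = 0;
            String task = assign(reportedRunningMaps, 2, pendingMaps,
                                 runningReduces, 2, pendingReduces);
            if (task.equals("MAP")) {
                pendingMaps--;
            } else if (task.equals("REDUCE")) {
                pendingReduces--;
                runningReduces++; // reduces are assumed to keep running
            } else {
                break; // no free slot for the remaining work
            }
            assignments.add(task);
        }
        return assignments;
    }

    public static void main(String[] args) {
        // Every map is handed out before the first reduce.
        System.out.println(simulate(4, 2));
    }
}
```

In this sketch, simulate(4, 2) assigns all four maps before the first reduce, which is the ordering the comment describes.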

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The jobtracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept getting bigger and bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker was running on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01g (reaching the heap space limit).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
