hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast
Date Wed, 06 Feb 2008 20:07:08 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566309#action_12566309 ]

Arun C Murthy commented on HADOOP-2119:
---------------------------------------

bq. I think it makes most sense to go with Option 1 for now, as it's the easiest to implement
and makes the most common case run much faster. Options 3 and 4 need a fair bit of refactoring
and may be an overkill for now, since you can get the most bang for the buck by just making
sure that you don't scan the array from the beginning for virgin tasks.

Vivek, it's a fair analysis and I agree it will help in the short run.

However, I do believe this is a good time to start thinking about a better overall approach
- especially given that HADOOP-1985 (rack-aware Map-Reduce scheduling) is almost upon us ...

I had a brief chat with Owen about this, and we seem to favor different approaches.
I'll try to post my thoughts on a completely revamped design for the scheduling data structures
in the next few days for consideration.
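
For readers following along, the Option 1 idea quoted above (not rescanning the task array from the beginning when looking for virgin tasks) could look roughly like the sketch below. This is only an illustration under stated assumptions: the class name VirginTaskCursor, its methods, and the use of a BitSet are hypothetical and are not taken from the JobTracker code or from the attached patch.

```java
import java.util.BitSet;

// Hypothetical sketch of "do not rescan from index 0"; not actual Hadoop code.
final class VirginTaskCursor {
    private final BitSet started;    // bit i set => task i has already been handed out
    private final int numTasks;
    private int firstCandidate = 0;  // every index below this is known to be started

    VirginTaskCursor(int numTasks) {
        this.numTasks = numTasks;
        this.started = new BitSet(numTasks);
    }

    /** Records a task that was started by some other path (for example, a locality match). */
    void markStarted(int taskIndex) {
        started.set(taskIndex);
    }

    /** Returns the index of the next never-started task, or -1 if none remain. */
    int nextVirginTask() {
        // Resume the search at the cursor instead of at index 0.
        int i = started.nextClearBit(firstCandidate);
        if (i >= numTasks) {
            firstCandidate = numTasks;
            return -1;
        }
        started.set(i);
        firstCandidate = i + 1;
        return i;
    }
}
```

Because bits are only ever set and never cleared, the cursor can move forward permanently, so a run of repeated requests costs time proportional to the array length in total rather than per request. Tasks that fail and must be re-run are no longer virgin, so they would need to be tracked separately.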

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>
>         Attachments: hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The job tracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept growing.
> The job tracker eventually stopped responding to the web UI.
> No progress was reported afterwards.
> The job tracker was running on a separate node.
> The job tracker process consumed 100% CPU, with a VM size of 1.01g (it reached the heap space limit).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

