hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2119) JobTracker becomes non-responsive if the task trackers finish task too fast
Date Mon, 17 Mar 2008 08:09:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579343#action_12579343 ]

Amar Kamat commented on HADOOP-2119:

The attached patch does the following:
Maps :
1) Replaces {{ArrayList}} with {{LinkedList}} for the currently used caches (call them the *NR* caches).
2) Failed TIPs are added (where possible) to the front of the *NR* caches. [for fail-early]
3) Removal of a TIP from the *NR* caches is on demand, i.e. running/non-runnable TIPs are removed
while searching for a new TIP.
4) Maintains a new set of caches, called *R* caches, for running TIPs. These caches are similar
to the *NR* caches but provide faster removal. Additions to the caches are appends. Removal is
one shot, i.e. a non-running TIP is removed from all the *R* caches at once.
[for speculation]
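
A minimal Java sketch of the *NR* cache behaviour described above (append new TIPs, push failed TIPs to the front, drop stale entries lazily during a search). The {{Tip}} and {{NonRunningCache}} classes and their fields are hypothetical and are not taken from the patch. Removing the chosen TIP from the cache being searched is a simplification made here.

```java
import java.util.Iterator;
import java.util.LinkedList;

/** Hypothetical stand-in for a task-in-progress (TIP); not Hadoop's TaskInProgress. */
class Tip {
    final String id;
    boolean running;          // currently scheduled on some tracker
    boolean runnable = true;  // false once the TIP has completed or been killed

    Tip(String id) { this.id = id; }
}

/** Sketch of one non-running (NR) cache, e.g. the list kept for a single node. */
class NonRunningCache {
    private final LinkedList<Tip> tips = new LinkedList<>();

    /** New TIPs are appended at the end. */
    void add(Tip tip) { tips.addLast(tip); }

    /** A failed TIP goes to the front so that it is retried first (fail-early). */
    void addFailed(Tip tip) {
        tip.running = false;
        tips.addFirst(tip);
    }

    /**
     * Returns the first schedulable TIP, or null if there is none.
     * Entries that are already running or no longer runnable are removed
     * on demand while scanning; Iterator.remove() on a LinkedList is O(1).
     */
    Tip findNew() {
        Iterator<Tip> it = tips.iterator();
        while (it.hasNext()) {
            Tip tip = it.next();
            it.remove();  // entry is either stale or about to start running
            if (tip.runnable && !tip.running) {
                tip.running = true;
                return tip;
            }
        }
        return null;
    }

    int size() { return tips.size(); }
}
```

A TIP may sit in several such caches (one per node where it is local). Once one cache hands it out, the copies in the other caches are simply skipped and dropped the next time those caches are searched, so no eager cross-cache removal is needed.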

Reduces :
1) Maintains a {{LinkedList}} of non-running reducers, i.e. the *NR* cache. [for non-running tasks]
2) Failed reducers are added to the front of the *NR* cache. [for fail-early]
3) Maintains a set of running reducers with faster removal capability. [for speculation]

1) Search preference is as follows: {{FAILED}}, {{NON-RUNNING}}, {{RUNNING}}.
2) Search order is as follows:
    1. Search the local cache, i.e. strong locality.
    2. Search bottom-up (i.e. from the node's parent to the node's top-level ancestor) for
a TIP, i.e. weak locality.
    3. Search breadth-wise across top-level ancestors for a TIP, i.e. for a non-local TIP.
3) Introduces a _default-node_. TIPs that are not local to any node are local to the
_default-node_. This node takes care of random-writer-like cases, i.e. it adapts such
cases to the cache structure. _default-node_ belongs to _default-rack_, and hence all
the nodes share the non-local TIPs through _default-rack_.
4) The JobTracker need not be synchronized for providing reports to the JobClient, and hence
these APIs don't lock the JT. Some staleness is okay.
5) Commits are now done in batches, and each batch takes a fixed number of tasks. The default
is 5000, so 5000 tasks are batch committed at a time. The reason for this 'fixed
sized batching' is that committing too many TIPs in one go locks the JobTracker for a very
long duration, causing *lost rpc/tracker* issues.
6) TIPs use the tracker's hostname instead of the tracker name for maintaining the list of machines
where the TIP failed.
7) One major bottleneck we observed was in {{JobInProgress.isJobComplete()}}, where all
the TIPs were scanned. This is costly since {{isJobComplete()}} is called once for every completed/failed
task (via the {{TaskCommit}} thread), and it gets worse with a large number of maps. Now this
check is done using the counts of finished TIPs.
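
Items 5 and 7 above can be illustrated with a small Java sketch: commits are drained in fixed-size batches so that the lock is held only briefly, and job completion is decided from a counter rather than a scan over all TIPs. The {{JobSketch}} and {{BatchCommitter}} classes, their fields, and the {{trackerLock}} object are hypothetical and only stand in for the real JobTracker code. The patch's default batch size is 5000.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical job state; names are illustrative, not Hadoop's JobInProgress. */
class JobSketch {
    private final int totalMaps;
    private int finishedMaps;  // bumped once per committed task

    JobSketch(int totalMaps) { this.totalMaps = totalMaps; }

    synchronized void commit(String taskId) { finishedMaps++; }

    /** O(1) check based on a count, instead of scanning every TIP. */
    synchronized boolean isJobComplete() { return finishedMaps == totalMaps; }
}

/** Commits queued tasks in fixed-size batches. */
class BatchCommitter {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Object trackerLock = new Object();  // stands in for the JobTracker lock
    private final JobSketch job;
    private final int batchSize;

    BatchCommitter(JobSketch job, int batchSize) {
        this.job = job;
        this.batchSize = batchSize;
    }

    void enqueue(String taskId) {
        synchronized (pending) { pending.add(taskId); }
    }

    /** Commits up to batchSize pending tasks and returns how many were committed. */
    int commitOneBatch() {
        List<String> batch = new ArrayList<>();
        synchronized (pending) {
            while (batch.size() < batchSize && !pending.isEmpty()) {
                batch.add(pending.poll());
            }
        }
        // The big lock is held only for one bounded batch, so RPC handlers
        // waiting on it are not starved for an unbounded time.
        synchronized (trackerLock) {
            for (String taskId : batch) {
                job.commit(taskId);
            }
        }
        return batch.size();
    }
}
```

With 12 queued tasks and a batch size of 5, three calls to {{commitOneBatch()}} commit 5, 5 and 2 tasks, and {{isJobComplete()}} becomes true only after the third call.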

> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>                 Key: HADOOP-2119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2119
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Critical
>             Fix For: 0.17.0
>         Attachments: HADOOP-2119-v4.1.patch, hadoop-2119.patch, hadoop-jobtracker-thread-dump.txt
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The jobtracker lagged behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept getting bigger and bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress is reported afterwards.
> Job tracker is running on a separate node.
> The job tracker process consumed 100% cpu, with vm size 1.01g (reaching the heap space limit).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
