hadoop-common-dev mailing list archives

From "Sharad Agarwal (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4766) Hadoop performance degrades significantly as more and more jobs complete
Date Thu, 15 Jan 2009 05:36:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664003#action_12664003 ]

Sharad Agarwal commented on HADOOP-4766:
----------------------------------------

OK, I looked into this in detail and discussed it with Devaraj. It seems that in finalize(), the
lock on the JobTracker needs to be acquired to clean up the data structures. As finalize() is called
by the GC's finalizer thread, it would block that thread until it gets the lock on the JobTracker. The lock
on the JobTracker is coarse-grained and we don't want to block the GC on it. So my earlier suggested
approach of using finalize() won't work out.
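
A minimal sketch of the blocking problem described above, not the actual JobTracker code. All
names here (FinalizerLockSketch, trackerLock, cleanupUnderTrackerLock) are hypothetical. An
ordinary thread stands in for the thread that would run finalize(), so that the behaviour is
deterministic and does not depend on when the GC runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class FinalizerLockSketch {
    // Hypothetical stand-in for the coarse-grained JobTracker lock.
    static final Object trackerLock = new Object();

    // Hypothetical cleanup that finalize() would have to perform under the lock.
    static void cleanupUnderTrackerLock(CountDownLatch done) {
        synchronized (trackerLock) {
            done.countDown();
        }
    }

    // Returns true only if the cleanup finished while the caller still held the lock.
    static boolean cleanupFinishedWhileLockHeld(long waitMillis) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        Thread standIn;
        boolean finished;
        synchronized (trackerLock) {
            // Simulates a long JobTracker operation holding the lock.
            standIn = new Thread(() -> cleanupUnderTrackerLock(done), "finalizer-stand-in");
            standIn.setDaemon(true);
            standIn.start();
            // The stand-in thread cannot enter its synchronized block yet.
            finished = done.await(waitMillis, TimeUnit.MILLISECONDS);
        }
        // Once the lock is released, the cleanup can proceed.
        standIn.join();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cleanup finished while lock held: "
                + cleanupFinishedWhileLockHeld(200));
    }
}
```

In this sketch the stand-in thread makes no progress for as long as the lock is held. If it were
the thread that runs finalize(), every other pending finalization would be stalled behind it too.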

> Hadoop performance degrades significantly as more and more jobs complete
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-4766
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4766
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.2, 0.19.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4766-v1.patch, HADOOP-4766-v2.10.patch, HADOOP-4766-v2.4.patch,
>                      HADOOP-4766-v2.6.patch, HADOOP-4766-v2.7-0.18.patch, HADOOP-4766-v2.7-0.19.patch,
>                      HADOOP-4766-v2.7.patch, HADOOP-4766-v2.8-0.18.patch, HADOOP-4766-v2.8-0.19.patch,
>                      HADOOP-4766-v2.8.patch, map_scheduling_rate.txt
>
>
> When I ran the gridmix 2 benchmark load on a fresh cluster of 500 nodes with hadoop trunk,
> the gridmix load, consisting of 202 map/reduce jobs of various sizes, completed in 32 minutes.
> Then I ran the same set of jobs on the same cluster, and they completed in 43 minutes.
> When I ran them a third time, it took (almost) forever: the job tracker became non-responsive.
> The job tracker's heap size was set to 2GB.
> The cluster is configured to keep up to 500 jobs in memory.
> The job tracker kept one CPU busy all the time. It looks like this was due to GC.
> I believe releases 0.18 and 0.19 have similar behavior.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

