From: "Amar Kamat (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Thu, 15 Jan 2009 02:51:59 -0800 (PST)
Message-ID: <1555204519.1232016719924.JavaMail.jira@brutus>
In-Reply-To: <1471723864.1228347704179.JavaMail.jira@brutus>
Subject: [jira] Commented: (HADOOP-4766) Hadoop performance degrades significantly as more and more jobs complete

    [ https://issues.apache.org/jira/browse/HADOOP-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664085#action_12664085 ]

Amar Kamat commented on HADOOP-4766:
------------------------------------

@koji: Using tasks as a unit of memory usage is very tricky. Ideally we would need a memory model that will help us derive memory requirements per task/tip/job etc. Until we have a memory model in place, I think it's better to go with the current solution, as we only care about the overall memory used.

@Sharad: Using soft references might be a better solution and might work well, but it would be a major change to the framework and should probably be filed as an improvement. Since this issue is more of a bug fix, I think we should go ahead and use the current approach. The memory bottleneck in the JobTracker is tracked separately in HADOOP-4974.

@Arun: I am waiting for your input on the comments made [here|https://issues.apache.org/jira/browse/HADOOP-4766?focusedCommentId=12663114#action_12663114].
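For reference, a minimal sketch of the soft-reference idea under discussion, assuming a simple map keyed by job id. The CompletedJobCache class and its method names are hypothetical illustrations, not the JobTracker's actual data structures:

{code:java}
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: cache completed jobs behind soft references so
// the GC may reclaim them under heap pressure instead of thrashing.
public class CompletedJobCache<K, V> {
  // The map holds soft references; the referents can be cleared by the
  // GC at any time, so retained jobs never pin the JobTracker's heap.
  private final Map<K, SoftReference<V>> cache =
      new LinkedHashMap<K, SoftReference<V>>();

  public synchronized void put(K jobId, V job) {
    cache.put(jobId, new SoftReference<V>(job));
  }

  public synchronized V get(K jobId) {
    SoftReference<V> ref = cache.get(jobId);
    if (ref == null) {
      return null;          // job was never cached
    }
    V job = ref.get();
    if (job == null) {
      cache.remove(jobId);  // referent was reclaimed by the GC
    }
    return job;
  }
}
{code}

The trade-off is that a lookup can miss after the GC has cleared a reference, so callers would need a fallback path (e.g. reloading job history from disk), which is part of why this is better treated as a separate improvement than as this bug fix.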
> Hadoop performance degrades significantly as more and more jobs complete
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-4766
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4766
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.2, 0.19.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: HADOOP-4766-v1.patch, HADOOP-4766-v2.10.patch, HADOOP-4766-v2.4.patch, HADOOP-4766-v2.6.patch, HADOOP-4766-v2.7-0.18.patch, HADOOP-4766-v2.7-0.19.patch, HADOOP-4766-v2.7.patch, HADOOP-4766-v2.8-0.18.patch, HADOOP-4766-v2.8-0.19.patch, HADOOP-4766-v2.8.patch, map_scheduling_rate.txt
>
>
> When I ran the gridmix 2 benchmark load on a fresh cluster of 500 nodes with hadoop trunk,
> the gridmix load, consisting of 202 map/reduce jobs of various sizes, completed in 32 minutes.
> Then I ran the same set of jobs on the same cluster; they completed in 43 minutes.
> When I ran them a third time, it took (almost) forever: the JobTracker became non-responsive.
> The JobTracker's heap size was set to 2GB.
> The cluster is configured to keep up to 500 jobs in memory.
> The JobTracker kept one CPU busy all the time; it looks like this was due to GC.
> I believe releases 0.18 and 0.19 show the same behavior.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.