hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4018) limit memory usage in jobtracker
Date Mon, 08 Sep 2008 06:24:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629078#action_12629078 ]

Amar Kamat commented on HADOOP-4018:
------------------------------------

More comments
1) Computing allocated tasks from the internal data structures of {{JobInProgress}} is incorrect,
as their counts will always end up missing: {{garbageCollect()}} for the job frees up the memory
taken up by the caches. Note that {{garbageCollect()}} will be invoked before {{retireJob()}}.


I think we should use {{numMapTasks}} and {{numReduceTasks}} instead of the sizes of the
caches. Also reset the values of {{numMapTasks}}/{{numReduceTasks}} if the limit is crossed. If
someone later introduces partial expansion of tasks, we can think about it then.
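
Something along the following lines is what I have in mind for 1). The names here
({{totalAllocatedTasks}}, {{maxTasksLimit}}) and the placement of the check are just
placeholders to illustrate the counting, not the actual patch:
{code:java}
// count the new job's tasks from the declared totals, not from the caches
// that garbageCollect() later empties
int jobTasks = job.numMapTasks + job.numReduceTasks;
if (totalAllocatedTasks + jobTasks > maxTasksLimit) {
  // reset the counters so a rejected job is never charged against the tracker
  job.numMapTasks = 0;
  job.numReduceTasks = 0;
  // reject the job here
} else {
  totalAllocatedTasks += jobTasks;
  // on garbageCollect()/retireJob(), subtract the same jobTasks value back
}
{code}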

2) I am not sure if changing the log level of just one class is sufficient. It would be nice if
we could set the log level for the complete framework in the testcase. But then again, is that required?
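
If we do want it, one option (just a sketch, using plain log4j) is to raise the level for the
whole {{org.apache.hadoop.mapred}} package from the test's setup instead of a single class:
{code:java}
// raise logging for the entire mapred package in the test setup
org.apache.log4j.Logger.getLogger("org.apache.hadoop.mapred")
    .setLevel(org.apache.log4j.Level.ALL);
{code}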

3) I did not check the test case completely, but it should cover the following cases:
    - submit multiple jobs such that all of them are accommodated, both while the previous
ones are still running and after the previous jobs have been cleaned up.
    - submit some small jobs and then a large job that exceeds the limit. Submit a small job
after the limit-crossing job and make sure that it gets accepted. This tests that the cleanup
is done and that a large job being submitted and rejected leaves no side effects.
    - submit two jobs back to back such that {{job1.totalTasks() + job2.totalTasks() >
limit}}, so the first one should be accepted and the second one rejected (a rough sketch
follows this list).
    - anything else?
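
A rough sketch of that back-to-back case, assuming the limit was set to 10 in the JobTracker's
conf when the mini cluster was started; {{submitSleepJob()}} and {{assertRejected()}} are
placeholders for whatever helpers the test already uses to submit jobs and detect rejection:
{code:java}
RunningJob job1 = submitSleepJob(6, 0);   // 6 maps, 0 reduces -> fits within 10, accepted
RunningJob job2 = submitSleepJob(6, 0);   // 6 + 6 > 10        -> must be rejected
assertNotNull(job1);
assertRejected(job2);
{code}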

4) I don't see where the test case sets {{mapred.jobtracker.completeuserjobs.maximum}}. By
default it will be 100, so the test case might never exercise the cleanup process.
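
For example, the test could set it very low in the JobConf used to bring up the JobTracker so
that retirement actually kicks in:
{code:java}
// force retirement after a single completed job per user so the
// cleanup path is actually exercised by the test
conf.setInt("mapred.jobtracker.completeuserjobs.maximum", 1);
{code}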

> limit memory usage in jobtracker
> --------------------------------
>
>                 Key: HADOOP-4018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4018
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: maxSplits.patch, maxSplits2.patch, maxSplits3.patch, maxSplits4.patch, maxSplits5.patch, maxSplits6.patch, maxSplits7.patch
>
>
> We have seen instances where a user submitted a job with many thousands of mappers. The JobTracker was running with a 3GB heap, but it was still not enough to prevent memory thrashing from garbage collection; effectively the JobTracker was not able to serve jobs and had to be restarted.
> One simple proposal would be to limit the maximum number of tasks per job. This can be a configurable parameter. Are there other things that eat huge globs of memory in the JobTracker?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

