hadoop-mapreduce-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
Date Thu, 07 Nov 2013 18:11:18 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816192#comment-13816192 ]

Chris Nauroth commented on MAPREDUCE-5508:
------------------------------------------

bq. Hope this fix is committed in branch-1, please share the revision of the commit.

http://svn.apache.org/viewvc?view=revision&revision=1497962
http://svn.apache.org/viewvc?view=revision&revision=1499904
http://svn.apache.org/viewvc?view=revision&revision=1525774

bq. I have noticed that heap size of Jobtracker is gradually increasing after the upgrade
also.

Observing heap size alone isn't sufficient to confirm or deny that the fix is in place.
It's natural for the JVM to grow the heap as needed.  Incremental garbage collection reclaims
that space gradually, and a full GC eventually reclaims everything that is no longer reachable.
The timing of all this is too unpredictable for heap size to confirm or deny the memory leak.
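
To illustrate why a raw heap number is a poor signal, here is a minimal sketch that reads the
JVM's own heap counters before and after some throwaway allocation.  The class and method names
({{HeapSnapshotSketch}}, {{snapshot}}, {{churn}}) are hypothetical and exist only for this
example; the numbers printed depend on the JVM, the collector, and timing, so none are shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only; not part of Hadoop.
final class HeapSnapshotSketch {

    // Returns {usedBytes, committedBytes} for the heap at this instant.
    static long[] snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted() };
    }

    // Allocates short-lived garbage; nothing here is retained, so nothing leaks.
    static void churn(int rounds) {
        for (int i = 0; i < rounds; i++) {
            byte[] scratch = new byte[1 << 20];
            scratch[0] = 1;
        }
    }

    private static void print(String label, long[] s) {
        System.out.println(label + ": used=" + s[0] + " committed=" + s[1]);
    }

    public static void main(String[] args) {
        print("before churn", snapshot());
        churn(200);
        // "used" may be higher here even though no object is leaked.
        print("after churn ", snapshot());
        System.gc(); // only a request; the JVM decides whether and when to collect
        print("after gc    ", snapshot());
    }
}
```

The used and committed figures can rise and fall between runs with no leak at all, which is why
a heap dump is the more reliable evidence.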

We confirmed this fix by running various MapReduce workloads in a controlled environment,
running jmap on the JobTracker process to take a heap dump, and then viewing the dump with
jhat.  When the memory leak happens, you end up seeing {{DistributedFileSystem}} instances
that are only referenced from the internal {{HashMap}} of the {{FileSystem#Cache}}.  (With
no other reference to the instance, no one is ever going to close it, and therefore
it will never get removed from the cache.)  With all of these patches applied, we see that every
{{DistributedFileSystem}} instance is referenced from the {{FileSystem#Cache}} and also from
at least one other referrer.
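
The leak pattern described above can be sketched with a simplified model.  This is not Hadoop's
code: {{CacheLeakSketch}}, {{User}}, {{Key}}, and {{Fs}} are hypothetical stand-ins, and the
sketch assumes that each job obtains its filesystem under a distinct user identity that is
compared by object identity, so every job adds a new cache entry.  The only way an entry leaves
the map in this model is through {{close()}}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified model of a filesystem cache; all names are hypothetical.
final class CacheLeakSketch {

    // Stand-in for a per-job user identity; equality is object identity (assumption).
    static final class User {
        final String name;
        User(String name) { this.name = name; }
    }

    // Cache key: scheme, authority, and the user identity.
    static final class Key {
        final String scheme;
        final String authority;
        final User user;

        Key(String scheme, String authority, User user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return scheme.equals(k.scheme) && authority.equals(k.authority) && user == k.user;
        }

        @Override public int hashCode() {
            return 31 * Objects.hash(scheme, authority) + System.identityHashCode(user);
        }
    }

    // Stand-in for a cached filesystem handle; close() is the only removal path.
    static final class Fs implements AutoCloseable {
        private final Key key;
        Fs(Key key) { this.key = key; }
        @Override public void close() { CACHE.remove(key); }
    }

    private static final Map<Key, Fs> CACHE = new HashMap<>();

    static Fs get(String scheme, String authority, User user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), Fs::new);
    }

    // Simulates `jobs` jobs, each with a fresh identity; returns entries left in the cache.
    static int simulate(int jobs, boolean closeWhenDone) {
        CACHE.clear();
        for (int i = 0; i < jobs; i++) {
            Fs fs = get("hdfs", "namenode:8020", new User("jobuser"));
            if (closeWhenDone) {
                fs.close();
            }
            // Without close(), the map holds the only remaining reference to fs.
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left without close(): " + simulate(1000, false));
        System.out.println("entries left with close():    " + simulate(1000, true));
    }
}
```

In the unclosed case every instance is reachable only through the cache's map, which is the
signature to look for in the heap dump.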

bq. Do I need to update any other patches/fix?

No, that's all of them.

If there are any additional questions, I recommend moving to the user@hadoop.apache.org mailing
list.

> JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5508
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1-win, 1.2.1
>            Reporter: Xi Fang
>            Assignee: Xi Fang
>            Priority: Critical
>             Fix For: 1-win, 1.3.0
>
>         Attachments: CleanupQueue.java, JobInProgress.java, MAPREDUCE-5508.1.patch, MAPREDUCE-5508.2.patch, MAPREDUCE-5508.3.patch, MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak problem but introduced another filesystem object
> (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>           tempDirFs = jobTempDirPath.getFileSystem(conf);
>           CleanupQueue.getInstance().addToQueue(
>               new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
>  if (tempDirFs != fs) {
>       try {
>         fs.close();
>       } catch (IOException ie) {
> ...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)
