hadoop-mapreduce-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-437) JobTracker may need to close its filesystem when being terminated
Date Sat, 12 Feb 2011 09:08:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993863#comment-12993863 ]

Steve Loughran commented on MAPREDUCE-437:

Reviewing the code in trunk, the problem is a bit more serious and relates to what happens
when a cached FS instance is closed: everyone else holding a reference to that instance can
no longer use the filesystem.

This does not normally surface in production, as the JT runs in its own VM. It does exist in
MiniMR clusters, in testing, but hasn't shown up because nobody other than me has tried to
shut down an FS instance while the JT is still live.
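
The failure mode described above can be modelled without Hadoop at all. The following is a minimal sketch, assuming a cache keyed only by URI; SketchFs, its methods, and the "hdfs://example" URI are hypothetical stand-ins for illustration, not Hadoop's FileSystem class or its real cache logic.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class SharedFsDemo {
    public static void main(String[] args) {
        // Two unrelated callers ask for the same URI and get the same object.
        SketchFs trackerFs = SketchFs.get("hdfs://example");
        SketchFs testFs = SketchFs.get("hdfs://example");

        // One caller closes "its" filesystem...
        testFs.close();

        // ...and the other caller's reference stops working.
        try {
            trackerFs.exists("/jobs");
            System.out.println("tracker can still use the filesystem");
        } catch (IOException e) {
            System.out.println("tracker broken: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for a cached filesystem client (assumption, not Hadoop code).
class SketchFs {
    private static final Map<String, SketchFs> CACHE = new HashMap<>();
    private final String uri;
    private boolean closed = false;

    private SketchFs(String uri) {
        this.uri = uri;
    }

    // Cached lookup: every caller asking for the same URI shares one instance.
    static synchronized SketchFs get(String uri) {
        return CACHE.computeIfAbsent(uri, SketchFs::new);
    }

    // Any operation on a closed instance fails, whoever closed it.
    boolean exists(String path) throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
        return path.startsWith("/");
    }

    // Closing marks the shared object unusable and drops it from the cache.
    void close() {
        closed = true;
        synchronized (SketchFs.class) {
            CACHE.remove(uri, this);
        }
    }
}
```

In this model the second caller has no way to tell that its reference was invalidated until an operation fails, which is the situation the JT would be in if something else in the same process closed the shared instance.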

Proposed actions
 1-rename this issue to be more explicit: JT must ask for a new FS instance and close it when
 2-add a test to verify that a miniMR cluster will fail if you get the same instance and close
 3-have the JT get a new instance on startup/going live and verify that test 2 now passes
 4-have the JT close its filesystem on shutdown, set its local reference to null
I can't think of an easy way to test #4 unless there is a method to get the JT filesystem
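
Actions 3 and 4 can be sketched in the same spirit. This is an illustration under stated assumptions, not the JobTracker's code: ModelFs, ModelTracker, newInstance(), and the filesystem() accessor are all hypothetical names, and the accessor exists only to show how a test could observe that the reference was released.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class OwnedFsDemo {
    public static void main(String[] args) throws IOException {
        ModelFs shared = ModelFs.get("hdfs://example");
        ModelTracker tracker = new ModelTracker("hdfs://example");
        tracker.start();

        // Someone else closes the cached instance while the tracker is live.
        shared.close();
        System.out.println("tracker usable: " + tracker.canReadSystemDir());

        // On shutdown the tracker closes its own instance and drops the reference.
        tracker.stop();
        System.out.println("tracker fs released: " + (tracker.filesystem() == null));
    }
}

// Hypothetical filesystem model with both cached and private instances.
class ModelFs {
    private static final Map<String, ModelFs> CACHE = new HashMap<>();
    private boolean closed = false;

    // Cached lookup: shared between all callers using the same URI.
    static synchronized ModelFs get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new ModelFs());
    }

    // Uncached lookup: the caller owns the instance and may close it safely.
    static ModelFs newInstance(String uri) {
        return new ModelFs();
    }

    boolean exists(String path) throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
        return path.startsWith("/");
    }

    void close() {
        closed = true;
    }
}

// Hypothetical tracker that owns a private filesystem instance.
class ModelTracker {
    private final String uri;
    private ModelFs fs;

    ModelTracker(String uri) {
        this.uri = uri;
    }

    // Action 3: ask for a new instance when starting up / going live.
    void start() {
        fs = ModelFs.newInstance(uri);
    }

    boolean canReadSystemDir() throws IOException {
        return fs.exists("/system");
    }

    // Hypothetical accessor so a test can check the reference after shutdown.
    ModelFs filesystem() {
        return fs;
    }

    // Action 4: close the filesystem on shutdown and null the local reference.
    void stop() {
        if (fs != null) {
            fs.close();
            fs = null;
        }
    }
}
```

Because the tracker never touches the cached instance, closing either side leaves the other unaffected, and the null check after stop() gives a test something concrete to assert on.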

> JobTracker may need to close its filesystem when being terminated
> -----------------------------------------------------------------
>                 Key: MAPREDUCE-437
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-437
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Steve Loughran
>            Priority: Minor
> This is something I've been experimenting with HADOOP-3268; I'm not sure what the right
> action is here.
> -currently, the JobTracker does not close() its filesystem when it is shut down. This
> will cause it to leak filesystem references if JobTrackers are started and stopped in the
> same process.
> -The TestMRServerPorts test explicitly closes the filesystem
>         jt.fs.close();
>         jt.stopTracker();
> -If you move the close() operation into the stopTracker()/terminate logic, the filesystem
> gets cleaned up, but
> TestRackAwareTaskPlacement and TestMultipleLevelCaching fail with a FilesystemClosed
> error (stack traces to follow)
> Should the JobTracker close its filesystem whenever it is terminated? If so, there are
> some tests that need to be reworked slightly to not expect the filesystem to be live after
> the jobtracker is taken down.


