hadoop-mapreduce-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
Date Fri, 15 Mar 2013 17:38:14 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-4987:

    Attachment: MAPREDUCE-4987.1.patch

I'm attaching a patch.  This fixes symlink handling on Windows by copying the
files instead of truly symlinking, similar to the approach taken in prior patches like HADOOP-9061.
This also fixes the logic for bundling the classpath into a jar manifest by guaranteeing
that localized resources get added to the classpath, even if those localized resources don't
exist in the container path yet.  (The classpath jar must get created before the container
launch script runs to symlink or copy files from filecache, so this was a chicken-and-egg
problem.)  With these changes in place, {{TestMRJobs#testDistributedCache}} passes on Mac
and Windows.
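
As a rough illustration of the copy-instead-of-symlink idea (this is not the code in the patch, which makes the choice in the container launch script), here is a minimal sketch.  It assumes Java 7+ {{java.nio.file}} and handles regular files only.  The class and method names ({{LinkOrCopy}}, {{shouldCopy}}, {{linkOrCopy}}) are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class LinkOrCopy {
    /** Hypothetical policy: copy instead of symlink when the OS name starts with "Windows". */
    static boolean shouldCopy(String osName) {
        return osName != null && osName.startsWith("Windows");
    }

    /**
     * Makes the regular file {@code target} available at {@code link}.
     * With {@code copy} set, the bytes are duplicated; otherwise a symlink is created.
     */
    static void linkOrCopy(Path target, Path link, boolean copy) throws IOException {
        if (copy) {
            Files.copy(target, link, StandardCopyOption.REPLACE_EXISTING);
        } else {
            Files.createSymbolicLink(link, target);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("filecache-demo");
        Path target = Files.write(dir.resolve("resource.txt"),
                "hello".getBytes(StandardCharsets.UTF_8));
        Path link = dir.resolve("resource-link.txt");
        linkOrCopy(target, link, shouldCopy(System.getProperty("os.name")));
        // Files.size follows symlinks, so either branch reports the target's length.
        System.out.println(Files.size(link));
    }
}
```

The trade-off of copying is extra disk space and I/O per container, in exchange for file lengths and contents that behave the same on every platform.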

Here is a summary of the changes in each file:

{{FileUtil#createJarWithClassPath}} - Accept environment provided by caller, because YARN
will construct an environment different from the current system environment.  Provide a way
to maintain a classpath entry with a trailing '/' even though the directory doesn't exist,
because the container launch script hasn't run yet.
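
To make the trailing '/' point concrete: {{java.io.File#toURI}} only appends a slash when the path can be determined to be a directory, so a directory that does not exist yet loses it, and a manifest {{Class-Path}} entry without the slash is treated as a jar rather than a directory.  The sketch below shows one way to preserve the slash.  It is an assumption-laden illustration, not the actual {{FileUtil#createJarWithClassPath}} code, and the names {{ClassPathManifest}}, {{toManifestEntry}} and {{buildManifest}} are hypothetical.

```java
import java.io.File;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ClassPathManifest {
    /**
     * Converts one classpath entry to a file URL for the manifest.
     * If the caller wrote a trailing separator, the URL keeps a trailing '/'
     * even when the directory does not exist yet.
     */
    static String toManifestEntry(String entry) {
        boolean wantsDir = entry.endsWith("/") || entry.endsWith(File.separator);
        // File#toURI adds '/' only for paths that currently exist as directories.
        String url = new File(entry).toURI().toString();
        if (wantsDir && !url.endsWith("/")) {
            url += "/";
        }
        return url;
    }

    /** Builds a manifest whose Class-Path lists the entries as space-separated URLs. */
    static Manifest buildManifest(List<String> entries) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version is required, or the manifest is written out empty.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        StringBuilder classPath = new StringBuilder();
        for (String entry : entries) {
            if (classPath.length() > 0) {
                classPath.append(' ');
            }
            classPath.append(toManifestEntry(entry));
        }
        attrs.put(Attributes.Name.CLASS_PATH, classPath.toString());
        return manifest;
    }
}
```

The resulting {{Manifest}} could then be handed to a {{java.util.jar.JarOutputStream}} to produce the classpath jar.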

{{TestFileUtil#testCreateJarWithClassPath}} - Change test to cover new logic.

{{TestMRJobs}} - Initialize {{MiniDFSCluster}} in a @BeforeClass method instead of a static
initialization block.  This test uses an inner class, {{DistributedCacheChecker}}, as the
job's mapper.  Since this is an inner class, it has a back-reference to the {{TestMRJobs}}
class.  This means that the {{TestMRJobs}} static initialization runs for each mapper task
in addition to running in the JUnit runner.  As a result, each mapper task would start another
instance of {{MiniDFSCluster}} pointing at the same directories, which would sometimes cause
deadlocks.  Moving the initialization to a @BeforeClass method prevents it from running in the
mappers.  I also needed to add a special check that a path is a symlinked directory, because
{{FileUtils#isSymlink}} does not work as expected on Windows.
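
The difference between the two kinds of setup can be sketched without JUnit or Hadoop.  In the hypothetical example below, a counter stands in for starting a {{MiniDFSCluster}}: a static initialization block runs as soon as any code in the JVM causes the class to be initialized, while a plain static method (the role a @BeforeClass method plays) runs only when a test runner calls it.  All names here are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class StaticInitDemo {
    /** Counts simulated cluster starts; stands in for starting a MiniDFSCluster. */
    static final AtomicInteger CLUSTER_STARTS = new AtomicInteger();

    /** Eager variant: setup lives in a static initialization block. */
    static final class EagerSetup {
        static {
            // Runs once per JVM, the first time this class is initialized.
            CLUSTER_STARTS.incrementAndGet();
        }

        /** Any static call like this one is enough to trigger the block above. */
        static void touch() {
        }
    }

    /** Lazy variant: setup lives in a method that only a test runner would call. */
    static final class LazySetup {
        /** Plays the role of a @BeforeClass method. */
        static void setup() {
            CLUSTER_STARTS.incrementAndGet();
        }

        /** Using the class without calling setup() starts nothing. */
        static void touch() {
        }
    }
}
```

Calling {{LazySetup.touch()}} leaves the counter unchanged, whereas the first {{EagerSetup.touch()}} increments it.  Each task JVM that initializes a class written like {{EagerSetup}} would repeat that side effect.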

{{ContainerLaunch}} - Copy files instead of symlinking on Windows.  Guarantee that localized
resources get added to the classpath correctly, even if the paths do not exist yet.

> TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
> ---------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-4987
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distributed-cache, nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: MAPREDUCE-4987.1.patch
> On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while checking
> the length of a symlink.  It expects to see the length of the target of the symlink, but
> Java 6 on Windows always reports that a symlink has length 0.
