hadoop-mapreduce-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-1700) User supplied dependencies may conflict with MapReduce system JARs
Date Tue, 18 Sep 2012 11:46:10 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Tom White updated MAPREDUCE-1700:

    Attachment: MAPREDUCE-1700-ccl.patch

bq. Most of the class loader issues stem from long running containers that need to dynamically
load/unload classes.

Also, the case we are talking about does not have the complex classloader trees that app servers
have, so there are no sibling class sharing issues. In the task JVM there is only a single
user app, so the classloader hierarchy is linear (boot, extension, system, job).
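
To make the linear hierarchy concrete, here is a minimal sketch that walks the parent chain from a stand-in job classloader up to the bootstrap loader. The class name LoaderChain and the empty URLClassLoader standing in for the job classloader are assumptions for illustration, not code from the patch. On Java 9 and later the extension loader is replaced by the platform loader, so the printed names differ by JVM version.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Lists the loader class names from the given loader up to the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader has no ClassLoader object; getParent() reports it as null.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the job classloader: no URLs, parent is the system loader.
        ClassLoader job = new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
        chain(job).forEach(System.out::println);
    }
}
```

Each loader has exactly one parent and there are no siblings, which is why the class-sharing problems seen in app servers do not arise here.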

There are a few cases where certain APIs make assumptions about which classloader to use:

* *The system classloader*. For example, URL stream handlers are loaded by the classloader
that loaded java.net.URL (boot), or the system classloader. So if a task registered a URL
stream handler and it was in the job JAR, then it wouldn't be found since it was loaded by
the job classloader, not the system classloader.  In this case, the workaround is to implement
a factory and call URL.setURLStreamHandlerFactory().
* *The caller's current classloader*. For example, java.util.ResourceBundle uses the caller's
current classloader, so if the framework tries to load a bundle then the bundle (e.g. a localization
bundle) would not be found if it were in the job JAR, since the system classloader (which
loaded the framework class) can't see the job classloader's classes. As it happens, MR counters
use resource bundles; however, they explicitly use the context classloader, so this problem
doesn't occur (see org.apache.hadoop.mapreduce.util.ResourceBundles). (Also, I imagine the
use of resource bundles to localize counter names in the job JAR is very rare.)
* *The context classloader*. For example, JAXP uses the context classloader to load the DocumentBuilderFactory
specified in a system property. This case is covered by setting the context classloader to
be the job classloader for the duration of the task (my latest patch does this). Most APIs
that involve classloaders use the context classloader these days.
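
The context classloader case can be sketched as follows: set the job classloader on the current thread for the duration of the task and restore the previous loader afterwards. This is only an illustration of the general pattern, not the code in the patch; the names ContextLoaderScope and callWith are hypothetical, and an empty URLClassLoader stands in for the job classloader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

class ContextLoaderScope {
    /**
     * Runs the task with the given loader as the thread's context classloader,
     * then restores whatever loader was set before, even if the task throws.
     */
    static <T> T callWith(ClassLoader loader, Supplier<T> task) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return task.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the job classloader.
        ClassLoader job = new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
        // Inside the task, APIs that consult the context classloader (JAXP, for example) see the job loader.
        ClassLoader seen = callWith(job, () -> Thread.currentThread().getContextClassLoader());
        System.out.println("task saw job loader: " + (seen == job));
    }
}
```

Restoring the previous loader in a finally block keeps framework code that runs after the task from accidentally resolving classes through the job classloader.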

So all of these cases can be handled. Also note that the job classloader is not used by
default; to enable it, set mapreduce.job.isolated.classloader to true for your job.
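
For the system classloader case in the first bullet above, the factory workaround might look roughly like this sketch. The "jobres" scheme, the JobUrlHandlers class and the EchoHandler are made up for illustration; only URL.setURLStreamHandlerFactory() and the java.net types are real JDK API. Because the factory is registered as an object, it does not matter which classloader loaded it. Note that the JVM accepts a factory only once, so this conflicts with any other code in the same JVM that also registers one.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

class JobUrlHandlers {
    private static boolean installed = false;

    /** Hypothetical handler for a made-up "jobres" scheme: the content is just the URL path. */
    static class EchoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override public void connect() { /* nothing to connect to */ }
                @Override public InputStream getInputStream() {
                    return new ByteArrayInputStream(getURL().getPath().getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    static class Factory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            // Returning null lets the JDK fall back to its built-in handlers (http, file, ...).
            return "jobres".equals(protocol) ? new EchoHandler() : null;
        }
    }

    /** Registers the factory; the JVM allows this once, so repeat calls are ignored here. */
    static synchronized void install() {
        if (!installed) {
            URL.setURLStreamHandlerFactory(new Factory());
            installed = true;
        }
    }

    /** Reads the first line from the given URL, wrapping I/O failures as unchecked. */
    static String read(String spec) {
        try (InputStream in = new URL(spec).openStream();
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        install();
        System.out.println(read("jobres:/hello"));
    }
}
```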

The latest patch handles the case of embedded lib and classes directories in the JAR, as well
as distributed cache files and archives. The unit test passes (and fails with a NoSuchMethodError
due to the class incompatibility if mapreduce.job.isolated.classloader is set to false). So
I think it is pretty close now - the main thing left to do is sort out the build for the test,
which relies on the MR examples module.

> User supplied dependencies may conflict with MapReduce system JARs
> ------------------------------------------------------------------
>                 Key: MAPREDUCE-1700
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1700
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-1700-ccl.patch, MAPREDUCE-1700-ccl.patch, MAPREDUCE-1700.patch,
> If user code has a dependency on a version of a JAR that is different to the one that
> happens to be used by Hadoop, then it may not work correctly. This happened with user code
> using a different version of Avro, as reported [here|https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852081#action_12852081].
> The problem is analogous to the one that application servers have with WAR loading. Using
> a specialized classloader in the Child JVM is probably the way to solve this.

