hadoop-common-issues mailing list archives

From "Sangjin Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10893) isolated classloader on the client side
Date Mon, 28 Jul 2014 20:25:39 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076761#comment-14076761 ]

Sangjin Lee commented on HADOOP-10893:

I have posted the patch for using the isolated classloader on the client side. I've tested
it with a simple test driver (I'll post it once Jenkins has gone through the current patch)
to verify that the user code and its dependencies are loaded through the application classloader,
and that Hadoop can load versions of the same dependencies that differ from the user's.

Some key points about the patch:
- I have moved org.apache.hadoop.yarn.util.ApplicationClassLoader to hadoop-common so it can
be used on the client side: ApplicationClassLoader is good enough for the client too
- the feature is enabled by setting an environment variable: this is in keeping with how USER_CLASSPATH_FIRST is enabled
- it also has a list of system classes, which can be overridden via an environment variable
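
To illustrate the points above, here is a minimal sketch of a child-first classloader with a system-class list. It is not the code in the patch and not the real ApplicationClassLoader: the class name IsolatedClassLoaderSketch, the helper parsePrefixes, and the idea of a comma-separated prefix list are all assumptions made for this example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical child-first classloader, for illustration only.
 * User classes are looked up on the given URLs first; classes whose
 * names match a "system" prefix are always delegated to the parent.
 */
class IsolatedClassLoaderSketch extends URLClassLoader {
    private final List<String> systemPrefixes;

    IsolatedClassLoaderSketch(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    /** Assumed format: a comma-separated list of name prefixes, e.g. "java.,javax.". */
    static List<String> parsePrefixes(String value, List<String> defaults) {
        if (value == null || value.trim().isEmpty()) {
            return defaults; // nothing set, so keep the built-in list
        }
        List<String> out = new ArrayList<>();
        for (String s : value.split(",")) {
            String t = s.trim();
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    c = findClass(name); // user classpath first
                } catch (ClassNotFoundException e) {
                    // not on the user classpath; fall back to the parent below
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // normal parent-first delegation
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A launcher could read an environment variable with System.getenv, pass its value to parsePrefixes to override the default system-class list, and then load the user's main class through this loader so that user dependencies stay separate from the ones Hadoop itself uses.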

It turned out to be a bit simpler than I initially expected. The situation is pretty similar
to (but not entirely the same as) the YarnChild case.

I've also tested a real job submission of a fairly small app with several dependencies.

I'd love to hear feedback on the patch. Thanks!

> isolated classloader on the client side
> ---------------------------------------
>                 Key: HADOOP-10893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10893
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: util
>    Affects Versions: 2.4.0
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: HADOOP-10893.patch
> We have the job classloader on the mapreduce tasks that run on the cluster. It has the
> benefit of being able to isolate class space for user code and avoid version clashes.
> Although it occurs less often, version clashes do occur on the client JVM. It would be
> good to introduce an isolated classloader on the client side as well to address this. A natural
> point to introduce this may be through RunJar, as that's how most Hadoop jobs are run.

This message was sent by Atlassian JIRA
