spark-dev mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: Calling external classes added by sc.addJar needs to be through reflection
Date Mon, 19 May 2014 07:26:25 GMT
Sounds like the problem is that classloaders always look in their parents
before themselves, and Spark users want executors to pick up classes from
their custom code before the ones in Spark plus its dependencies.

Would a custom classloader that delegates to the parent after first
checking itself fix this up?
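
To make the idea concrete, here is a minimal sketch of such a child-first classloader in Java. The class name ChildFirstClassLoader is hypothetical and this is not code from Spark; it only illustrates checking the loader's own URLs before falling back to the usual parent-first lookup. It assumes java.* names should always go to the parent, since a non-bootstrap loader is not allowed to define them.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration: a URLClassLoader that looks in its own URLs
// first and only then delegates to its parent.
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core classes must come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: search this loader's own URLs.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not found locally: standard parent-first lookup.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

One caveat with this approach: a class that exists in both the child and the parent ends up as two distinct Class objects, so instances cannot be passed between code loaded by the two loaders without hitting a ClassCastException.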


On Mon, May 19, 2014 at 12:17 AM, DB Tsai <dbtsai@stanford.edu> wrote:

> Hi Sean,
>
> It's true that the issue here is the classloader. Because of the classloader
> delegation model, users have to use reflection in the executors to pick up
> the classloader in order to use classes added through the sc.addJar API.
> However, that is very inconvenient for users, and it is not documented in Spark.
>
> I'm working on a patch to solve it by calling the protected method addURL
> in URLClassLoader to update the current default classloader, so no
> custom classloader is needed anymore. I wonder if this is a good way to go.
>
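>   // Adds `url` to an existing URLClassLoader by reflectively invoking its
>   // protected addURL method.
>   // Needs: java.io.IOException, java.lang.reflect.Method,
>   //        java.net.{URL, URLClassLoader}
>   private def addURL(url: URL, loader: URLClassLoader): Unit = {
>     try {
>       val method: Method =
>         classOf[URLClassLoader].getDeclaredMethod("addURL", classOf[URL])
>       method.setAccessible(true)
>       method.invoke(loader, url)
>     } catch {
>       case t: Throwable =>
>         // Keep the original failure as the cause instead of discarding it.
>         throw new IOException("Error, could not add URL to system classloader", t)
>     }
>   }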
>
>
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Sun, May 18, 2014 at 11:57 PM, Sean Owen <sowen@cloudera.com> wrote:
>
> > I might be stating the obvious for everyone, but the issue here is not
> > reflection or the source of the JAR, but the ClassLoader. The basic
> > rules are this.
> >
> > "new Foo" will use the ClassLoader that defines Foo. This is usually
> > the ClassLoader that loaded whatever it is that first referenced Foo
> > and caused it to be loaded -- usually the ClassLoader holding your
> > other app classes.
> >
> > ClassLoaders can have a parent-child relationship. ClassLoaders always
> > look in their parent before themselves.
> >
> > (Careful then -- in contexts like Hadoop or Tomcat where your app is
> > loaded in a child ClassLoader, and you reference a class that Hadoop
> > or Tomcat also has (like a lib class) you will get the container's
> > version!)
> >
> > When you load an external JAR it has a separate ClassLoader which does
> > not necessarily bear any relation to the one containing your app
> > classes, so yeah it is not generally going to make "new Foo" work.
> >
> > Reflection lets you pick the ClassLoader, yes.
> >
> > I would not call setContextClassLoader.
> >
> > On Mon, May 19, 2014 at 12:00 AM, Sandy Ryza <sandy.ryza@cloudera.com>
> > wrote:
> > > I spoke with DB offline about this a little while ago and he confirmed
> > > that he was able to access the jar from the driver.
> > >
> > > The issue appears to be a general Java issue: you can't directly
> > > instantiate a class from a dynamically loaded jar.
> > >
> > > I reproduced it locally outside of Spark with:
> > > ---
> > >     URLClassLoader urlClassLoader = new URLClassLoader(
> > >         new URL[] { new File("myotherjar.jar").toURI().toURL() }, null);
> > >     Thread.currentThread().setContextClassLoader(urlClassLoader);
> > >     MyClassFromMyOtherJar obj = new MyClassFromMyOtherJar();
> > > ---
> > >
> > > I was able to load the class with reflection.
> >
>
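
For reference, the reflection workaround discussed in the quoted messages can be sketched roughly as below. The helper name ReflectiveLoader and its instantiate method are hypothetical, and the sketch assumes the target class has a no-argument constructor; the point is only that Class.forName lets the caller choose the ClassLoader explicitly, which a plain "new Foo" does not.

```java
// Hypothetical helper: instantiate a class by name through a caller-chosen
// ClassLoader instead of the loader that defined the calling code.
final class ReflectiveLoader {

    private ReflectiveLoader() {}

    static Object instantiate(ClassLoader loader, String className)
            throws ReflectiveOperationException {
        // Resolve (and initialize) the class through the given loader.
        Class<?> clazz = Class.forName(className, true, loader);
        // Assumes an accessible no-argument constructor.
        return clazz.getDeclaredConstructor().newInstance();
    }
}
```

In the scenario from this thread, the loader argument would be a URLClassLoader built over the added jar. The returned object then has to be used through an interface the calling code can already see, or through further reflection, because the caller's own classloader still cannot resolve the class by name.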
