hadoop-common-dev mailing list archives

From Ferdy Galema <ferdy.gal...@kalooga.com>
Subject RunJar classloader issues
Date Fri, 09 Sep 2011 09:54:00 GMT
Sometimes when running Hadoop jobs using the 'hadoop jar' command there 
are issues with the classloader. I presume these are caused by classes 
that are loaded BEFORE the command's main is invoked. For example, when 
relying on MapWritable in the command, it is not possible to use a 
class that is not in the default idToClassMap. MapWritable.class is 
loaded before the user job is unpacked, and therefore its classloader 
will not be able to find custom classes (at least not classes that are 
only on the classpath of RunJar's classloader).
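
One way to check whether this is what is happening is to compare the loader that defined a class with the context classloader of the current thread. The following is only a rough diagnostic sketch; the class name LoaderReport and its method are made up for illustration and are not part of Hadoop.

```java
public class LoaderReport {

    /** Names the loader that defined the given class; null means the bootstrap loader. */
    public static String definingLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name as the first argument, e.g. org.apache.hadoop.io.MapWritable
        // when Hadoop is on the classpath; defaults to a JDK class otherwise.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> type = Class.forName(name);

        System.out.println("defined by:     " + definingLoader(type));
        System.out.println("context loader: " + Thread.currentThread().getContextClassLoader());
    }
}
```

If the two loaders differ and the custom classes are only visible to the context loader, a plain Class.forName(className) call made from inside the first class will not see them.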

I could not find any issues or discussion about this, so I assume it is 
a somewhat obscure issue (please correct me if I'm wrong). Anyway, I 
would like to hear what you think of this and perhaps discuss a possible 
solution, such as spawning the command in a new JVM. Alternatively, 
MapWritable, or rather AbstractMapWritable, uses a 
Class.forName(className) construction; maybe this can be changed so that 
it uses the classloader of the current thread instead of that of its own 
class. (Will this work?)
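
To make the second idea concrete, here is a minimal sketch of a lookup that goes through the thread's context classloader. It is not the actual AbstractMapWritable code; the class name ContextClassLoading is made up for illustration, and it only helps under the assumption that the launcher installs the job's classloader as the context classloader before calling main.

```java
public class ContextClassLoading {

    /** Resolves a class through the thread context classloader, if one is set. */
    public static Class<?> loadClass(String className) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // No context loader set: fall back to the loader that defined this helper.
            loader = ContextClassLoading.class.getClassLoader();
        }
        // The three-argument form lets the caller choose the loader, whereas
        // Class.forName(className) always uses the loader of the calling class.
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(loadClass(name));
    }
}
```

The fallback keeps the behaviour close to the existing one when no context classloader is set.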

A workaround for now is to explicitly put the jar itself on the 
classpath, i.e. 'env HADOOP_CLASSPATH=myJar hadoop jar myJar'.
