mahout-user mailing list archives

From Adil Aijaz <a...@yahoo-inc.com>
Subject Re: ClassNotFoundException with pseudo/distributed run of KMeans
Date Thu, 16 Jul 2009 17:57:21 GMT
My basic understanding of the class loader behaviour is:

1. Any jars that need to be available to the map/reduce tasks should be 
specified through -libjars (e.g. hadoop --config ... -libjars gson.jar 
jar <path to my jar> ...).
2. Any jars that need to be available to the main class should be 
placed in the job's lib directory (that is, 
mahout-examples-0.2-SNAPSHOT/lib/*.jar).

The exception, as Jeff says, is when the lib/*.jar files are flattened 
into top-level classes in a single jar.
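
One way to see which side is actually missing a class is to ask each 
class loader directly. The following is only a minimal sketch, not 
something taken from Mahout or Hadoop: the class name ClassLoaderCheck 
and the method locate are hypothetical names made up for illustration. 
The assumption is that calling locate from the driver's main method and 
again from a mapper's configure method would show whether each side can 
resolve the class, and from which jar.

```java
import java.security.CodeSource;

public class ClassLoaderCheck {

    /**
     * Hypothetical helper: reports where className resolves from when
     * looked up through the given loader, or "NOT FOUND ..." if it
     * cannot be resolved.
     */
    public static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Core JDK classes loaded by the bootstrap loader have no code source
            return src == null ? "bootstrap/JDK" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Default to the class from the NoClassDefFoundError in this thread
        String name = args.length > 0 ? args[0] : "com.google.gson.reflect.TypeToken";
        System.out.println("system loader:  "
                + locate(name, ClassLoader.getSystemClassLoader()));
        System.out.println("context loader: "
                + locate(name, Thread.currentThread().getContextClassLoader()));
    }
}
```

If the class is reported as found in the driver but not inside the task, 
that would suggest the jar is reaching the main class but not the 
map/reduce side.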

Adil

Jeff Eastman wrote:
> Isn't this the same old problem that our Job jar file has a lib 
> directory with the Mahout code in it and, because of the way Hadoop 
> loads the jar, it sometimes cannot resolve classes in it? IIRC, one 
> needs to smash the job jar file into a single jar in order for 
> Dirichlet (at least, and any other examples which contain non-core 
> classes) to work. I confess I do not understand the class loader 
> stuff enough to be more specific.
>
> I have duplicated the CNF exception by defining and using a 
> user-defined distance measure in the Job file and running KMeans with 
> it, so it is not specific to Dirichlet.
>
>
> Grant Ingersoll wrote:
>> Hmm, I'm not seeing the ClassNotFound problem but am getting fetch 
>> failures.  Will look later.
>>
>> -Grant
>>
>> On Jul 16, 2009, at 11:32 AM, Paul Ingles wrote:
>>
>>> I've just tried setting up a brand new machine (Ubuntu 8.04 Virtual 
>>> Machine) with Hadoop 0.20.0 and running the compiled jobs against it. 
>>> I get the same problems as before... still scratching my head :(
>>>
>>> On 16 Jul 2009, at 12:15, Paul Ingles wrote:
>>>
>>>> Sure,
>>>>
>>>> I'm running (currently) on my MacBook Air, running OSX Leopard.
>>>>
>>>> JDK: java version "1.6.0_13"
>>>> Java(TM) SE Runtime Environment (build 1.6.0_13-b03-211)
>>>> Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02-83, mixed mode)
>>>>
>>>> Hadoop is: 0.20.0, r763504
>>>>
>>>> I'm compiling mahout from trunk (r794023) as follows (in the root 
>>>> of the project directory):
>>>>
>>>> % mvn install
>>>> % hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job 
>>>> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>>
>>>> The only difference (for dirichlet) is the different class to run.
>>>>
>>>> Thanks,
>>>> Paul
>>>>
>>>> On 16 Jul 2009, at 11:33, Grant Ingersoll wrote:
>>>>
>>>>> Can you share how you built and how you are running, as in command 
>>>>> line options, etc.?  Also, JDK version, Hadoop version, etc.
>>>>>
>>>>> On Jul 16, 2009, at 6:21 AM, Paul Ingles wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thank you for the suggestion. Unfortunately, when I tried that I
>>>>>> received the same error. I've also tried copying the gson jar 
>>>>>> directly into $HADOOP_HOME/lib (when I was running a single node
>>>>>> pseudo-distributed) and get the same error still.
>>>>>>
>>>>>> Weirdly enough, if I try and run the Dirichlet example on the 
>>>>>> cluster I receive another ClassNotFoundException:
>>>>>>
>>>>>> 09/07/16 10:27:54 INFO mapred.JobClient: Task Id : attempt_200907161026_0002_m_000001_0, Status : FAILED
>>>>>> java.lang.RuntimeException: Error in configuring object
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>>     ... 5 more
>>>>>> Caused by: java.lang.RuntimeException: Error in configuring object
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>>>>     ... 10 more
>>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>>     ... 13 more
>>>>>> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
>>>>>>     at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:95)
>>>>>>     at org.apache.mahout.clustering.dirichlet.DirichletMapper.configure(DirichletMapper.java:60)
>>>>>>     ... 18 more
>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:288)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>>>>>>     at org.apache.mahout.clustering.dirichlet.DirichletDriver.createState(DirichletDriver.java:121)
>>>>>>     at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:71)
>>>>>>     ... 19 more
>>>>>>
>>>>>>
>>>>>> Hoping this sparks some other suggestions :)
>>>>>>
>>>>>> Thanks,
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>> On Wed Jul 15 22:08:09 UTC 2009, Adil Aijaz <adil@yahoo-inc.com> wrote:
>>>>>>> try hadoop --config <hod-cluster-dir> jar -libjars <path to gson.jar>
>>>>>>> <your job/jar file> <your class> <arguments>
>>>>>>>
>>>>>>> Adil
>>>>>>>
>>>>>>> Paul Ingles wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Apologies for the cross-posting (I also sent this to the Hadoop user
>>>>>>>> list) but I'm still getting errors if I try and run the KMeans
>>>>>>>> examples on a cluster, whether that be my single-node Mac Pro, or our
>>>>>>>> cluster. I've attached the stack trace at the bottom of the email.
>>>>>>>>
>>>>>>>> The gson jar is definitely included in the packaged .job, and is also
>>>>>>>> in the temporary directory when the task tracker picks up the work.
>>>>>>>> The gson jar also includes TypeToken.class in the expected path.
>>>>>>>>
>>>>>>>> Again, really appreciate people's help in getting this going!
>>>>>>>>
>>>>>>>> ----snip----
>>>>>>>> 09/07/15 17:06:38 INFO mapred.JobClient: Task Id : attempt_200907151617_0010_m_000000_0, Status : FAILED
>>>>>>>> java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken
>>>>>>>> at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:703)
>>>>>>>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
>>>>>>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
>>>>>>>> at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
>>>>>>>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
>>>>>>>> at org.apache.mahout.matrix.AbstractVector.asFormatString(AbstractVector.java:374)
>>>>>>>> at org.apache.mahout.clustering.kmeans.Cluster.outputPointWithClusterInfo(Cluster.java:198)
>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:39)
>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:32)
>>>>>>>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>>>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
>>>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>> Caused by: java.lang.ClassNotFoundException: com.google.gson.reflect.TypeToken
>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
>>>>>>>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
>>>>>>>> ... 20 more
>>>>>>>> ----snip----
>>>>>>>>
>>>>>>>> Incidentally, as part of this work I've also implemented a Pearson
>>>>>>>> distance measure. If people think it would be useful to fold it in,
>>>>>>>> I'd be happy to get the SVN patch with tests and implementation
>>>>>>>> together.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul
>>>>>
>>>>> --------------------------
>>>>> Grant Ingersoll
>>>>> http://www.lucidimagination.com/
>>>>>
>>>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) 
>>>>> using Solr/Lucene:
>>>>> http://www.lucidimagination.com/search
>>>>>
>>>>
>>>
>>
>>
>>
>

