mahout-user mailing list archives

From Zhengguo 'Mike' SUN <zhengguo...@yahoo.com>
Subject Re: LanczosSolver and ClassNotFoundException
Date Mon, 21 Feb 2011 18:43:50 GMT
But the missing class org.apache.mahout.math.Vector is in the mahout-math jar, which I have
already packaged under /lib of my own jar. Also, my little experiment showed that there is no
problem accessing the Vector class in Mappers. Thus, I tend to think this may not be a dependency
problem. The exception was thrown when Hadoop tried to read the SequenceFile that has the
Vector class as its value type. Is there any difference between accessing a class in the Mapper
(my little experiment) and accessing a class as the value of a SequenceFile in the RecordReader?


 
From: Sean Owen <srowen@gmail.com>
To: user@mahout.apache.org; Zhengguo 'Mike' SUN <zhengguosun@yahoo.com>
Sent: Monday, February 21, 2011 1:00 PM
Subject: Re: LanczosSolver and ClassNotFoundException

You shouldn't have to modify the Hadoop environment, no. You just have
to roll all the dependencies into your job jar file. You want to use
Mahout's ".job" file which contains all of its dependencies. Merge it
with your classes and use that.

On Mon, Feb 21, 2011 at 5:52 PM, Zhengguo 'Mike' SUN
<zhengguosun@yahoo.com> wrote:
> Hi Lokendra,
>
> The thing is that I am using a shared cluster, so I don't have control over
> the environment. I can only attach the needed jars inside my own jar.
>
>
> From: Lokendra Singh <lsingh.969@gmail.com>
> To: user@mahout.apache.org; Zhengguo 'Mike' SUN <zhengguosun@yahoo.com>
> Sent: Monday, February 21, 2011 11:31 AM
> Subject: Re: LanczosSolver and ClassNotFoundException
>
> Hi,
>
> If you are mainly facing ClassNotFound problems in the Hadoop environment,
> I would suggest putting all the required jars (including the Mahout ones) in
> HADOOP_CLASSPATH in '$HADOOP_HOME/conf/hadoop-env.sh'. Also, while running
> the MR job, make sure that $HADOOP_HOME/conf is on your classpath.
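Lokendra's suggestion amounts to a line like the following in hadoop-env.sh (a sketch only; the jar paths and versions here are hypothetical and would need to match your installation):

```shell
# In $HADOOP_HOME/conf/hadoop-env.sh -- paths below are placeholders:
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/opt/mahout/mahout-math-0.4.jar:/opt/mahout/mahout-core-0.4.jar"
```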
>
> Regards
> Lokendra
>
> On Mon, Feb 21, 2011 at 9:50 PM, Zhengguo 'Mike' SUN
> <zhengguosun@yahoo.com>wrote:
>
>> Hi All,
>>
>> I was playing with the LanczosSolver class in Mahout. What I did was copy
>> the code in TestDistributedLanczosSolver.java and try to run it on a
>> shared cluster. I also packaged five jars (core, core-test, math, math-test,
>> and mahout-collection) under the lib/ directory of my own jar. This new
>> jar worked correctly on my local machine under Hadoop's local mode. When I
>> submitted it to the cluster, I got a ClassNotFoundException while running
>> the TimesSquaredJob. The stack trace is as follows:
>>
>> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:247)
>> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:866)
>> at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
>> at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1613)
>> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>> at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
>> at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>
>> I also wrote a simple MapReduce job to test if I can access the Vector
>> class with some naive code like the following:
>>
>> Vector v = new DenseVector(100);
>> v.assign(3.14);
>>
>> This job worked fine on the cluster, so referencing the Vector class by
>> itself does not seem to be the problem. What could be wrong if it is not a
>> dependency problem?
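One workaround worth noting here (an assumption on my part, not something suggested in this thread) is to ship the dependency jars with Hadoop's generic -libjars option instead of nesting them under lib/; the listed jars are placed on the distributed cache and added to the task classpath. The jar names below are placeholders, and the driver must go through ToolRunner for the option to be parsed:

```shell
# Hypothetical invocation -- jar names and driver class are placeholders.
hadoop jar myjob.jar com.example.MyDriver \
  -libjars mahout-math-0.4.jar,mahout-core-0.4.jar \
  <other-args>
```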
>>
>>
>>
>>
>
>
>


      