mahout-user mailing list archives

From Suneel Marthi <suneel_mar...@yahoo.com>
Subject Re: Fwd: PCA with ssvd leads to StackOverFlowError
Date Wed, 05 Mar 2014 16:30:00 GMT
Not sure if the CDH4 patches on top of 0.7 have fixes for M-1067 and M-1098, which address the
issues you are seeing.



The second part of the issue you are seeing with the Mahout 0.9 distro seems to be related to how
you set it up on CDH4. I apologize for not being helpful here, as I am not a CDH4 user or expert.
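
For what it's worth, the descriptor `([Ljava/lang/String;)V` in that NoSuchMethodError says the Mahout jar expects a `ProgramDriver.driver(String[])` method returning `void`. An error of this kind generally means the jar was compiled against a different Hadoop API than the one found on the runtime classpath. A minimal sketch for checking which `driver` signature the cluster's jars actually expose follows; the class name `MethodSignatureCheck` and its helper are made up for illustration and are not part of Mahout or Hadoop.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Mahout or Hadoop.
public class MethodSignatureCheck {

    // Returns "returnType name(paramTypes)" for each public method with the given name.
    static List<String> signatures(String className, String methodName) {
        Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class not on classpath: " + className, e);
        }
        List<String> found = new ArrayList<>();
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(methodName)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getSimpleName());
            sb.append(' ').append(methodName).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getSimpleName());
            }
            found.add(sb.append(')').toString());
        }
        return found;
    }

    public static void main(String[] args) {
        // Defaults match the class and method named in the stack trace.
        String cls = args.length > 0 ? args[0] : "org.apache.hadoop.util.ProgramDriver";
        String method = args.length > 1 ? args[1] : "driver";
        for (String s : signatures(cls, method)) {
            System.out.println(s);
        }
    }
}
```

After compiling with `javac MethodSignatureCheck.java`, it could be run as `java -cp "$(hadoop classpath):." MethodSignatureCheck` so that the same Hadoop jars the `hadoop` launcher uses are on the classpath. If no `void driver(String[])` shows up in the output, the Mahout job jar would probably need to be rebuilt against the cluster's Hadoop version.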

Sean?




On Wednesday, March 5, 2014 10:23 AM, Kevin Moulart <kevinmoulart@gmail.com> wrote:
 
Previous mail sent only to Suneel : (my bad sorry)

> According to my stacktrace it seems that I am running mahout 0.7 indeed.
> That's the version provided by Cloudera when I install mahout using yum.
> But according to Sean Owen, it really is a 0.8 inside...
> Anyway I tried with the compiled version and it didn't work:
> Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.driver([Ljava/lang/String;)V
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:122)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>
> MAHOUT-JOB: /home/cacf/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar
>

I have now changed the conf directory of mahout 0.9 to link to the one
used by the existing working mahout, and the trace changes:

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /home/myCompany/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar
14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.meanshift.MeanShiftCanopyDriver
java.lang.ClassNotFoundException: org.apache.mahout.clustering.meanshift.MeanShiftCanopyDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.spectral.eigencuts.EigencutsDriver
java.lang.ClassNotFoundException: org.apache.mahout.clustering.spectral.eigencuts.EigencutsDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.minhash.MinHashDriver
java.lang.ClassNotFoundException: org.apache.mahout.clustering.minhash.MinHashDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.dirichlet.DirichletDriver
java.lang.ClassNotFoundException: org.apache.mahout.clustering.dirichlet.DirichletDriver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.driver([Ljava/lang/String;)V
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:122)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Changing the hadoop home to
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop-mapreduce doesn't change
the output, nor does changing it to
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop-0.20-mapreduce

Any ideas now?



2014-03-05 15:45 GMT+01:00 Suneel Marthi <suneel_marthi@yahoo.com>:

> Are you using Mahout 0.7?
>
> From this line in your stacktrace that seems to be the case:
> MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
>
> You could build Mahout outside of CDH from Mahout trunk and put the jars
> onto CDH5.
> I am no Cloudera expert or CDH5 user to help with CDHx build.
>
>
>
>
>
>
>   On Wednesday, March 5, 2014 9:30 AM, Kevin Moulart <
> kevinmoulart@gmail.com> wrote:
>  Hi and thanks for your help!
>
> I had been told that the version of mahout used by Cloudera (CDH 4.6) was
> in fact 0.8 with a patch for mr2 support.
> (
> http://mail-archives.apache.org/mod_mbox/mahout-user/201402.mbox/%3CCAEccTywqSAKA_HeX4vTZ-5XPmKtj5b8zMGQUfn5qRsiq=7o=ug@mail.gmail.com%3E)
>
> But I tried to install 0.9 on my own, by compiling it with mvn after I
> changed the pom.xml :
>
> - Added cloudera repository :
>
>     <repository>
>       <id>cloudera-repo</id>
>       <name>Cloudera Repository</name>
>       <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
>     </repository>
>
> - Changed the version of hadoop to use :
>     <hadoop.1.version>2.0.0-mr1-cdh4.6.0</hadoop.1.version>
> - I tried adding this one too :
>     <hadoop2.version>2.0.0-cdh4.6.0</hadoop2.version>
>
> But then I get a lot of errors when Maven begins to compile the core
> package :
> https://gist.github.com/kmoulart/9368193
>
> Could you tell me what I did wrong ?
>
>
> 2014-03-04 19:02 GMT+01:00 Suneel Marthi <suneel_marthi@yahoo.com>:
>
> The -us option was fixed for Mahout 0.8. It seems you are using Mahout 0.7,
> which had this issue (from your stacktrace, it's apparent you are using Mahout
> 0.7).  Please upgrade to the latest mahout version.
>
>
>
>
>
> On Tuesday, March 4, 2014 8:54 AM, Kevin Moulart <kevinmoulart@gmail.com>
> wrote:
>
> Hi,
>
> I'm trying to apply a PCA to reduce the dimension of a matrix of 1603
> columns and 100,000 to 30,000,000 rows using ssvd with the pca option, and
> I always get a StackOverflowError:
>
> Here is my command line :
> mahout ssvd -i /user/myUser/Echant100k -o /user/myUser/Echant/SVD100 -k 100
> -pca "true" -U "false" -V "false" -t 3 -ow
>
> I also tried to put "-us true" as mentioned in
>
> https://cwiki.apache.org/confluence/download/attachments/27832158/SSVD-CLI.pdf?version=18&modificationDate=1381347063000&api=v2
> but the option is not available anymore.
>
> The output of the previous command is :
> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop
> and HADOOP_CONF_DIR=/etc/hadoop/conf
> MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
> 14/03/04 14:45:16 INFO common.AbstractJob: Command line arguments:
> {--abtBlockHeight=[200000], --blockHeight=[10000], --broadcast=[true],
> --computeU=[false], --computeV=[false], --endPhase=[2147483647],
> --input=[/user/myUser/Echant100k], --minSplitSize=[-1],
> --outerProdBlockHeight=[30000], --output=[/user/myUser/Echant/SVD100],
> --oversampling=[15], --overwrite=null, --pca=[true], --powerIter=[0],
> --rank=[100], --reduceTasks=[3], --startPhase=[0], --tempDir=[temp],
> --uHalfSigma=[false], --vHalfSigma=[false]}
> Exception in thread "main" java.lang.StackOverflowError
> at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> ...
>
> I searched online and didn't find a solution to my problem.
>
> Can you help me?
>
> Thanks in advance,
>
> --
> Kévin Moulart
>
>
>
>
> --
> Kévin Moulart
> GSM France : +33 7 81 06 10 10
> GSM Belgique : +32 473 85 23 85
> Téléphone fixe : +32 2 771 88 45
>
>
>


-- 
Kévin Moulart
GSM France : +33 7 81 06 10 10
GSM Belgique : +32 473 85 23 85
Téléphone fixe : +32 2 771 88 45