mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Reuters Example in Windows&Cygwin
Date Thu, 19 Sep 2013 15:58:01 GMT
These look like Hadoop errors, probably setup errors. Have you followed the Windows Hadoop
setup procedure and tested it separately from Mahout to verify that it works properly first?
You may want to try the Hadoop mailing list and look for a Cygwin expert.
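
One way to start testing Hadoop on its own is to confirm that the launcher script can be found and executed from the Cygwin shell before Mahout is involved at all. The following is a minimal sketch, assuming a layout with the launcher at $HADOOP_HOME/bin/hadoop (the path the scripts quoted below look for); the function name check_hadoop is only illustrative and is not part of Hadoop or Mahout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a hadoop launcher is usable
# under the given installation directory.
check_hadoop() {
  local home="$1"
  local launcher="$home/bin/hadoop"
  if [ -z "$home" ]; then
    echo "HADOOP_HOME is not set"
    return 1
  fi
  if [ ! -x "$launcher" ]; then
    echo "no executable launcher at $launcher"
    return 1
  fi
  echo "found launcher at $launcher"
}

# Check the current environment; do not abort if the check fails.
check_hadoop "${HADOOP_HOME:-}" || true
```

If the launcher is found, running "$HADOOP_HOME/bin/hadoop" version from the same shell is a simple next check that the script and Java work together, with Mahout not involved.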

Trying to run this stack on Windows will make your life a little more difficult because Cygwin
is not quite Unix. Can you create a virtual machine and install a Linux distribution in it? If
so, at least the standard installs should work out of the box. Sorry, but Windows experts are
getting harder to find on the mailing lists, and I'm certainly not one.
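
As one illustration of the kind of mismatch involved: a Windows-style path that contains a space is split into two words when a shell script expands it without quotes. That would be consistent with the "C:\Program: command not found" messages quoted below, although the thread does not confirm the cause. A small sketch, in which the path and the variable name java_bin are hypothetical and not taken from the thread:

```shell
#!/usr/bin/env bash
# Hypothetical Windows-style path with a space in it.
java_bin='C:\Program Files\Java\bin\java'

# Print how many arguments the shell actually passed.
count_args() { echo "$#"; }

unquoted=$(count_args $java_bin)    # unquoted: word splitting applies
quoted=$(count_args "$java_bin")    # quoted: stays a single word

echo "unquoted: $unquoted, quoted: $quoted"
```

This prints "unquoted: 2, quoted: 1". In the unquoted case the shell sees "C:\Program" as the first word, and a script that then tries to run that word as a command would report it as not found.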

 
On Sep 19, 2013, at 5:55 AM, Darius Miliauskas <dariui.miliauskui@gmail.com> wrote:

To add, I tried the solution described at
http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver
The version of Mahout is 0.8. I tried it by adding the line below (adjust the
paths to your own setup; $MAHOUT_HOME should be set as well,
in my case it is "C:\cygwin64\usr\local\mahout"):

CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar

at the end of the section in the file "mahout", so that part looks like
this:

# add release dependencies to CLASSPATH
 for f in $MAHOUT_HOME/lib/*.jar; do
   CLASSPATH=${CLASSPATH}:$f;
 done
else
 CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/math/target/classes
 CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/target/classes
 CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/integration/target/classes
 CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/examples/target/classes
 #CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/src/main/resources
 CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar
fi

However, I still get the same error.
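
Two things may be worth checking here, both offered as assumptions rather than a diagnosis. First, whether the jar really exists at that path, since appending a non-existent file to CLASSPATH fails silently. Second, whether the branch containing the new line runs at all: in the excerpt above it sits after "else", next to the target/classes entries, and not in the branch that adds the release jars. The sketch below covers only the first check; append_jar is a hypothetical helper and not part of the mahout script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a jar to CLASSPATH only if the file exists,
# and say so either way.
append_jar() {
  local jar="$1"
  if [ -f "$jar" ]; then
    # Add a ':' separator only when CLASSPATH is already non-empty.
    CLASSPATH="${CLASSPATH:+$CLASSPATH:}$jar"
    echo "added $jar"
  else
    echo "missing $jar" >&2
    return 1
  fi
}

# The default location below is an assumption used for illustration only.
append_jar "${MAHOUT_HOME:-/usr/local/mahout}/lib/hadoop/hadoop-core-1.1.2.jar" || true
echo "CLASSPATH=${CLASSPATH:-}"
```

A "missing" message would mean the added line has no effect even when it is executed.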


Ciao,

Darius


2013/9/18 Darius Miliauskas <dariui.miliauskui@gmail.com>

> Thanks, Michael. I looked deeper into "cluster-reuters.sh" and tried
> adjusting the paths in the system variables. I set $HADOOP_HOME
> to "C:\cygwin64\usr\local\hadoop", and I got:
> 
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time.  This file is going
> away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> cygwin warning:
>  MS-DOS style path detected: C:\cygwin64\usr\local\hadoop/bin/hadoop
>  Preferred POSIX equivalent is: /usr/local/hadoop/bin/hadoop
>  CYGWIN environment variable option "nodosfilewarning" turns off this
> warning.
>  Consult the user's guide for more details about POSIX paths:
>    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
> 
> This is the piece of code in "cluster-reuters.sh" that uses that
> value:
> 
> if [ "$HADOOP_HOME" != "" ] && [ "$MAHOUT_LOCAL" == "" ] ; then
>  HADOOP="$HADOOP_HOME/bin/hadoop"
>  if [ ! -e $HADOOP ]; then
>    echo "Can't find hadoop in $HADOOP, exiting"
>    exit 1
>  fi
> fi
> 
> So I reset $HADOOP_HOME to "C:/cygwin64/usr/local/hadoop", ran it
> again, and got:
> 
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time.  This file is going
> away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at
> org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or
> directory.
> Warning: $HADOOP_HOME is deprecated.
> 
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not
> found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
> 
> A similar issue is described here (
> http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver).
> So it is odd that the hadoop binary is reported as not being in the path when
> it should be there. The class "org/apache/hadoop/util/ProgramDriver" is
> reported as missing, although it is present
> in "C:\cygwin64\usr\local\hadoop\src\core\org\apache\hadoop\util".
> 
> 
> Darius
> 
> 
> 2013/9/17 Darius Miliauskas <dariui.miliauskui@gmail.com>
> 
>> I guess there are some problems with the paths in Cygwin, since I get this
>> output:
>> 
>> DARIUS@DARIUS-PC ~
>> cd ..
>> 
>> DARIUS@DARIUS-PC ~
>> cd
>> 
>> DARIUS@DARIUS-PC ~
>> $ cd /usr/local/mahout/examples/bin
>> 
>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
>> $ ./build-reuters.sh
>> Please call cluster-reuters.sh directly next time.  This file is going
>> away.
>> Please select a number to choose the corresponding clustering algorithm
>> 1. kmeans clustering
>> 2. fuzzykmeans clustering
>> 3. dirichlet clustering
>> 4. lda clustering
>> 5. minhash clustering
>> Enter your choice : 1
>> ok. You chose 1 and we'll use kmeans Clustering
>> creating work directory at /tmp/mahout-work-DARIUS
>> Extracting Reuters
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> cygwin warning:
>>  MS-DOS style path detected: /usr/local/bin/C:\Program
>>  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>>  CYGWIN environment variable option "nodosfilewarning" turns off this
>> warning.
>>  Consult the user's guide for more details about POSIX paths:
>>    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> Converting to Sequence Files from Directory
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and
>> HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> 
>> How should I run the clustering then?
>> 
>> 
>> Thanks,
>> 
>> Darius
>> 
>> 
>> 2013/9/16 Michael Wechner <michael.wechner@wyona.com>
>> 
>>> Hi Darius
>>> 
>>> I think you need to try to understand why in your case certain classes
>>> are not being found.
>>> 
>>> I would suggest that you have a look at the reuters script and try to
>>> understand where exactly the problems
>>> occur and then go deeper in order to find out the root of the problem.
>>> 
>>> HTH
>>> 
>>> Michael
>>> 
>>> Am 16.09.13 17:10, schrieb Darius Miliauskas:
>>> 
>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.hadoop.util.ProgramDriver
>>>>>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>       at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>       at java.security.AccessController.doPrivileged(Native Method)
>>>>>>       at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>>       at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>> 
>>> 
>>> 
>> 
> 

