Subject: Re: Reuters Example in Windows&Cygwin
From: Pat Ferrel
Date: Thu, 19 Sep 2013 08:58:01 -0700
Message-Id: <9AE9774F-1211-43A6-8CC9-6131A08F7C67@occamsmachete.com>
To: user@mahout.apache.org

These look like hadoop errors, probably setup errors. Have you followed the
Windows hadoop setup procedure and tested it separately from Mahout to verify
it is working properly first? You may want to try the hadoop mailing list and
look for a cygwin expert.

Trying to run this stack on Windows will make your life a little more
difficult because cygwin is not quite unix. Can you create a Virtual machine
and install a linux version in it? If so at least the standard installs should
work out of the box. Sorry but Windows experts are getting harder to find on
the mailing lists--I'm certainly not one.

On Sep 19, 2013, at 5:55 AM, Darius Miliauskas wrote:

To add, I tried the described solution
"http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver".
The version of mahout is 0.8.
I tried it by adding the line below (check that the paths match your own
setup; $MAHOUT_HOME should be set as well, in my case it is
"C:\cygwin64\usr\local\mahout"):

CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar

at the end of the section in the file "mahout", so the part looks like this:

  # add release dependencies to CLASSPATH
  for f in $MAHOUT_HOME/lib/*.jar; do
    CLASSPATH=${CLASSPATH}:$f;
  done
else
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/math/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/integration/target/classes
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/examples/target/classes
  #CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/core/src/main/resources
  CLASSPATH=${CLASSPATH}:$MAHOUT_HOME/lib/hadoop/hadoop-core-1.1.2.jar
fi

However, I still get the same error.

Ciao,

Darius

2013/9/18 Darius Miliauskas

> Thanks, Michael. I looked deeper into "cluster-reuters.sh" and tried
> adjusting the paths in the system variables. I set $HADOOP_HOME
> to "C:\cygwin64\usr\local\hadoop", and I got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time. This file is going away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> cygwin warning:
> MS-DOS style path detected: C:\cygwin64\usr\local\hadoop/bin/hadoop
> Preferred POSIX equivalent is: /usr/local/hadoop/bin/hadoop
> CYGWIN environment variable option "nodosfilewarning" turns off this warning.
> Consult the user's guide for more details about POSIX paths:
> http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/ProgramDriver
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.ProgramDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>     ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
>
> Here is the piece of code in "cluster-reuters.sh" that uses that value:
>
> if [ "$HADOOP_HOME" != "" ] && [ "$MAHOUT_LOCAL" == "" ] ; then
>   HADOOP="$HADOOP_HOME/bin/hadoop"
>   if [ ! -e $HADOOP ]; then
>     echo "Can't find hadoop in $HADOOP, exiting"
>     exit 1
>   fi
> fi
>
> So, I reset my $HADOOP_HOME to "C:/cygwin64/usr/local/hadoop", ran it
> again, and got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./build-reuters.sh
> Please call cluster-reuters.sh directly next time. This file is going away.
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/ProgramDriver
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.ProgramDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>     ... 1 more
> Copying Reuters data to Hadoop
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-sgm: No such file or directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> rmr: cannot remove /tmp/mahout-work-DARIUS/reuters-out: No such file or directory.
> Warning: $HADOOP_HOME is deprecated.
>
> C:\cygwin64\usr\local\hadoop/bin/hadoop: line 350: C:\Program: command not found
> put: File /tmp/mahout-work-DARIUS/reuters-sgm does not exist.
>
> A similar issue is described here
> (http://stackoverflow.com/questions/13074368/java-lang-classnotfoundexception-org-apache-hadoop-util-programdriver).
> So, it is odd that the hadoop binary is reported as not being in the path
> when it should be there. The class "org/apache/hadoop/util/ProgramDriver" is
> reported as missing, but it is in
> "C:\cygwin64\usr\local\hadoop\src\core\org\apache\hadoop\util".
>
>
> Darius
>
>
> 2013/9/17 Darius Miliauskas
>
>> I guess there are some problems with the paths in Cygwin, since I get this
>> output:
>>
>> DARIUS@DARIUS-PC ~
>> cd ..
>>
>> DARIUS@DARIUS-PC ~
>> cd
>>
>> DARIUS@DARIUS-PC ~
>> $ cd /usr/local/mahout/examples/bin
>>
>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
>> $ ./build-reuters.sh
>> Please call cluster-reuters.sh directly next time. This file is going away.
>> Please select a number to choose the corresponding clustering algorithm
>> 1. kmeans clustering
>> 2. fuzzykmeans clustering
>> 3. dirichlet clustering
>> 4. lda clustering
>> 5. minhash clustering
>> Enter your choice : 1
>> ok. You chose 1 and we'll use kmeans Clustering
>> creating work directory at /tmp/mahout-work-DARIUS
>> Extracting Reuters
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> cygwin warning:
>> MS-DOS style path detected: /usr/local/bin/C:\Program
>> Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>> CYGWIN environment variable option "nodosfilewarning" turns off this warning.
>> Consult the user's guide for more details about POSIX paths:
>> http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> Converting to Sequence Files from Directory
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
>> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
>> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
>> Not a valid JAR: C:\usr\local\mahout\mahout-examples-0.8-job.jar
>>
>> How should I run the clustering then?
>>
>>
>> Thanks,
>>
>> Darius
>>
>>
>> 2013/9/16 Michael Wechner
>>
>>> Hi Darius
>>>
>>> I think you need to try to understand why in your case certain classes
>>> are not being found.
>>>
>>> I would suggest that you have a look at the reuters script and try to
>>> understand where exactly the problems occur and then go deeper in order
>>> to find out the root of the problem.
>>>
>>> HTH
>>>
>>> Michael
>>>
>>> On 16.09.13 17:10, Darius Miliauskas wrote:
>>>
>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.ProgramDriver
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>
>>>
>>>
>>
>
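
Two path problems recur in the output quoted in this thread: a Windows-style
$HADOOP_HOME, and "C:\Program: command not found", which most likely means a
path containing a space (for example one under "C:\Program Files") is being
expanded without quotes. The shell sketch below illustrates both points. It is
not taken from the Mahout or Hadoop scripts: to_posix_path and find_hadoop are
hypothetical helper names, and the /cygdrive prefix assumes Cygwin's default
drive mapping. On a real Cygwin install, "cygpath -u" is the usual tool for
this conversion and may return a shorter path such as /usr/local/hadoop.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn a Windows-style path into a POSIX-style one.
# Assumes Cygwin's default /cygdrive/<letter> mapping for drive letters.
to_posix_path() {
  local p="${1//\\//}"             # replace every backslash with a slash
  local re='^([A-Za-z]):(/.*)?$'   # drive letter, optionally followed by a path
  if [[ $p =~ $re ]]; then
    local drive
    drive=$(printf '%s' "${BASH_REMATCH[1]}" | tr 'A-Z' 'a-z')
    p="/cygdrive/${drive}${BASH_REMATCH[2]}"
  fi
  printf '%s\n' "$p"
}

# Hypothetical helper: same existence check as the cluster-reuters.sh snippet
# quoted above, but with every expansion quoted so spaces in the path are safe.
find_hadoop() {
  local bin
  bin="$(to_posix_path "$1")/bin/hadoop"
  if [ ! -e "$bin" ]; then
    echo "Can't find hadoop in $bin" >&2
    return 1
  fi
  printf '%s\n' "$bin"
}
```

A possible use, assuming $HADOOP_HOME is set, would be
HADOOP=$(find_hadoop "$HADOOP_HOME") followed by "$HADOOP" version, keeping the
double quotes around the variable wherever it is expanded.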