mahout-user mailing list archives

From Gokhan Capan <gkhn...@gmail.com>
Subject Re: Reuters Example in Windows&Cygwin
Date Mon, 16 Sep 2013 13:39:30 GMT
I believe you can install it separately, without having to reinstall Cygwin.
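
A minimal sketch of checking for curl from the Cygwin shell before re-running cluster-reuters.sh. The have_cmd helper is a hypothetical name used only for illustration, and setup-x86_64.exe is an assumption about how the downloaded Cygwin installer is named; -q (unattended mode) and -P (package list) are options of that installer.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd curl; then
    printf 'curl found: %s\n' "$(command -v curl)"
else
    # Assumption: the 64-bit installer was saved as setup-x86_64.exe.
    printf '%s\n' 'curl missing; from a Windows prompt try: setup-x86_64.exe -q -P curl'
fi
```

If the check still fails after installing, opening a new Cygwin terminal is worth trying so that PATH is re-read.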


On Sep 16, 2013, at 15:30, Darius Miliauskas
<dariui.miliauskui@gmail.com> wrote:

> Thanks, Gokhan. I had to install "curl" by running the Cygwin installer
> again (un-skipping "curl", which is skipped by default).
>
> 1.
> I got:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./cluster-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> Converting to Sequence Files from Directory
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/C:/cygwin64/usr/local/mahout/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/util/ProgramDriver
>        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:105)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.util.ProgramDriver
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>        ... 1 more
>
> 2. Then I added the Hadoop path using the Windows GUI:
> "C:\cygwin64\usr\local\hadoop\bin", and got the following output:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ ./cluster-reuters.sh
> Please select a number to choose the corresponding clustering algorithm
> 1. kmeans clustering
> 2. fuzzykmeans clustering
> 3. dirichlet clustering
> 4. lda clustering
> 5. minhash clustering
> Enter your choice : 1
> ok. You chose 1 and we'll use kmeans Clustering
> creating work directory at /tmp/mahout-work-DARIUS
> Extracting Reuters
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> cygwin warning:
>  MS-DOS style path detected: /usr/local/bin/C:\Program
>  Preferred POSIX equivalent is: /usr/local/bin/C:/Program
>  CYGWIN environment variable option "nodosfilewarning" turns off this
> warning.
>  Consult the user's guide for more details about POSIX paths:
>    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program
> Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or
> directory
> Converting to Sequence Files from Directory
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program
> Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or
> directory
> Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /usr/local/mahout/mahout-examples-0.8-job.jar
> /usr/local/hadoop/bin/hadoop: line 350: C:\Program: command not found
> /usr/local/hadoop/bin/hadoop: line 434: C:\Program Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: No such file or directory
> /usr/local/hadoop/bin/hadoop: line 434: exec: C:\Program
> Files\Java\jdk1.7.0_05;
> C:\Program Files\Java\jdk1.7.0_05\bin/bin/java: cannot execute: No such file or
> directory
>
> The Reuters files were in fact downloaded, as you can see:
>
> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
> $ pwd
> /usr/local/mahout/examples/bin
>
> DARIUS@DARIUS-PC ~
> $ cd /tmp
>
> DARIUS@DARIUS-PC /tmp
> $ ls
> hsperfdata_DARIUS  mahout-work-DARIUS
>
> DARIUS@DARIUS-PC /tmp
> $ cd mahout-work-DARIUS/
>
> DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> $ ls
> reuters21578.tar.gz  reuters-sgm
>
> DARIUS@DARIUS-PC /tmp/mahout-work-DARIUS
> $ ls reuters-sgm/
> all-exchanges-strings.lc.txt  all-topics-strings.lc.txt
> README.txt     reut2-003.sgm  reut2-007.sgm  reut2-011.sgm  reut2-015.sgm
> reut2-019.sgm
> all-orgs-strings.lc.txt       cat-descriptions_120396.txt
> reut2-000.sgm  reut2-004.sgm  reut2-008.sgm  reut2-012.sgm  reut2-016.sgm
> reut2-020.sgm
> all-people-strings.lc.txt     feldman-cia-worldfactbook-data.txt
> reut2-001.sgm  reut2-005.sgm  reut2-009.sgm  reut2-013.sgm  reut2-017.sgm
> reut2-021.sgm
> all-places-strings.lc.txt     lewis.dtd
> reut2-002.sgm  reut2-006.sgm  reut2-010.sgm  reut2-014.sgm  reut2-018.sgm
>
> Anyway, I do not get any clustering. So, where is the problem?
>
>
> Best,
>
> Darius
>
>
> 2013/9/13 Gokhan Capan <gkhncpn@gmail.com>
>
>> You need to have 'curl' installed, as the error message says.
>>
>> Gokhan
>>
>>
>> On Fri, Sep 13, 2013 at 2:37 PM, Darius Miliauskas <
>> dariui.miliauskui@gmail.com> wrote:
>>
>>> Dear All,
>>>
>>> I tried to run the Reuters example on my Windows machine (Windows 7), using
>>> Cygwin, but got the following error:
>>>
>>> DARIUS@DARIUS-PC /usr/local/mahout/examples/bin
>>> $ ./cluster-reuters.sh
>>> Please select a number to choose the corresponding clustering algorithm
>>> 1. kmeans clustering
>>> 2. fuzzykmeans clustering
>>> 3. dirichlet clustering
>>> 4. lda clustering
>>> 5. minhash clustering
>>> Enter your choice : 2
>>> ok. You chose 2 and we'll use fuzzykmeans Clustering
>>> creating work directory at /tmp/mahout-work-DARIUS
>>> Downloading Reuters-21578
>>> ./cluster-reuters.sh: line 80: curl: command not found
>>> Failed to download reuters
>>>
>>> How can I solve this problem?
>>>
>>>
>>> Best,
>>>
>>> Darius
>>

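In the second run quoted above, the hadoop script fails on a path of the form "C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin" followed by /bin/java. That suggests JAVA_HOME holds two semicolon-separated Windows entries rather than a single JDK directory. Assuming that reading is right, a minimal sketch of a possible workaround in the Cygwin shell could look like the following. first_entry is a hypothetical helper name, and cygpath is Cygwin's path-conversion utility.

```shell
#!/bin/sh
# first_entry is a hypothetical helper: it prints the text before the first ';'
# with any trailing spaces removed.
first_entry() {
    entry=${1%%;*}
    while [ "${entry% }" != "$entry" ]; do
        entry=${entry% }
    done
    printf '%s\n' "$entry"
}

# Value as it appears in the quoted error output (assumed to be JAVA_HOME).
raw='C:\Program Files\Java\jdk1.7.0_05; C:\Program Files\Java\jdk1.7.0_05\bin'
jdk_dir=$(first_entry "$raw")

if command -v cygpath >/dev/null 2>&1; then
    # Under Cygwin, cygpath -u converts a Windows path to its POSIX form.
    JAVA_HOME=$(cygpath -u "$jdk_dir")
else
    # Outside Cygwin the value is left unconverted.
    JAVA_HOME=$jdk_dir
fi
export JAVA_HOME
printf 'JAVA_HOME=%s\n' "$JAVA_HOME"
```

The "C:\Program: command not found" message at line 350 hints that the script expands the path unquoted, so the space in "Program Files" may still cause trouble even after this change. In that case, a JDK location without spaces may be needed.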