mahout-user mailing list archives

From Frank Scholten <fr...@frankscholten.nl>
Subject Re: Mahout 0.5 java.lang.IllegalStateException: No clusters found. Check your -c path.
Date Wed, 15 Feb 2012 19:42:40 GMT
You must either specify -k <number> to have kmeans randomly pick k
initial clusters from the input vectors or use -c to point to a
directory of initial clusters, generated by canopy for example.
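
A minimal sketch of the two invocations described above, reusing the paths and flags from the commands quoted later in this thread. The canopy thresholds (`-t1`, `-t2`), the `reuters-canopy` output path, and the `clusters-0` subdirectory name are assumptions for illustration and may differ between Mahout versions. The script only prints the commands; drop the `echo` to actually run them.

```shell
#!/bin/sh
# Paths taken from the commands quoted in this thread.
INPUT=examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
OUTPUT=examples/bin/work/reuters-kmeans

# Option 1: -k asks kmeans to pick k input vectors at random as the
# initial clusters; -c names the directory that holds them.
kmeans_random_seed() {
  echo ./bin/mahout kmeans -i "$INPUT" -c examples/bin/work/clusters \
    -o "$OUTPUT" -x 10 -k 20 -ow
}

# Option 2: generate initial clusters with canopy, then pass them via -c.
# ASSUMPTION: the output path, the t1/t2 values and the clusters-0
# directory name are hypothetical and depend on data and Mahout version.
CANOPY_OUT=examples/bin/work/reuters-canopy
kmeans_canopy_seed() {
  echo ./bin/mahout canopy -i "$INPUT" -o "$CANOPY_OUT" -t1 150 -t2 75 -ow
  echo ./bin/mahout kmeans -i "$INPUT" -c "$CANOPY_OUT/clusters-0" \
    -o "$OUTPUT" -x 10 -ow
}

kmeans_random_seed
kmeans_canopy_seed
```

Printing the commands first makes it easy to confirm which of the two argument combinations is actually being submitted to the cluster.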

2012/2/15 Qiang Xu <xxqonline@hotmail.com>:
>
> Note: this problem only happens on the Hadoop cluster. Mahout in standalone mode has no
such problem.
>
>> From: xxqonline@hotmail.com
>> To: user@mahout.apache.org
>> Subject: RE: Mahout 0.5 java.lang.IllegalStateException: No clusters found. Check
your -c path.
>> Date: Wed, 15 Feb 2012 12:22:26 +0800
>>
>>
>> I have seen this problem reported in the mail thread
>> http://lucene.472066.n3.nabble.com/jira-Created-MAHOUT-504-Kmeans-clustering-error-td1531052.html
>> and
>> https://issues.apache.org/jira/browse/MAHOUT-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#issue-tabs
>>
>> But my steps follow the official guide:
>> https://cwiki.apache.org/MAHOUT/k-means-clustering.html
>>
>> Could you point out what I should do to run this correctly?
>> I have tried:
>> ./bin/mahout kmeans -i
>>  examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/ -c
>>  examples/bin/work/clusters -o  examples/bin/work/reuters-kmeans -x 10
>>  -ow
>> ./bin/mahout kmeans -i
>>  examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/ -c
>>  examples/bin/work/clusters -o  examples/bin/work/reuters-kmeans -x 10
>>  -ow -cl
>> ./bin/mahout kmeans -i
>>  examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/ -c
>>  examples/bin/work/clusters -o  examples/bin/work/reuters-kmeans -x 10
>> -k 0 -ow
>> ./bin/mahout kmeans -i
>>  examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/ -c
>>  examples/bin/work/clusters -o  examples/bin/work/reuters-kmeans -x 10
>> -k 20 -ow
>> > Date: Tue, 14 Feb 2012 19:39:59 -0800
>> > Subject: Re: Mahout 0.5 java.lang.IllegalStateException: No clusters found.
Check your -c path.
>> > From: goksron@gmail.com
>> > To: user@mahout.apache.org
>> >
>> > See the other mail thread for the MAHOUT-504 JIRA. That jira is closed
>> > and fixed.
>> > The problem is that the program needs one of a few different
>> > combinations of arguments. It does not give you an error message
>> > describing the problem.
>> >
>> > On Tue, Feb 14, 2012 at 6:59 PM, Qiang Xu <xxqonline@hotmail.com> wrote:
>> > >
>> > > The new test is using command  ./bin/mahout kmeans -i  examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
-c  examples/bin/work/clusters -o  examples/bin/work/reuters-kmeans -x 10  -ow -cl
>> > > Still the same problem.
>> > >
>> > >> From: xxqonline@hotmail.com
>> > >> To: user@mahout.apache.org
>> > >> Subject: RE: Mahout 0.5 java.lang.IllegalStateException: No clusters
found. Check your -c path.
>> > >> Date: Wed, 15 Feb 2012 10:58:25 +0800
>> > >>
>> > >>
>> > >> I have checked the command-line help:
>> > >> --clustering (-cl)    If present, run clustering after the iterations have taken place
>> > >> I tried it and the behavior seems to be the same. Could you give me more clues?
>> > >> op_cluster/hadoop-0.20.2/
>> > >> HADOOP_CONF_DIR=/data/hadoop_cluster/hadoop-0.20.2/conf/
>> > >> 12/02/15 11:16:23 INFO common.AbstractJob: Command line arguments:
{--clustering=null, --clusters=examples/bin/work/clusters, --convergenceDelta=0.5, --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
--endPhase=2147483647, --input=examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/,
--maxIter=10, --method=mapreduce, --output=examples/bin/work/reuters-kmeans, --overwrite=null,
--startPhase=0, --tempDir=temp}
>> > >> 12/02/15 11:16:23 INFO common.HadoopUtil: Deleting examples/bin/work/reuters-kmeans
>> > >> 12/02/15 11:16:23 INFO kmeans.KMeansDriver: Input: examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors
Clusters In: examples/bin/work/clusters Out: examples/bin/work/reuters-kmeans Distance: org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
>> > >> 12/02/15 11:16:23 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations:
10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
>> > >> 12/02/15 11:16:23 INFO kmeans.KMeansDriver: K-Means Iteration 1
>> > >> 12/02/15 11:16:24 INFO input.FileInputFormat: Total input paths to
process : 1
>> > >> 12/02/15 11:16:24 INFO mapred.JobClient: Running job: job_201202131515_0126
>> > >> 12/02/15 11:16:25 INFO mapred.JobClient:  map 0% reduce 0%
>> > >> 12/02/15 11:16:38 INFO mapred.JobClient: Task Id : attempt_201202131515_0126_m_000000_0,
Status : FAILED
>> > >> java.lang.IllegalStateException: No clusters found. Check your -c path.
>> > >>         at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
>> > >>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>> > >>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>> > >>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> > >>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > >> > Subject: Re: Mahout 0.5 java.lang.IllegalStateException: No clusters
found. Check your -c path.
>> > >> > From: suneel_marthi@yahoo.com
>> > >> > Date: Tue, 14 Feb 2012 21:50:22 -0500
>> > >> > To: user@mahout.apache.org
>> > >> >
>> > >> > Did u specify the -cl option when executing kmeans?
>> > >> >
>> > >> > Sent from my iPhone
>> > >> >
>> > >> > On Feb 14, 2012, at 9:18 PM, Qiang Xu <xxqonline@hotmail.com>
wrote:
>> > >> >
>> > >> > >
>> > >> > > I think there is nothing wrong with the path, because /user/root/examples/bin/work/clusters
was generated by the kmeans example.
>> > >> > >
>> > >> > > All my steps are:
>> > >> > >
>> > >> > > ./bin/mahout org.apache.lucene.benchmark.utils.ExtractReuters
./examples/bin/work/reuters-sgm/ ./examples/bin/work/reuters-out/
>> > >> > >
>> > >> > > ./bin/mahout seqdirectory -i ./examples/bin/work/reuters-out/
-o ./examples/bin/work/reuters-out-seqdir -c UTF-8 -chunk 5 -ow
>> > >> > >
>> > >> > > ./bin/mahout seq2sparse -i ./examples/bin/work/reuters-out-seqdir/
-o ./examples/bin/work/reuters-out-seqdir-sparse
>> > >> > >
>> > >> > > ./bin/mahout kmeans -i
>> > >> > > ./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
-c
>> > >> > > ./examples/bin/work/clusters -o ./examples/bin/work/reuters-kmeans
-x 10
>> > >> > > -k 20 -ow
>> > >> > >
>> > >> > > ./bin/mahout clusterdump -s examples/bin/work/reuters-kmeans/clusters-10
>> > >> > > -d examples/bin/work/reuters-out-seqdir-sparse/dictionary.file-0
-dt
>> > >> > > sequencefile -b 100 -n 20
>> > >> > >
>> > >> > > I have also tested with absolute HDFS paths, as follows:
>> > >> > >
>> > >> > > [root@qxutest mahout-distribution-0.5]# hadoop fs -ls /user/root/examples/bin/work/
>> > >> > >
>> > >> > > Found 4 items
>> > >> > >
>> > >> > > drwxr-xr-x   - root supergroup          0 2012-02-14
20:55 /user/root/examples/bin/work/clusters
>> > >> > >
>> > >> > > drwxr-xr-x   - root supergroup          0 2012-02-14
20:56 /user/root/examples/bin/work/reuters-kmeans
>> > >> > >
>> > >> > > drwxr-xr-x   - root supergroup          0 2012-02-14
20:29 /user/root/examples/bin/work/reuters-out-seqdir
>> > >> > >
>> > >> > > drwxr-xr-x   - root supergroup          0 2012-02-14
20:32 /user/root/examples/bin/work/reuters-out-seqdir-sparse
>> > >> > >
>> > >> > > [root@qxutest mahout-distribution-0.5]# hadoop fs -ls /user/root/examples/bin/work/clusters
>> > >> > >
>> > >> > > Found 1 items
>> > >> > >
>> > >> > > -rw-r--r--   2 root supergroup        139 2012-02-14 20:55
/user/root/examples/bin/work/clusters/part-randomSeed
>> > >> > >
>> > >> > > [root@qxutest mahout-distribution-0.5]#
>> > >> > > ./bin/mahout kmeans -i
>> > >> > > /user/root/examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
-c
>> > >> > >  /user/root/examples/bin/work/clusters -o
>> > >> > > /user/root/examples/bin/work/reuters-kmeans -x 10  -ow
>> > >> > >
>> > >> > > Running on hadoop, using HADOOP_HOME=/data/hadoop_cluster/hadoop-0.20.2/
>> > >> > >
>> > >> > > HADOOP_CONF_DIR=/data/hadoop_cluster/hadoop-0.20.2/conf/
>> > >> > >
>> > >> > > 12/02/15 10:32:25 INFO common.AbstractJob: Command line arguments:
>> > >> > > {--clusters=/user/root/examples/bin/work/clusters,
>> > >> > > --convergenceDelta=0.5,
>> > >> > > --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
>> > >> > > --endPhase=2147483647,
>> > >> > > --input=/user/root/examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/,
>> > >> > > --maxIter=10, --method=mapreduce,
>> > >> > > --output=/user/root/examples/bin/work/reuters-kmeans, --overwrite=null,
>> > >> > > --startPhase=0, --tempDir=temp}
>> > >> > >
>> > >> > > 12/02/15 10:32:25 INFO common.HadoopUtil: Deleting /user/root/examples/bin/work/reuters-kmeans
>> > >> > >
>> > >> > > 12/02/15 10:32:25 INFO kmeans.KMeansDriver: Input:
>> > >> > > /user/root/examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors
>> > >> > > Clusters In: /user/root/examples/bin/work/clusters Out:
>> > >> > > /user/root/examples/bin/work/reuters-kmeans Distance:
>> > >> > > org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
>> > >> > >
>> > >> > > 12/02/15 10:32:25 INFO kmeans.KMeansDriver: convergence:
0.5 max
>> > >> > > Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable
>> > >> > > Input Vectors: {}
>> > >> > >
>> > >> > > 12/02/15 10:32:25 INFO kmeans.KMeansDriver: K-Means Iteration
1
>> > >> > >
>> > >> > > 12/02/15 10:32:26 INFO input.FileInputFormat: Total input
paths to process : 1
>> > >> > >
>> > >> > > 12/02/15 10:32:27 INFO mapred.JobClient: Running job: job_201202131515_0123
>> > >> > >
>> > >> > > 12/02/15 10:32:28 INFO mapred.JobClient:  map 0% reduce
0%
>> > >> > >
>> > >> > > 12/02/15 10:32:38 INFO mapred.JobClient: Task Id : attempt_201202131515_0123_m_000000_0,
Status : FAILED
>> > >> > >
>> > >> > > java.lang.IllegalStateException: No clusters found. Check
your -c path.
>> > >> > >
>> > >> > >        at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
>> > >> > >
>> > >> > >        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>> > >> > >
>> > >> > >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>> > >> > >
>> > >> > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> > >> > >
>> > >> > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > >> > >
>> > >> > >> From: xxqonline@hotmail.com
>> > >> > >> To: user@mahout.apache.org
>> > >> > >> Subject: RE: Mahout 0.5 java.lang.IllegalStateException:
No clusters found. Check your -c path.
>> > >> > >> Date: Tue, 14 Feb 2012 23:47:53 +0800
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >> I have checked both the 0.5 and 0.6 packages; both of them have this problem.
>> > >> > >> Could you give me a workaround or a temporary fix?
>> > >> > >>> From: xxqonline@hotmail.com
>> > >> > >>> To: user@mahout.apache.org
>> > >> > >>> Subject: Mahout 0.5 java.lang.IllegalStateException:
No clusters found. Check your -c path.
>> > >> > >>> Date: Tue, 14 Feb 2012 20:47:49 +0800
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>>
>> > >> > >>> Hello guys:
>> > >> > >>>
>> > >> > >>> I am using Mahout 0.5. I followed the guide at
>> > >> > >>> https://cwiki.apache.org/MAHOUT/k-means-clustering.html to run kmeans, but I got
>> > >> > >>> the following error:
>> > >> > >>>
>> > >> > >>> Mahout 0.5 java.lang.IllegalStateException: No clusters found. Check your -c path.
>> > >> > >>>
>> > >> > >>> It seems to have been fixed in 0.4:
>> > >> > >>> https://issues.apache.org/jira/browse/MAHOUT-504?focusedCommentId=13207675#comment-13207675
>> > >> > >>> But it is still present in Mahout 0.5. Could someone give me a workaround?
>> > >> > >>>
>> > >> > >>> Regards,
>> > >> > >>> skaterxu
>> > >> > >>>
>> > >> > >>> ./bin/mahout kmeans -i ./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/
>> > >> > >>> -c ./examples/bin/work/clusters -o ./examples/bin/work/reuters-kmeans -x 10  -ow
>> > >> > >>> Running on hadoop, using HADOOP_HOME=/data/hadoop_cluster/hadoop-0.20.2/
>> > >> > >>> HADOOP_CONF_DIR=/data/hadoop_cluster/hadoop-0.20.2/conf/
>> > >> > >>> 12/02/14 20:56:03 INFO common.AbstractJob: Command
line arguments: {--clusters=./examples/bin/work/clusters, --convergenceDelta=0.5, --distanceMeasure=org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure,
--endPhase=2147483647, --input=./examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors/,
--maxIter=10, --method=mapreduce, --output=./examples/bin/work/reuters-kmeans, --overwrite=null,
--startPhase=0, --tempDir=temp}
>> > >> > >>> 12/02/14 20:56:03 INFO kmeans.KMeansDriver: Input:
examples/bin/work/reuters-out-seqdir-sparse/tfidf-vectors Clusters In: examples/bin/work/clusters
Out: examples/bin/work/reuters-kmeans Distance: org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
>> > >> > >>> 12/02/14 20:56:03 INFO kmeans.KMeansDriver: convergence:
0.5 max Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors:
{}
>> > >> > >>> 12/02/14 20:56:03 INFO kmeans.KMeansDriver: K-Means
Iteration 1
>> > >> > >>> 12/02/14 20:56:05 INFO input.FileInputFormat: Total
input paths to process : 1
>> > >> > >>> 12/02/14 20:56:06 INFO mapred.JobClient: Running
job: job_201202131515_0122
>> > >> > >>> 12/02/14 20:56:07 INFO mapred.JobClient:  map 0%
reduce 0%
>> > >> > >>> 12/02/14 20:56:16 INFO mapred.JobClient: Task Id
: attempt_201202131515_0122_m_000000_0, Status : FAILED
>> > >> > >>> java.lang.IllegalStateException: No clusters found.
Check your -c path.
>> > >> > >>>        at org.apache.mahout.clustering.kmeans.KMeansMapper.setup(KMeansMapper.java:60)
>> > >> > >>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>> > >> > >>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>> > >> > >>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> > >> > >>>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > >> > >>> It is really weird, because the clusters directory is generated:
>> > >> > >>> [root@qxutest mahout-distribution-0.5]# hadoop fs
-ls /user/root/examples/bin/work/
>> > >> > >>> Found 4 items
>> > >> > >>> drwxr-xr-x   - root supergroup          0 2012-02-14
20:55 /user/root/examples/bin/work/clusters
>> > >> > >>> drwxr-xr-x   - root supergroup          0 2012-02-14
20:56 /user/root/examples/bin/work/reuters-kmeans
>> > >> > >>> drwxr-xr-x   - root supergroup          0 2012-02-14
20:29 /user/root/examples/bin/work/reuters-out-seqdir
>> > >> > >>> drwxr-xr-x   - root supergroup          0 2012-02-14
20:32 /user/root/examples/bin/work/reuters-out-seqdir-sparse
>> > >> > >>> [root@qxutest mahout-distribution-0.5]# hadoop fs
-ls /user/root/examples/bin/work/clusters
>> > >> > >>> Found 1 items
>> > >> > >>> -rw-r--r--   2 root supergroup        139 2012-02-14
20:55 /user/root/examples/bin/work/clusters/part-randomSeed
>> > >> > >>
>> > >> > >
>> > >>
>> > >
>> >
>> >
>> >
>> > --
>> > Lance Norskog
>> > goksron@gmail.com
>>
>
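
Since, as noted in the thread, kmeans does not report which argument combination is missing, a small pre-flight check can flag it before the job is submitted. The sketch below is only an illustration: `has_positive_k` and `check_kmeans_args` are hypothetical helper names, and treating `-k 0` the same as a missing `-k` is an assumption. It inspects the argument list only and does not look inside the `-c` directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the arguments contain
# "-k <positive integer>", fails otherwise.
has_positive_k() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "-k" ]; then
      case "$2" in
        ''|*[!0-9]*) return 1 ;;   # value missing or not a number
      esac
      [ "$2" -gt 0 ] && return 0
      return 1                     # ASSUMPTION: -k 0 counts as "no -k"
    fi
    shift
  done
  return 1
}

# Hypothetical helper: prints which of the two cases from the thread applies.
check_kmeans_args() {
  if has_positive_k "$@"; then
    echo "ok: -k given, kmeans will pick the initial clusters"
  else
    echo "warning: no positive -k; the -c directory must already hold initial clusters"
  fi
}

# Two of the argument lists tried in this thread.
check_kmeans_args -i tfidf-vectors/ -c clusters -o reuters-kmeans -x 10 -k 20 -ow
check_kmeans_args -i tfidf-vectors/ -c clusters -o reuters-kmeans -x 10 -k 0 -ow
```

When the warning is printed, the `-c` path needs to point at clusters produced beforehand, for example by canopy.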
