mahout-user mailing list archives

From Jeff Eastman <j...@windwardsolutions.com>
Subject Re: problems running mahout kmeans example
Date Fri, 15 May 2009 20:03:53 GMT
I also just ran the release mahout-examples-0.1.job and it ran fine too.
Are you running Hadoop 0.19.1 and JDK 1.6? I can't duplicate your problem.

Jeff


Jeff Eastman wrote:
> Hi Glenn,
>
> I'm not sure what is going on with your runs. I suggest building from 
> trunk and running that. I just ran kmeans from trunk on a small EC2 
> cluster and it ran just fine.
>
> http://cwiki.apache.org/MAHOUT/buildingmahout.html
>
>
> ./bin/hadoop jar ~/mahout-examples-0.2-SNAPSHOT.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>
> Jeff
>
>
>
>
> Wasson, Glenn S. wrote:
>> Jeff,
>>
>> Yes, it looks like they ran successfully, because I have part-00000 and
>> part-00001 in output/data. These files have data in them, so I'm
>> assuming they are the correct output.
>> It seems as though 3 jobs complete successfully, with the mappers:
>> org.apache.mahout.clustering.syntheticcontrol.canopy.InputMapper
>> org.apache.mahout.clustering.canopy.CanopyMapper
>> org.apache.mahout.clustering.canopy.ClusterMapper
>>
>> The last of these seems to be producing successful output in
>> output/clusters (the part-00000 file there is roughly 2.7M).
>>
>> The jobs that fail are the kmeans ones with the error code below.
>>
>>
>> Glenn
>>
>>
>> Glenn Wasson, Ph.D. | SAIC
>> Computer Scientist | IST Group
>> phone: 434.964.3070 | fax 434.974.1172
>> email: glenn.s.wasson@saic.com
>> Please consider the environment before printing this email.
>>
>> -----Original Message-----
>> From: mahout-user-return-548-GLENN.S.WASSON=saic.com@lucene.apache.org
>> [mailto:mahout-user-return-548-GLENN.S.WASSON=saic.com@lucene.apache.org
>> ] On Behalf Of Jeff Eastman
>> Sent: Thursday, May 14, 2009 6:13 PM
>> To: mahout-user@lucene.apache.org
>> Subject: Re: problems running mahout kmeans example
>>
>> Hi Glenn,
>>
>> Did you verify that the canopy job really ran and produced data in 
>> output/data? I can't tell much from the log fragment.
>>
>> Jeff
>>
>>
>> Wasson, Glenn S. wrote:
>>> Hello,
>>>
>>> I'm trying, so far unsuccessfully, to run the synthetic control kmeans
>>> clustering example. My command line (simplified to use the default
>>> parameters) is:
>>>
>>> hadoop jar /home/gwasson/mahout-0.1/examples/target/mahout-examples-0.1.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>
>>> It looks like the canopy input map and canopy jobs run, but the kmeans
>>> part fails with:
>>>
>>> 09/05/14 16:41:42 INFO mapred.JobClient: Task Id : attempt_200905141458_0028_m_000001_1, Status : FAILED
>>> java.lang.RuntimeException: Error in configuring object
>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>> Caused by: java.lang.reflect.InvocationTargetException
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>         ... 5 more
>>> Caused by: java.lang.RuntimeException: Error in configuring object
>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>         ... 10 more
>>> Caused by: java.lang.reflect.InvocationTargetException
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>         ... 13 more
>>> Caused by: java.lang.NullPointerException
>>>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:775)
>>>         at org.apache.mahout.clustering.kmeans.KMeansUtil.configureWithClusterInfo(KMeansUtil.java:63)
>>>         at org.apache.mahout.clustering.kmeans.KMeansMapper.configure(KMeansMapper.java:61)
>>>         ... 18 more
>>>
>>> It seems as though it is not able to stat the files it expects to find.
>>> Does anyone know if I should expect this command line to work as
>>> formulated? I have placed the suggested data file in a testdata
>>> directory (and the canopy jobs are finding it, so I know that part is
>>> OK). I'm guessing that somehow the output from the canopy job is not
>>> being read in correctly.
>>>
>>> I'd appreciate any help anyone can give,
>>>
>>> Glenn
>
>
>
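
Note on reading the quoted trace: the outer RuntimeException wraps several layers of reflection exceptions, and the informative entry is the last "Caused by" (the NullPointerException raised under KMeansMapper.configure). As a minimal sketch, assuming only the JDK, the same unwrapping can be done in code by following getCause() to the end of the chain. The class name RootCause and the synthetic exception chain in main are illustrative assumptions and are not part of Hadoop or Mahout.

```java
import java.lang.reflect.InvocationTargetException;

/** Illustrative helper (hypothetical name): finds the innermost cause of a Throwable. */
final class RootCause {

    /** Follows getCause() until there is no further cause and returns that Throwable. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop when there is no cause, or when a Throwable reports itself as its own cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Synthetic chain shaped like the trace in the thread; no Hadoop classes are involved.
        Throwable chain = new RuntimeException(
                "Error in configuring object",
                new InvocationTargetException(new NullPointerException("listing was null")));
        System.out.println("Root cause: " + of(chain));
    }
}
```

Logging the root cause alongside the outer exception tends to make failures like this one easier to spot in task logs, where the wrapper layers otherwise dominate the output.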


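
Note on the failure mode: the trace ends in a NullPointerException inside FileSystem.listStatus, called from KMeansUtil.configureWithClusterInfo. One plausible reading, consistent with Glenn's guess that the job cannot stat the files it expects, is that a directory listing came back null because the path was missing. The thread does not confirm this. The sketch below shows the same pattern with the JDK's java.io.File, whose listFiles() is documented to return null for a missing directory. It is a local-filesystem analogy under that assumption, not Hadoop's or Mahout's code; the class name SafeListing and the default path are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Local-filesystem analogy only (hypothetical class); not Hadoop's FileSystem API. */
final class SafeListing {

    /** Returns the names of the entries in dir, or an empty list if dir cannot be listed. */
    static List<String> listNames(File dir) {
        // File.listFiles() returns null, not an empty array, when dir does not
        // exist or is not a directory. Iterating without this check would throw
        // a NullPointerException.
        File[] entries = dir.listFiles();
        if (entries == null) {
            return Collections.emptyList();
        }
        List<String> names = new ArrayList<>();
        for (File entry : entries) {
            names.add(entry.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Default path is an assumption that mirrors the output/clusters directory in the thread.
        File clusters = new File(args.length > 0 ? args[0] : "output/clusters");
        List<String> names = listNames(clusters);
        if (names.isEmpty()) {
            System.out.println("No cluster files found under " + clusters);
        } else {
            System.out.println("Found: " + names);
        }
    }
}
```

Checking for a null listing up front turns an opaque NullPointerException into a clear "no input found" message, which is usually the more useful signal when an earlier job wrote its output somewhere unexpected.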