mahout-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Writing java program for performing kmeans clustering on reuters dataset instead of ./mahout <seqdirectory | seq2sparse | kmeans| clusterdump > ,Steps to Follow
Date Wed, 04 Jan 2012 22:18:54 GMT
Here is the problem: the Eclipse Maven plugin does not support all
Maven pom.xml directives. It reports errors because it does not know
about a few of them. You can ignore these errors.

You can run non-map/reduce programs in Eclipse, and you can run unit
tests. Running map/reduce jobs out of Eclipse is possible but iffy.
Here are your easy techniques:

1) You can do a Maven build and install (the command is quoted below)
and run the bin/mahout program.
2) You can run 'bin/mahout -core' instead of bin/mahout. This causes
Mahout to load its classes from the target/ directories of the different
projects. Eclipse builds into these directories.
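
The two techniques above can be sketched as a small shell helper. This is
only a sketch under assumptions: MAHOUT_SRC (the top-level source directory)
and DRY_RUN are hypothetical names introduced for this example, and the Maven
command is a variant of the build command quoted below.

```shell
#!/bin/sh
# Sketch only: MAHOUT_SRC and DRY_RUN are hypothetical names introduced for
# this example. MAHOUT_SRC is assumed to be the top-level source directory
# (the one containing pom.xml and bin/mahout).
MAHOUT_SRC="${MAHOUT_SRC:-$HOME/workspace/trunk}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Technique 1: build and install with Maven, skipping the unit tests,
# then start programs through the regular bin/mahout launcher.
build_install() {
  ( cd "$MAHOUT_SRC" && run mvn clean install -DskipTests )
}

mahout_installed() {
  ( cd "$MAHOUT_SRC" && run bin/mahout "$@" )
}

# Technique 2: start programs with -core, so that Mahout loads from the
# target/ directories that Eclipse builds into.
mahout_core() {
  ( cd "$MAHOUT_SRC" && run bin/mahout -core "$@" )
}
```

After sourcing this file, calling mahout_core with a program name such as
kmeans, followed by that program's usual arguments, would run it against the
freshly compiled classes. With DRY_RUN=1 set, the commands are only printed.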

On Wed, Jan 4, 2012 at 1:07 AM, rahul raghavendhra
<rahulraghavendhra1@gmail.com> wrote:
> On Wed, Jan 4, 2012 at 2:24 PM, Paritosh Ranjan <pranjan@xebia.com> wrote:
>
>> Do a "mvn clean install -Dmaven.test.skip" on parent pom/directory..
>
>
> I unpacked the Mahout source.zip and tried mvn eclipse:eclipse. I had
> already checked out the code from svn and built it using mvn, and I have
> tried kmeans; it's working well.
>
> Shall I move that trunk to /home/username/workspace, run mvn
> eclipse:eclipse, and load that trunk in Eclipse?
>
> But that trunk has only:
> core          doap_Mahout.rdf  integration  math     README.txt  target
> buildtools  distribution  examples         LICENSE.txt  NOTICE.txt
> pom.xml    src
>
> I don't have taste-web in that trunk.
>
> [ OR ]
>
> Build the source again using mvn install -DskipTests, move it to the
> workspace, run mvn eclipse:eclipse, and load it into Eclipse?
>
> Please guide me: what do I have to do now?
>
> Please help. Thanks in advance.
>
>
>> On 04-01-2012 14:21, rahul raghavendhra wrote:
>>
>>> On Tue, Jan 3, 2012 at 4:05 PM, praveenesh kumar <praveenesh@gmail.com> wrote:
>>>
>>>> Have you tried this link?
>>>>
>>>> http://shuyo.wordpress.com/2011/02/14/mahout-development-environment-with-maven-and-eclipse-2/
>>>>
>>>> It explains how to import the Mahout in Action examples into Eclipse.
>>>> Just add the Hadoop and Mahout dependencies to pom.xml. There is a small
>>>> Mahout in Action example that runs k-means clustering; you can use that.
>>>>
>>> Thanks for your reply. I tried this to set up Mahout in Eclipse. I got
>>> errors in mahout-core, mahout-buildtools, mahout-examples, mahout-math,
>>> etc. I got errors in all of them when importing them from the file system
>>> (I have done mvn eclipse:eclipse and added maven-repo too). Please help;
>>> why do these errors occur?
>>>
>>> Thanks in advance. Please help.
>>>
>>> ./rahul
>>>
>>>
>>> On Tue, Jan 3, 2012 at 4:01 PM, Paritosh Ranjan <pranjan@xebia.com> wrote:
>>>
>>>> I think mahout-core (and its internal dependencies) can do most of what
>>>> you need.
>>>>
>>>> You will have to create your vectors yourself and write them to HDFS.
>>>>
>>>> Then use KMeansDriver's run method to do the clustering.
>>>>
>>>> Then use ClusterOutputPostProcessor to separate the vectors belonging to
>>>> different clusters into cluster-specific directories.
>>>>
>>>> Then write some code to read the vectors from the cluster-specific
>>>> directories.
>>>>
>>>> PS: Reading and writing from HDFS is simple.
>>>>
>>>> On 03-01-2012 15:57, rahul raghavendhra wrote:
>>>>
>>>>> I am new to Mahout. I have checked out the trunk from svn and installed
>>>>> it using mvn. Now I wish to write a Java program (instead of the shell
>>>>> script build-reuters.sh/cluster-reuters.sh) that performs k-means
>>>>> clustering by calling the methods of, or creating instances of (if
>>>>> possible), the classes that convert the dataset into sequence files,
>>>>> then into sparse vectors, then apply kmeans, and then run cluster dump
>>>>> to display the k-means result.
>>>>>
>>>>> That is, I want to perform k-means clustering on the Reuters dataset
>>>>> without using ./mahout <seqdirectory | seq2sparse | kmeans | clusterdump>.
>>>>>
>>>>> Which jars need to be imported?
>>>>>
>>>>> Can that program be developed in Eclipse and run there?
>>>>>
>>>>> Kindly help: what are the steps to follow?
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> ./rahul



-- 
Lance Norskog
goksron@gmail.com
