From conflue...@apache.org
Subject [CONF] Apache Lucene Mahout: SyntheticControlData (page edited)
Date Thu, 11 Jun 2009 09:52:00 GMT
SyntheticControlData (MAHOUT) edited by Robert Burrell Donkin
      Page: http://cwiki.apache.org/confluence/display/MAHOUT/SyntheticControlData
   Changes: http://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=103663&originalVersion=6&revisedVersion=7

Comment:
---------------------------------------------------------------------

Remove unwanted line breaks

Change summary:
---------------------------------------------------------------------

Remove unwanted line breaks

Content:
---------------------------------------------------------------------

h1. Introduction

This quick start page shows how to run the Synthetic Control Data clustering example. The data is described [here | http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data.html].


h1. Steps

* Download the data at http://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series.

* In $MAHOUT_HOME/, build the Job file
** The same job is used for all examples so this only needs to be done once
** mvn install
** The job will be generated in $MAHOUT_HOME/examples/target/ and its name will contain the Mahout version number. For example, when using the Mahout 0.1 release, the job will be mahout-examples-0.1.job
* (Optional){footnote}This step should be skipped when using standalone Hadoop{footnote} Start up Hadoop: $HADOOP_HOME/bin/start-all.sh
* Put the data: $HADOOP_HOME/bin/hadoop fs -put <PATH TO DATA> testdata
* Run the Job: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-<MAHOUT VERSION>.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job {footnote}Substitute in whichever Clustering Job you want here: KMeans, Canopy, etc. See subdirectories of $MAHOUT_HOME/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/.{footnote}
** For [kmeans | k-Means]: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-<MAHOUT VERSION>.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
** For [canopy | Canopy Clustering]: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-<MAHOUT VERSION>.job org.apache.mahout.clustering.syntheticcontrol.canopy.Job
** For [dirichlet | Dirichlet Process Clustering]: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-<MAHOUT VERSION>.job org.apache.mahout.clustering.syntheticcontrol.dirichlet.Job
** For [meanshift | Mean Shift]: $HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-<MAHOUT VERSION>.job org.apache.mahout.clustering.syntheticcontrol.meanshift.Job
* Get the data out of HDFS and have a look.
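
The put and run commands in the steps above differ only in the Mahout version and the clustering package name, so they can be composed by a small helper. The following is a minimal dry-run sketch, not part of Mahout: it only prints the commands and never calls Hadoop. The function names (job_file, job_class, print_commands) and the default values for MAHOUT_HOME, HADOOP_HOME and MAHOUT_VERSION are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from the steps above instead of running them.
# Assumptions (not from the page): the default values and the function names.

MAHOUT_HOME="${MAHOUT_HOME:-/opt/mahout}"
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
MAHOUT_VERSION="${MAHOUT_VERSION:-0.1}"

# Path of the job file that `mvn install` generates.
job_file() {
  printf '%s/examples/target/mahout-examples-%s.job\n' "$MAHOUT_HOME" "$MAHOUT_VERSION"
}

# Fully qualified Job class for one of: kmeans, canopy, dirichlet, meanshift.
job_class() {
  case "$1" in
    kmeans|canopy|dirichlet|meanshift)
      printf 'org.apache.mahout.clustering.syntheticcontrol.%s.Job\n' "$1" ;;
    *)
      echo "unknown algorithm: $1" >&2
      return 1 ;;
  esac
}

# Print the "put" and "run" commands for an algorithm and a local data file.
print_commands() {
  algo="$1"
  data="$2"
  class="$(job_class "$algo")" || return 1
  echo "$HADOOP_HOME/bin/hadoop fs -put $data testdata"
  echo "$HADOOP_HOME/bin/hadoop jar $(job_file) $class"
}
```

For example, after sourcing the file, print_commands canopy <PATH TO DATA> prints the two commands for Canopy Clustering. Review the output before running it by hand. An algorithm name outside the four listed in the steps makes the helper return a non-zero status instead of printing a command.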

{display-footnotes}

---------------------------------------------------------------------
CONFLUENCE INFORMATION
This message is automatically generated by Confluence

Unsubscribe or edit your notifications preferences
   http://cwiki.apache.org/confluence/users/viewnotifications.action

If you think it was sent incorrectly contact one of the administrators
   http://cwiki.apache.org/confluence/administrators.action

If you want more information on Confluence, or have a bug to report see
   http://www.atlassian.com/software/confluence


