mahout-commits mailing list archives

Subject svn commit: r887498 - in /websites/staging/mahout/trunk/content: ./ users/clustering/mean-shift-clustering.html
Date Thu, 21 Nov 2013 11:17:31 GMT
Author: buildbot
Date: Thu Nov 21 11:17:31 2013
New Revision: 887498

Staging update by buildbot for mahout

    websites/staging/mahout/trunk/content/   (props changed)

Propchange: websites/staging/mahout/trunk/content/
--- cms:source-revision (original)
+++ cms:source-revision Thu Nov 21 11:17:31 2013
@@ -1 +1 @@

Modified: websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html
--- websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html (original)
+++ websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html Thu Nov 21 11:17:31 2013
@@ -381,13 +381,12 @@
   <div id="content-wrap" class="clearfix">
    <div id="main">
-    <p>"Mean Shift: A Robust Approach to Feature Space Analysis"
+    <h1 id="means-shift-clustering">Mean Shift clustering</h1>
+<p><a href="">"Mean Shift: A Robust Approach to Feature Space Analysis"</a>
 introduces the genealogy of the mean shift clustering procedure, which dates
 back to work in pattern recognition in 1975. The paper contains a detailed
 derivation and several examples of the use of mean shift for image smoothing
-and segmentation. "Mean Shift Clustering"
+and segmentation. <a href="">"Mean Shift Clustering"</a>
 presents an overview of the algorithm with a summary of the derivation. An
 attractive feature of mean shift clustering is that it does not require
 a-priori knowledge of the number of clusters (as required in k-means) and
@@ -443,19 +442,18 @@ </p>
 <div class="codehilite"><pre><span class="n">bin</span><span class="o">/</span><span
class="n">mahout</span> <span class="n">meanshift</span> <span class="o">\</span>
     <span class="o">-</span><span class="nb">i</span> <span class="o">&lt;</span><span
class="n">input</span> <span class="n">vectors</span> <span class="n">directory</span><span
class="o">&gt;</span> <span class="o">\</span>
     <span class="o">-</span><span class="n">o</span> <span class="o">&lt;</span><span
class="n">output</span> <span class="n">working</span> <span class="n">directory</span><span
class="o">&gt;</span> <span class="o">\</span>
-    <span class="o">-</span><span class="n">inputIsCanopies</span>
<span class="o">&lt;</span><span class="n">input</span> <span
class="n">directory</span> <span class="n">contains</span> <span class="n">mean</span>
<span class="n">shift</span> <span class="n">canopies</span> <span
+    <span class="o">-</span><span class="n">inputIsCanopies</span>
<span class="o">&lt;</span><span class="n">input</span> <span
class="n">directory</span> <span class="n">contains</span> <span class="n">mean</span>
<span class="n">shift</span> <span class="n">canopies</span> <span
class="n">not</span> <span class="n">vectors</span><span class="o">&gt;</span>
<span class="o">\</span>
+    <span class="o">-</span><span class="n">dm</span> <span class="o">&lt;</span><span
class="n">DistanceMeasure</span><span class="o">&gt;</span> <span
+    <span class="o">-</span><span class="n">t1</span> <span class="o">&lt;</span><span
class="n">the</span> <span class="n">T1</span> <span class="n">threshold</span><span
class="o">&gt;</span> <span class="o">\</span>
+    <span class="o">-</span><span class="n">t2</span> <span class="o">&lt;</span><span
class="n">the</span> <span class="n">T2</span> <span class="n">threshold</span><span
class="o">&gt;</span> <span class="o">\</span>
+    <span class="o">-</span><span class="n">x</span> <span class="o">&lt;</span><span
class="n">maximum</span> <span class="n">number</span> <span class="n">of</span>
<span class="n">iterations</span><span class="o">&gt;</span> <span
+    <span class="o">-</span><span class="n">cd</span> <span class="o">&lt;</span><span
class="n">optional</span> <span class="n">convergence</span> <span
class="n">delta</span><span class="p">.</span> <span class="n">Default</span>
<span class="n">is</span> 0<span class="p">.</span>5<span class="o">&gt;</span>
<span class="o">\</span>
+    <span class="o">-</span><span class="n">ow</span> <span class="o">&lt;</span><span
class="n">overwrite</span> <span class="n">output</span> <span class="n">directory</span>
<span class="k">if</span> <span class="n">present</span><span class="o">&gt;</span>
+    <span class="o">-</span><span class="n">cl</span> <span class="o">&lt;</span><span
class="n">run</span> <span class="n">input</span> <span class="n">vector</span>
<span class="n">clustering</span> <span class="n">after</span> <span
class="n">computing</span> <span class="n">Clusters</span><span class="o">&gt;</span>
+    <span class="o">-</span><span class="n">xm</span> <span class="o">&lt;</span><span
class="n">execution</span> <span class="n">method</span><span class="p">:</span>
<span class="n">sequential</span> <span class="n">or</span> <span
class="n">mapreduce</span><span class="o">&gt;</span>
-<p>vectors&gt; \
-        -dm <DistanceMeasure> \
-        -t1 <the T1 threshold> \
-        -t2 <the T2 threshold> \
-        -x <maximum number of iterations> \
-        -cd <optional convergence delta. Default is 0.5> \
-        -ow <overwrite output directory if present>
-        -cl <run input vector clustering after computing Clusters>
-        -xm <execution method: sequential or mapreduce></p>
 <p>Invocation using Java involves supplying the following arguments:</p>
 <li>input: a file path string to a directory containing the input data set a
@@ -511,19 +509,19 @@ deviation. See the README file in the <a
 <p>In the first image, the points are plotted and the 3-sigma boundaries of
 their generator are superimposed. </p>
+<p><img alt="clustering" src="../../SampleData.png" /></p>
 <p>In the second image, the resulting clusters (k=3) are shown superimposed
 upon the sample data. In this image, each cluster renders in a different
 color and the T1 and T2 radii are superimposed upon the final cluster
 centers determined by the algorithm. Mean Shift does an excellent job of
 clustering this data, though by its design the cluster membership is unique
 and the clusters do not overlap. </p>
+<p><img alt="clustering" src="../../MeanShift.png" /></p>
 <p>The third image shows the results of running Mean Shift on a different data
 set (see <a href="dirichlet-process-clustering.html">Dirichlet Process Clustering</a>
  for details) which is generated using asymmetrical standard deviations.
 Mean Shift does an excellent job of clustering this data set too.</p>
+<p><img alt="clustering" src="../../2dMeanShift.png" /></p>
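
Note on the page text above: the modified page describes mean shift as needing no a-priori cluster count and lists T1, T2, convergence delta and maximum-iterations options. The following is a minimal Python sketch of a flat-kernel mean shift under stated assumptions, not Mahout's implementation. The function name `mean_shift` is hypothetical, and reading T1 as the window radius and T2 as the merge radius is an assumption made for illustration; the parameter names only mirror the -t1, -t2, -cd and -x options shown in the diff.

```python
import math


def mean_shift(points, t1, t2, convergence_delta=0.5, max_iterations=10):
    """Flat-kernel mean shift over a list of equal-length numeric tuples.

    Assumed roles (illustrative, not taken from Mahout's source):
      t1 -- window radius: points within t1 of a center contribute to its mean
      t2 -- merge radius: centers that end within t2 of each other are merged
    """
    # Every input point starts as its own candidate center.
    centers = [tuple(p) for p in points]

    for _ in range(max_iterations):
        shifted = []
        largest_move = 0.0
        for c in centers:
            window = [p for p in points if math.dist(p, c) <= t1]
            if not window:
                # Defensive guard: keep the center if nothing falls in its window.
                shifted.append(c)
                continue
            # Shift the center to the mean of the points in its window.
            mean = tuple(sum(axis) / len(window) for axis in zip(*window))
            largest_move = max(largest_move, math.dist(c, mean))
            shifted.append(mean)
        centers = shifted
        # Stop once no center moved by at least the convergence delta.
        if largest_move < convergence_delta:
            break

    # Collapse centers that converged to (nearly) the same location.
    merged = []
    for c in centers:
        if all(math.dist(c, m) > t2 for m in merged):
            merged.append(c)
    return merged


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
    print(mean_shift(sample, t1=3.0, t2=1.0, convergence_delta=0.01))
```

The number of clusters is never passed in: it falls out of how many distinct centers survive the merge step, which is the property the page text highlights in contrast to k-means.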
