Author: buildbot
Date: Thu Nov 21 11:17:31 2013
New Revision: 887498
Log:
Staging update by buildbot for mahout
Modified:
websites/staging/mahout/trunk/content/ (props changed)
websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html
Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Nov 21 11:17:31 2013
@@ -1 +1 @@
-1544117
+1544119
Modified: websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html (original)
+++ websites/staging/mahout/trunk/content/users/clustering/mean-shift-clustering.html Thu Nov 21 11:17:31 2013
@@ -381,13 +381,12 @@
<div id="content-wrap" class="clearfix">
<div id="main">
- <p>"Mean Shift: A Robust Approach to Feature Space Analysis"
-(http://www.caip.rutgers.edu/riul/research/papers/pdf/mnshft.pdf)
+ <h1 id="means-shift-clustering">Mean Shift Clustering</h1>
+<p><a href="http://www.caip.rutgers.edu/riul/research/papers/pdf/mnshft.pdf">"Mean Shift: A Robust Approach to Feature Space Analysis"</a>
introduces the genealogy of the mean shift clustering procedure, which dates
back to work in pattern recognition in 1975. The paper contains a detailed
derivation and several examples of the use of mean shift for image smoothing
-and segmentation. "Mean Shift Clustering"
-(http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/TUZEL1/MeanShift.pdf)
+and segmentation. <a href="http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/TUZEL1/MeanShift.pdf">"Mean Shift Clustering"</a>
presents an overview of the algorithm with a summary of the derivation. An
attractive feature of mean shift clustering is that it does not require
a priori knowledge of the number of clusters (as required in k-means) and
@@ -443,19 +442,18 @@ MeanShiftCanopyDriver.run(). </p>
<div class="codehilite"><pre><span class="n">bin</span><span class="o">/</span><span
class="n">mahout</span> <span class="n">meanshift</span> <span class="o">\</span>
<span class="o">-</span><span class="nb">i</span> <span class="o"><</span><span
class="n">input</span> <span class="n">vectors</span> <span class="n">directory</span><span
class="o">></span> <span class="o">\</span>
<span class="o">-</span><span class="n">o</span> <span class="o"><</span><span
class="n">output</span> <span class="n">working</span> <span class="n">directory</span><span
class="o">></span> <span class="o">\</span>
- <span class="o">-</span><span class="n">inputIsCanopies</span>
<span class="o"><</span><span class="n">input</span> <span
class="n">directory</span> <span class="n">contains</span> <span class="n">mean</span>
<span class="n">shift</span> <span class="n">canopies</span> <span
class="n">not</span>
+ <span class="o">-</span><span class="n">inputIsCanopies</span>
<span class="o"><</span><span class="n">input</span> <span
class="n">directory</span> <span class="n">contains</span> <span class="n">mean</span>
<span class="n">shift</span> <span class="n">canopies</span> <span
class="n">not</span> <span class="n">vectors</span><span class="o">></span>
<span class="o">\</span>
+ <span class="o">-</span><span class="n">dm</span> <span class="o"><</span><span
class="n">DistanceMeasure</span><span class="o">></span> <span
class="o">\</span>
+ <span class="o">-</span><span class="n">t1</span> <span class="o"><</span><span
class="n">the</span> <span class="n">T1</span> <span class="n">threshold</span><span
class="o">></span> <span class="o">\</span>
+ <span class="o">-</span><span class="n">t2</span> <span class="o"><</span><span
class="n">the</span> <span class="n">T2</span> <span class="n">threshold</span><span
class="o">></span> <span class="o">\</span>
+ <span class="o">-</span><span class="n">x</span> <span class="o"><</span><span
class="n">maximum</span> <span class="n">number</span> <span class="n">of</span>
<span class="n">iterations</span><span class="o">></span> <span
class="o">\</span>
+ <span class="o">-</span><span class="n">cd</span> <span class="o"><</span><span
class="n">optional</span> <span class="n">convergence</span> <span
class="n">delta</span><span class="p">.</span> <span class="n">Default</span>
<span class="n">is</span> 0<span class="p">.</span>5<span class="o">></span>
<span class="o">\</span>
+ <span class="o">-</span><span class="n">ow</span> <span class="o"><</span><span
class="n">overwrite</span> <span class="n">output</span> <span class="n">directory</span>
<span class="k">if</span> <span class="n">present</span><span class="o">></span> <span class="o">\</span>
+ <span class="o">-</span><span class="n">cl</span> <span class="o"><</span><span
class="n">run</span> <span class="n">input</span> <span class="n">vector</span>
<span class="n">clustering</span> <span class="n">after</span> <span
class="n">computing</span> <span class="n">Clusters</span><span class="o">></span> <span class="o">\</span>
+ <span class="o">-</span><span class="n">xm</span> <span class="o"><</span><span
class="n">execution</span> <span class="n">method</span><span class="p">:</span>
<span class="n">sequential</span> <span class="n">or</span> <span
class="n">mapreduce</span><span class="o">></span>
</pre></div>
-<p>vectors> \
- -dm <DistanceMeasure> \
- -t1 <the T1 threshold> \
- -t2 <the T2 threshold> \
- -x <maximum number of iterations> \
- -cd <optional convergence delta. Default is 0.5> \
- -ow <overwrite output directory if present>
- -cl <run input vector clustering after computing Clusters>
- -xm <execution method: sequential or mapreduce></p>
<p>Invocation using Java involves supplying the following arguments:</p>
<ol>
<li>input: a file path string to a directory containing the input data set a
@@ -511,19 +509,19 @@ deviation. See the README file in the <a
</ul>
<p>In the first image, the points are plotted and the 3-sigma boundaries of
their generator are superimposed. </p>
-<p>!SampleData.png!</p>
+<p><img alt="clustering" src="../../SampleData.png" /></p>
<p>In the second image, the resulting clusters (k=3) are shown superimposed
upon the sample data. In this image, each cluster renders in a different
color and the T1 and T2 radii are superimposed upon the final cluster
centers determined by the algorithm. Mean Shift does an excellent job of
clustering this data, though by its design the cluster membership is unique
and the clusters do not overlap. </p>
-<p>!MeanShift.png!</p>
+<p><img alt="clustering" src="../../MeanShift.png" /></p>
<p>The third image shows the results of running Mean Shift on a different data
set (see <a href="dirichlet-process-clustering.html">Dirichlet Process Clustering</a>
for details) which is generated using asymmetrical standard deviations.
Mean Shift does an excellent job of clustering this data set too.</p>
-<p>!2dMeanShift.png!</p>
+<p><img alt="clustering" src="../../2dMeanShift.png" /></p>
</div>
</div>
</div>