mahout-commits mailing list archives

From build...@apache.org
Subject svn commit: r921243 - in /websites/staging/mahout/trunk/content: ./ users/recommender/intro-cooccurrence-spark.html
Date Thu, 04 Sep 2014 14:56:43 GMT
Author: buildbot
Date: Thu Sep  4 14:56:42 2014
New Revision: 921243

Log:
Staging update by buildbot for mahout

Modified:
    websites/staging/mahout/trunk/content/   (props changed)
    websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Sep  4 14:56:42 2014
@@ -1 +1 @@
-1621598
+1622492

Modified: websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html (original)
+++ websites/staging/mahout/trunk/content/users/recommender/intro-cooccurrence-spark.html Thu Sep  4 14:56:42 2014
@@ -261,16 +261,15 @@ creating recommendations or similar item
 For instance they might say an item-view is 0.2 of an item purchase. In practice this is often not helpful. Spark-itemsimilarity's
 cross-cooccurrence is a more principled way to handle this case. In effect it scrubs secondary actions with the action you want
 to recommend.   </p>
-<div class="codehilite"><pre><span class="n">spark</span><span class="o">-</span><span class="n">itemsimilarity</span> <span class="n">Mahout</span> 1<span class="p">.</span>0<span class="o">-</span><span class="n">SNAPSHOT</span>
+<div class="codehilite"><pre><span class="n">spark</span><span class="o">-</span><span class="n">itemsimilarity</span> <span class="n">Mahout</span> 1<span class="p">.</span>0
 <span class="n">Usage</span><span class="p">:</span> <span class="n">spark</span><span class="o">-</span><span class="n">itemsimilarity</span> <span class="p">[</span><span class="n">options</span><span class="p">]</span>
 
+<span class="n">Disconnected</span> <span class="n">from</span> <span class="n">the</span> <span class="n">target</span> <span class="n">VM</span><span class="p">,</span> <span class="n">address</span><span class="p">:</span> <span class="s">&#39;127.0.0.1:64676&#39;</span><span class="p">,</span> <span class="n">transport</span><span class="p">:</span> <span class="s">&#39;socket&#39;</span>
 <span class="n">Input</span><span class="p">,</span> <span class="n">output</span> <span class="n">options</span>
   <span class="o">-</span><span class="nb">i</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">input</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Input</span> <span class="n">path</span><span class="p">,</span> <span class="n">may</span> <span class="n">be</span> <span class="n">a</span> <span class="n">filename</span><span class="p">,</span> <span class="n">directory</span> <span class="n">name</span><span class="p">,</span> <span class="n">or</span> <span class="n">comma</span> <span class="n">delimited</span> <span class="n">list</span> <span class="n">of</span> 
-        <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URIs</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
+        <span class="n">Input</span> <span class="n">path</span><span class="p">,</span> <span class="n">may</span> <span class="n">be</span> <span class="n">a</span> <span class="n">filename</span><span class="p">,</span> <span class="n">directory</span> <span class="n">name</span><span class="p">,</span> <span class="n">or</span> <span class="n">comma</span> <span class="n">delimited</span> <span class="n">list</span> <span class="n">of</span> <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URIs</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
   <span class="o">-</span><span class="n">i2</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">input2</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Secondary</span> <span class="n">input</span> <span class="n">path</span> <span class="k">for</span> <span class="nb">cross</span><span class="o">-</span><span class="n">similarity</span> <span class="n">calculation</span><span class="p">,</span> <span class="n">same</span> <span class="n">restrictions</span> 
-        <span class="n">as</span> &quot;<span class="o">--</span><span class="n">input</span>&quot; <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">empty</span><span class="p">.</span>
+        <span class="n">Secondary</span> <span class="n">input</span> <span class="n">path</span> <span class="k">for</span> <span class="nb">cross</span><span class="o">-</span><span class="n">similarity</span> <span class="n">calculation</span><span class="p">,</span> <span class="n">same</span> <span class="n">restrictions</span> <span class="n">as</span> &quot;<span class="o">--</span><span class="n">input</span>&quot; <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">empty</span><span class="p">.</span>
   <span class="o">-</span><span class="n">o</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">output</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Path</span> <span class="k">for</span> <span class="n">output</span><span class="p">,</span> <span class="n">any</span> <span class="n">local</span> <span class="n">or</span> <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URI</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
 
@@ -278,8 +277,7 @@ to recommend.   </p>
   <span class="o">-</span><span class="n">mppu</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">maxPrefs</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Max</span> <span class="n">number</span> <span class="n">of</span> <span class="n">preferences</span> <span class="n">to</span> <span class="n">consider</span> <span class="n">per</span> <span class="n">user</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 500
   <span class="o">-</span><span class="n">m</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">maxSimilaritiesPerItem</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Limit</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">similarities</span> <span class="n">per</span> <span class="n">item</span> <span class="n">to</span> <span class="n">this</span> <span class="n">number</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 100
+        <span class="n">Limit</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">similarities</span> <span class="n">per</span> <span class="n">item</span> <span class="n">to</span> <span class="n">this</span> <span class="n">number</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 100
 
 <span class="n">Note</span><span class="p">:</span> <span class="n">Only</span> <span class="n">the</span> <span class="n">Log</span> <span class="n">Likelihood</span> <span class="n">Ratio</span> <span class="p">(</span><span class="n">LLR</span><span class="p">)</span> <span class="n">is</span> <span class="n">supported</span> <span class="n">as</span> <span class="n">a</span> <span class="n">similarity</span> <span class="n">measure</span><span class="p">.</span>
 
@@ -287,56 +285,42 @@ to recommend.   </p>
   <span class="o">-</span><span class="n">id</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">inDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Input</span> <span class="n">delimiter</span> <span class="n">character</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="p">[,</span><span class="o">\</span><span class="n">t</span><span class="p">]</span>&quot;
   <span class="o">-</span><span class="n">f1</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filter1</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">String</span> <span class="p">(</span><span class="n">or</span> <span class="n">regex</span><span class="p">)</span> <span class="n">whose</span> <span class="n">presence</span> <span class="n">indicates</span> <span class="n">a</span> <span class="n">datum</span> <span class="k">for</span> <span class="n">the</span> <span class="n">primary</span> <span class="n">item</span> 
-        <span class="n">set</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">no</span> <span class="n">filter</span><span class="p">,</span> <span class="n">all</span> <span class="n">data</span> <span class="n">is</span> <span class="n">used</span>
+        <span class="n">String</span> <span class="p">(</span><span class="n">or</span> <span class="n">regex</span><span class="p">)</span> <span class="n">whose</span> <span class="n">presence</span> <span class="n">indicates</span> <span class="n">a</span> <span class="n">datum</span> <span class="k">for</span> <span class="n">the</span> <span class="n">primary</span> <span class="n">item</span> <span class="n">set</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">no</span> <span class="n">filter</span><span class="p">,</span> <span class="n">all</span> <span class="n">data</span> <span class="n">is</span> <span class="n">used</span>
   <span class="o">-</span><span class="n">f2</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filter2</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">String</span> <span class="p">(</span><span class="n">or</span> <span class="n">regex</span><span class="p">)</span> <span class="n">whose</span> <span class="n">presence</span> <span class="n">indicates</span> <span class="n">a</span> <span class="n">datum</span> <span class="k">for</span> <span class="n">the</span> <span class="n">secondary</span> <span class="n">item</span> 
-        <span class="n">set</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">If</span> <span class="n">not</span> <span class="n">present</span> <span class="n">no</span> <span class="n">secondary</span> <span class="n">dataset</span> <span class="n">is</span> <span class="n">collected</span>
-  <span class="o">-</span><span class="n">rc</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">rowIDPosition</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">row</span> <span class="n">ID</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 0
-  <span class="o">-</span><span class="n">ic</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">itemIDPosition</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">item</span> <span class="n">ID</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 1
-  <span class="o">-</span><span class="n">fc</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filterPosition</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">filter</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> <span class="o">-</span>1 <span class="k">for</span> <span class="n">no</span> <span class="n">filter</span>
+        <span class="n">String</span> <span class="p">(</span><span class="n">or</span> <span class="n">regex</span><span class="p">)</span> <span class="n">whose</span> <span class="n">presence</span> <span class="n">indicates</span> <span class="n">a</span> <span class="n">datum</span> <span class="k">for</span> <span class="n">the</span> <span class="n">secondary</span> <span class="n">item</span> <span class="n">set</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">If</span> <span class="n">not</span> <span class="n">present</span> <span class="n">no</span> <span class="n">secondary</span> <span class="n">dataset</span> <span class="n">is</span> <span class="n">collected</span>
+  <span class="o">-</span><span class="n">rc</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">rowIDColumn</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
+        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">row</span> <span class="n">ID</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 0
+  <span class="o">-</span><span class="n">ic</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">itemIDColumn</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
+        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">item</span> <span class="n">ID</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 1
+  <span class="o">-</span><span class="n">fc</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filterColumn</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
+        <span class="n">Column</span> <span class="n">number</span> <span class="p">(</span>0 <span class="n">based</span> <span class="n">Int</span><span class="p">)</span> <span class="n">containing</span> <span class="n">the</span> <span class="n">filter</span> <span class="n">string</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="o">-</span>1 <span class="k">for</span> <span class="n">no</span> <span class="n">filter</span>
 
 <span class="n">Using</span> <span class="n">all</span> <span class="n">defaults</span> <span class="n">the</span> <span class="n">input</span> <span class="n">is</span> <span class="n">expected</span> <span class="n">of</span> <span class="n">the</span> <span class="n">form</span><span class="p">:</span> &quot;<span class="n">userID</span><span class="o">&lt;</span><span class="n">tab</span><span class="o">&gt;</span><span class="n">itemId</span>&quot; <span class="n">or</span> &quot;<span class="n">userID</span><span class="o">&lt;</span><span class="n">tab</span><span class="o">&gt;</span><span class="n">itemID</span><span class="o">&lt;</span><span class="n">tab</span><span class="o">&gt;</span><span class="n">any</span><span class="o">-</span><span class="n">text</span><span class="p">...</span>&quot; <span class="n">and</span> <span class="n">all</span> <span class="n">rows</span> <span class="n">will</span> <span class="n">be</span> <span class="n">used</span>
 
 <span class="n">File</span> <span class="n">discovery</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">r</span> <span class="o">|</span> <span class="o">--</span><span class="n">recursive</span>
-        <span class="n">Searched</span> <span class="n">the</span> <span class="o">-</span><span class="nb">i</span> <span class="n">path</span> <span class="n">recursively</span> <span class="k">for</span> <span class="n">files</span> <span class="n">that</span> <span class="n">match</span> <span class="o">--</span><span class="n">filenamePattern</span> 
-        <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">default</span><span class="p">:</span> <span class="n">false</span>
+        <span class="n">Searched</span> <span class="n">the</span> <span class="o">-</span><span class="nb">i</span> <span class="n">path</span> <span class="n">recursively</span> <span class="k">for</span> <span class="n">files</span> <span class="n">that</span> <span class="n">match</span> <span class="o">--</span><span class="n">filenamePattern</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> <span class="n">false</span>
   <span class="o">-</span><span class="n">fp</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filenamePattern</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Regex</span> <span class="n">to</span> <span class="n">match</span> <span class="n">in</span> <span class="n">determining</span> <span class="n">input</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">filename</span> 
-        <span class="n">in</span> <span class="n">the</span> <span class="o">--</span><span class="n">input</span> <span class="n">option</span> <span class="n">or</span> &quot;^<span class="n">part</span><span class="o">-.*</span>&quot; <span class="k">if</span> <span class="o">--</span><span class="n">input</span> <span class="n">is</span> <span class="n">a</span> <span class="n">directory</span>
+        <span class="n">Regex</span> <span class="n">to</span> <span class="n">match</span> <span class="n">in</span> <span class="n">determining</span> <span class="n">input</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">filename</span> <span class="n">in</span> <span class="n">the</span> <span class="o">--</span><span class="n">input</span> <span class="n">option</span> <span class="n">or</span> &quot;^<span class="n">part</span><span class="o">-.*</span>&quot; <span class="k">if</span> <span class="o">--</span><span class="n">input</span> <span class="n">is</span> <span class="n">a</span> <span class="n">directory</span>
 
 <span class="n">Output</span> <span class="n">text</span> <span class="n">file</span> <span class="n">schema</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">rd</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">rowKeyDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Separates</span> <span class="n">the</span> <span class="n">rowID</span> <span class="n">key</span> <span class="n">from</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 
-<span class="o">\</span><span class="n">t</span>&quot;
+        <span class="n">Separates</span> <span class="n">the</span> <span class="n">rowID</span> <span class="n">key</span> <span class="n">from</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="o">\</span><span class="n">t</span>&quot;
   <span class="o">-</span><span class="n">cd</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">columnIdStrengthDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Separates</span> <span class="n">column</span> <span class="n">IDs</span> <span class="n">from</span> <span class="n">their</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> &quot;<span class="p">:</span>&quot;
+        <span class="n">Separates</span> <span class="n">column</span> <span class="n">IDs</span> <span class="n">from</span> <span class="n">their</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="p">:</span>&quot;
   <span class="o">-</span><span class="n">td</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">elementDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Separates</span> <span class="n">vector</span> <span class="n">element</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot; &quot;
   <span class="o">-</span><span class="n">os</span> <span class="o">|</span> <span class="o">--</span><span class="n">omitStrength</span>
         <span class="n">Do</span> <span class="n">not</span> <span class="n">write</span> <span class="n">the</span> <span class="n">strength</span> <span class="n">to</span> <span class="n">the</span> <span class="n">output</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> <span class="n">false</span><span class="p">.</span>
-        <span class="n">This</span> <span class="n">option</span> <span class="n">is</span> <span class="n">used</span> <span class="n">to</span> <span class="n">output</span> <span class="n">indexable</span> <span class="n">data</span> <span class="k">for</span> <span class="n">creating</span> <span class="n">a</span> <span class="n">search</span> <span class="n">engine</span> 
-        <span class="n">recommender</span><span class="p">.</span>
+<span class="n">This</span> <span class="n">option</span> <span class="n">is</span> <span class="n">used</span> <span class="n">to</span> <span class="n">output</span> <span class="n">indexable</span> <span class="n">data</span> <span class="k">for</span> <span class="n">creating</span> <span class="n">a</span> <span class="n">search</span> <span class="n">engine</span> <span class="n">recommender</span><span class="p">.</span>
 
 <span class="n">Default</span> <span class="n">delimiters</span> <span class="n">will</span> <span class="n">produce</span> <span class="n">output</span> <span class="n">of</span> <span class="n">the</span> <span class="n">form</span><span class="p">:</span> &quot;<span class="n">itemID1</span><span class="o">&lt;</span><span class="n">tab</span><span class="o">&gt;</span><span class="n">itemID2</span><span class="p">:</span><span class="n">value2</span><span class="o">&lt;</span><span class="n">space</span><span class="o">&gt;</span><span class="n">itemID10</span><span class="p">:</span><span class="n">value10</span><span class="p">...</span>&quot;
 
 <span class="n">Spark</span> <span class="n">config</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">ma</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">master</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Spark</span> <span class="n">Master</span> <span class="n">URL</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="n">local</span>&quot;<span class="p">.</span> <span class="n">Note</span> <span class="n">that</span> <span class="n">you</span> <span class="n">can</span> <span class="n">specify</span> 
-        <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">cores</span> <span class="n">to</span> <span class="n">get</span> <span class="n">a</span> <span class="n">performance</span> <span class="n">improvement</span><span class="p">,</span> <span class="k">for</span> <span class="n">example</span> &quot;<span class="n">local</span><span class="p">[</span>4<span class="p">]</span>&quot;
+        <span class="n">Spark</span> <span class="n">Master</span> <span class="n">URL</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="n">local</span>&quot;<span class="p">.</span> <span class="n">Note</span> <span class="n">that</span> <span class="n">you</span> <span class="n">can</span> <span class="n">specify</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">cores</span> <span class="n">to</span> <span class="n">get</span> <span class="n">a</span> <span class="n">performance</span> <span class="n">improvement</span><span class="p">,</span> <span class="k">for</span> <span class="n">example</span> &quot;<span class="n">local</span><span class="p">[</span>4<span class="p">]</span>&quot;
   <span class="o">-</span><span class="n">sem</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">sparkExecutorMem</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Max</span> <span class="n">Java</span> <span class="n">heap</span> <span class="n">available</span> <span class="n">as</span> &quot;<span class="n">executor</span> <span class="n">memory</span>&quot; <span class="n">on</span> <span class="n">each</span> <span class="n">node</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 4<span class="n">g</span>
-
-<span class="n">General</span> <span class="n">config</span> <span class="n">options</span><span class="p">:</span>
+        <span class="n">Max</span> <span class="n">Java</span> <span class="n">heap</span> <span class="n">available</span> <span class="n">as</span> &quot;<span class="n">executor</span> <span class="n">memory</span>&quot; <span class="n">on</span> <span class="n">each</span> <span class="n">node</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 4<span class="n">g</span>
   <span class="o">-</span><span class="n">rs</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">randomSeed</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
 
   <span class="o">-</span><span class="n">h</span> <span class="o">|</span> <span class="o">--</span><span class="n">help</span>
@@ -472,61 +456,48 @@ to recommend.   </p>
 <p><em>spark-rowsimilarity</em> is the companion to <em>spark-itemsimilarity</em> the primary difference is that it takes a text file version of a DRM with optional application specific IDs. The input is in text-delimited form where there are three delimiters used. By default it reads (rowID<tab>columnID1:strength1<space>columnID2:strength2...) Since this job only supports LLR similarity, which does not use the input strengths, they may be omitted in the input. It writes (columnID<tab>columnID1:strength1<space>columnID2:strength2...) The output is sorted by strength descending. The output can be interpreted as a column id from the primary input followed by a list of the most similar columns. For a discussion of the output layout and formatting see <em>spark-itemsimilarity</em>. </p>
 <p>One significant output option is --omitStrength. This allows output of the form (columnID<tab>columnID2<space>columnID2<space>...) This is a tab-delimited file containing a columnID token followed by a space delimited string of tokens. It can be directly indexed by search engines to create an item-based recommender.</p>
 <p>The command line interface is:</p>
-<div class="codehilite"><pre><span class="n">spark</span><span class="o">-</span><span class="n">rowsimilarity</span> <span class="n">Mahout</span> 1<span class="p">.</span>0<span class="o">-</span><span class="n">SNAPSHOT</span>
+<div class="codehilite"><pre><span class="n">spark</span><span class="o">-</span><span class="n">rowsimilarity</span> <span class="n">Mahout</span> 1<span class="p">.</span>0
 <span class="n">Usage</span><span class="p">:</span> <span class="n">spark</span><span class="o">-</span><span class="n">rowsimilarity</span> <span class="p">[</span><span class="n">options</span><span class="p">]</span>
 
 <span class="n">Input</span><span class="p">,</span> <span class="n">output</span> <span class="n">options</span>
   <span class="o">-</span><span class="nb">i</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">input</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Input</span> <span class="n">path</span><span class="p">,</span> <span class="n">may</span> <span class="n">be</span> <span class="n">a</span> <span class="n">filename</span><span class="p">,</span> <span class="n">directory</span> <span class="n">name</span><span class="p">,</span> <span class="n">or</span> <span class="n">comma</span> <span class="n">delimited</span> <span class="n">list</span> 
-        <span class="n">of</span> <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URIs</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
- <span class="o">-</span><span class="n">o</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">output</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
+        <span class="n">Input</span> <span class="n">path</span><span class="p">,</span> <span class="n">may</span> <span class="n">be</span> <span class="n">a</span> <span class="n">filename</span><span class="p">,</span> <span class="n">directory</span> <span class="n">name</span><span class="p">,</span> <span class="n">or</span> <span class="n">comma</span> <span class="n">delimited</span> <span class="n">list</span> <span class="n">of</span> <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URIs</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
+  <span class="o">-</span><span class="n">o</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">output</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Path</span> <span class="k">for</span> <span class="n">output</span><span class="p">,</span> <span class="n">any</span> <span class="n">local</span> <span class="n">or</span> <span class="n">HDFS</span> <span class="n">supported</span> <span class="n">URI</span> <span class="p">(</span><span class="n">required</span><span class="p">)</span>
 
 <span class="n">Algorithm</span> <span class="n">control</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">mo</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">maxObservations</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
         <span class="n">Max</span> <span class="n">number</span> <span class="n">of</span> <span class="n">observations</span> <span class="n">to</span> <span class="n">consider</span> <span class="n">per</span> <span class="n">row</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 500
   <span class="o">-</span><span class="n">m</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">maxSimilaritiesPerRow</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Limit</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">similarities</span> <span class="n">per</span> <span class="n">item</span> <span class="n">to</span> <span class="n">this</span> <span class="n">number</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 100
+        <span class="n">Limit</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">similarities</span> <span class="n">per</span> <span class="n">item</span> <span class="n">to</span> <span class="n">this</span> <span class="n">number</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 100
 
 <span class="n">Note</span><span class="p">:</span> <span class="n">Only</span> <span class="n">the</span> <span class="n">Log</span> <span class="n">Likelihood</span> <span class="n">Ratio</span> <span class="p">(</span><span class="n">LLR</span><span class="p">)</span> <span class="n">is</span> <span class="n">supported</span> <span class="n">as</span> <span class="n">a</span> <span class="n">similarity</span> <span class="n">measure</span><span class="p">.</span>
+<span class="n">Disconnected</span> <span class="n">from</span> <span class="n">the</span> <span class="n">target</span> <span class="n">VM</span><span class="p">,</span> <span class="n">address</span><span class="p">:</span> <span class="s">&#39;127.0.0.1:49162&#39;</span><span class="p">,</span> <span class="n">transport</span><span class="p">:</span> <span class="s">&#39;socket&#39;</span>
 
 <span class="n">Output</span> <span class="n">text</span> <span class="n">file</span> <span class="n">schema</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">rd</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">rowKeyDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Separates</span> <span class="n">the</span> <span class="n">rowID</span> <span class="n">key</span> <span class="n">from</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> &quot;<span class="o">\</span><span class="n">t</span>&quot;
+        <span class="n">Separates</span> <span class="n">the</span> <span class="n">rowID</span> <span class="n">key</span> <span class="n">from</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="o">\</span><span class="n">t</span>&quot;
   <span class="o">-</span><span class="n">cd</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">columnIdStrengthDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Separates</span> <span class="n">column</span> <span class="n">IDs</span> <span class="n">from</span> <span class="n">their</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> 
-        <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="p">:</span>&quot;
+        <span class="n">Separates</span> <span class="n">column</span> <span class="n">IDs</span> <span class="n">from</span> <span class="n">their</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">vector</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="p">:</span>&quot;
   <span class="o">-</span><span class="n">td</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">elementDelim</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Separates</span> <span class="n">vector</span> <span class="n">element</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> &quot; &quot;
+        <span class="n">Separates</span> <span class="n">vector</span> <span class="n">element</span> <span class="n">values</span> <span class="n">in</span> <span class="n">the</span> <span class="n">values</span> <span class="n">list</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot; &quot;
   <span class="o">-</span><span class="n">os</span> <span class="o">|</span> <span class="o">--</span><span class="n">omitStrength</span>
-        <span class="n">Do</span> <span class="n">not</span> <span class="n">write</span> <span class="n">the</span> <span class="n">strength</span> <span class="n">to</span> <span class="n">the</span> <span class="n">output</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> 
-        <span class="n">false</span><span class="p">.</span>
-<span class="n">This</span> <span class="n">option</span> <span class="n">is</span> <span class="n">used</span> <span class="n">to</span> <span class="n">output</span> <span class="n">indexable</span> <span class="n">data</span> <span class="k">for</span> <span class="n">creating</span> <span class="n">a</span> <span class="n">search</span> <span class="n">engine</span> 
-<span class="n">recommender</span><span class="p">.</span>
+        <span class="n">Do</span> <span class="n">not</span> <span class="n">write</span> <span class="n">the</span> <span class="n">strength</span> <span class="n">to</span> <span class="n">the</span> <span class="n">output</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> <span class="n">false</span><span class="p">.</span>
+<span class="n">This</span> <span class="n">option</span> <span class="n">is</span> <span class="n">used</span> <span class="n">to</span> <span class="n">output</span> <span class="n">indexable</span> <span class="n">data</span> <span class="k">for</span> <span class="n">creating</span> <span class="n">a</span> <span class="n">search</span> <span class="n">engine</span> <span class="n">recommender</span><span class="p">.</span>
 
 <span class="n">Default</span> <span class="n">delimiters</span> <span class="n">will</span> <span class="n">produce</span> <span class="n">output</span> <span class="n">of</span> <span class="n">the</span> <span class="n">form</span><span class="p">:</span> &quot;<span class="n">itemID1</span><span class="o">&lt;</span><span class="n">tab</span><span class="o">&gt;</span><span class="n">itemID2</span><span class="p">:</span><span class="n">value2</span><span class="o">&lt;</span><span class="n">space</span><span class="o">&gt;</span><span class="n">itemID10</span><span class="p">:</span><span class="n">value10</span><span class="p">...</span>&quot;
 
 <span class="n">File</span> <span class="n">discovery</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">r</span> <span class="o">|</span> <span class="o">--</span><span class="n">recursive</span>
-        <span class="n">Searched</span> <span class="n">the</span> <span class="o">-</span><span class="nb">i</span> <span class="n">path</span> <span class="n">recursively</span> <span class="k">for</span> <span class="n">files</span> <span class="n">that</span> <span class="n">match</span> 
-        <span class="o">--</span><span class="n">filenamePattern</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> <span class="n">false</span>
+        <span class="n">Searched</span> <span class="n">the</span> <span class="o">-</span><span class="nb">i</span> <span class="n">path</span> <span class="n">recursively</span> <span class="k">for</span> <span class="n">files</span> <span class="n">that</span> <span class="n">match</span> <span class="o">--</span><span class="n">filenamePattern</span> <span class="p">(</span><span class="n">optional</span><span class="p">),</span> <span class="n">Default</span><span class="p">:</span> <span class="n">false</span>
   <span class="o">-</span><span class="n">fp</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">filenamePattern</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Regex</span> <span class="n">to</span> <span class="n">match</span> <span class="n">in</span> <span class="n">determining</span> <span class="n">input</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 
-        <span class="n">filename</span> <span class="n">in</span> <span class="n">the</span> <span class="o">--</span><span class="n">input</span> <span class="n">option</span> <span class="n">or</span> &quot;^<span class="n">part</span><span class="o">-.*</span>&quot; <span class="k">if</span> <span class="o">--</span><span class="n">input</span> <span class="n">is</span> <span class="n">a</span> <span class="n">directory</span>
+        <span class="n">Regex</span> <span class="n">to</span> <span class="n">match</span> <span class="n">in</span> <span class="n">determining</span> <span class="n">input</span> <span class="n">files</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> <span class="n">filename</span> <span class="n">in</span> <span class="n">the</span> <span class="o">--</span><span class="n">input</span> <span class="n">option</span> <span class="n">or</span> &quot;^<span class="n">part</span><span class="o">-.*</span>&quot; <span class="k">if</span> <span class="o">--</span><span class="n">input</span> <span class="n">is</span> <span class="n">a</span> <span class="n">directory</span>
 
 <span class="n">Spark</span> <span class="n">config</span> <span class="n">options</span><span class="p">:</span>
   <span class="o">-</span><span class="n">ma</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">master</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Spark</span> <span class="n">Master</span> <span class="n">URL</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="n">local</span>&quot;<span class="p">.</span> <span class="n">Note</span> <span class="n">that</span> <span class="n">you</span> <span class="n">can</span> 
-        <span class="n">specify</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">cores</span> <span class="n">to</span> <span class="n">get</span> <span class="n">a</span> <span class="n">performance</span> <span class="n">improvement</span><span class="p">,</span> <span class="k">for</span> 
-        <span class="n">example</span> &quot;<span class="n">local</span><span class="p">[</span>4<span class="p">]</span>&quot;
+        <span class="n">Spark</span> <span class="n">Master</span> <span class="n">URL</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> &quot;<span class="n">local</span>&quot;<span class="p">.</span> <span class="n">Note</span> <span class="n">that</span> <span class="n">you</span> <span class="n">can</span> <span class="n">specify</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">cores</span> <span class="n">to</span> <span class="n">get</span> <span class="n">a</span> <span class="n">performance</span> <span class="n">improvement</span><span class="p">,</span> <span class="k">for</span> <span class="n">example</span> &quot;<span class="n">local</span><span class="p">[</span>4<span class="p">]</span>&quot;
   <span class="o">-</span><span class="n">sem</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">sparkExecutorMem</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
-        <span class="n">Max</span> <span class="n">Java</span> <span class="n">heap</span> <span class="n">available</span> <span class="n">as</span> &quot;<span class="n">executor</span> <span class="n">memory</span>&quot; <span class="n">on</span> <span class="n">each</span> <span class="n">node</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> 
-        <span class="n">Default</span><span class="p">:</span> 4<span class="n">g</span>
-
-<span class="n">General</span> <span class="n">config</span> <span class="n">options</span><span class="p">:</span>
+        <span class="n">Max</span> <span class="n">Java</span> <span class="n">heap</span> <span class="n">available</span> <span class="n">as</span> &quot;<span class="n">executor</span> <span class="n">memory</span>&quot; <span class="n">on</span> <span class="n">each</span> <span class="n">node</span> <span class="p">(</span><span class="n">optional</span><span class="p">).</span> <span class="n">Default</span><span class="p">:</span> 4<span class="n">g</span>
   <span class="o">-</span><span class="n">rs</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span> <span class="o">|</span> <span class="o">--</span><span class="n">randomSeed</span> <span class="o">&lt;</span><span class="n">value</span><span class="o">&gt;</span>
 
   <span class="o">-</span><span class="n">h</span> <span class="o">|</span> <span class="o">--</span><span class="n">help</span>


