spark-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sro...@apache.org
Subject svn commit: r1730185 - /spark/site/examples.html
Date Sat, 13 Feb 2016 11:36:19 GMT
Author: srowen
Date: Sat Feb 13 11:36:18 2016
New Revision: 1730185

URL: http://svn.apache.org/viewvc?rev=1730185&view=rev
Log:
Regenerate examples.html to correct "Liquid error: pygments"

Modified:
    spark/site/examples.html

Modified: spark/site/examples.html
URL: http://svn.apache.org/viewvc/spark/site/examples.html?rev=1730185&r1=1730184&r2=1730185&view=diff
==============================================================================
--- spark/site/examples.html (original)
+++ spark/site/examples.html Sat Feb 13 11:36:18 2016
@@ -199,19 +199,43 @@ In this page, we will show examples usin
 <div class="tab-content">
 <div class="tab-pane tab-pane-python active">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">text_file</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="p">)</span>
+<span class="n">counts</span> <span class="o">=</span> <span class="n">text_file</span><span class="o">.</span><span class="n">flatMap</span><span class="p">(</span><span class="k">lambda</span> <span class="n">line</span><span class="p">:</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&quot; &quot;</span><span class="p">))</span> \
+             <span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">word</span><span class="p">:</span> <span class="p">(</span><span class="n">word</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> \
+             <span class="o">.</span><span class="n">reduceByKey</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span>
+<span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="p">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="p">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-scala">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="n">textFile</span> <span class="k">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
+<span class="k">val</span> <span class="n">counts</span> <span class="k">=</span> <span class="n">textFile</span><span class="o">.</span><span class="n">flatMap</span><span class="o">(</span><span class="n">line</span> <span class="k">=&gt;</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">))</span>
+                 <span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">word</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">word</span><span class="o">,</span> <span class="mi">1</span><span class="o">))</span>
+                 <span class="o">.</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span>
+<span class="n">counts</span><span class="o">.</span><span class="n">saveAsTextFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-java">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">JavaRDD</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">textFile</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="na">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">);</span>
+<span class="n">JavaRDD</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">words</span> <span class="o">=</span> <span class="n">textFile</span><span class="o">.</span><span class="na">flatMap</span><span class="o">(</span><span class="k">new</span> <span class="n">FlatMapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Iterable</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="nf">call</span><span class="o">(</span><span class="n">String</span> <span class="n">s</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Arrays</span><span class="o">.</span><span class="na">asList</span><span class="o">(</span><span class="n">s</span><span class="o">.</span><span class="na">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">));</span> <span class="o">}</span>
+<span class="o">});</span>
+<span class="n">JavaPairRDD</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">pairs</span> <span class="o">=</span> <span class="n">words</span><span class="o">.</span><span class="na">mapToPair</span><span class="o">(</span><span class="k">new</span> <span class="n">PairFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="nf">call</span><span class="o">(</span><span class="n">String</span> <span class="n">s</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="k">new</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;(</span><span class="n">s</span><span class="o">,</span> <span class="mi">1</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">});</span>
+<span class="n">JavaPairRDD</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">counts</span> <span class="o">=</span> <span class="n">pairs</span><span class="o">.</span><span class="na">reduceByKey</span><span class="o">(</span><span class="k">new</span> <span class="n">Function2</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">Integer</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">call</span><span class="o">(</span><span class="n">Integer</span> <span class="n">a</span><span class="o">,</span> <span class="n">Integer</span> <span class="n">b</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="o">;</span> <span class="o">}</span>
+<span class="o">});</span>
+<span class="n">counts</span><span class="o">.</span><span class="na">saveAsTextFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">);</span></code></pre></figure>
+
 </div>
 </div>
 </div>
@@ -228,19 +252,48 @@ Liquid error: pygments
 <div class="tab-content">
 <div class="tab-pane tab-pane-python active">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="k">def</span> <span class="nf">sample</span><span class="p">(</span><span class="n">p</span><span class="p">):</span>
+    <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">random</span><span class="p">(),</span> <span class="n">random</span><span class="p">()</span>
+    <span class="k">return</span> <span class="mi">1</span> <span class="k">if</span> <span class="n">x</span><span class="o">*</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="o">*</span><span class="n">y</span> <span class="o">&lt;</span> <span class="mi">1</span> <span class="k">else</span> <span class="mi">0</span>
+
+<span class="n">count</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">(</span><span class="nb">xrange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">NUM_SAMPLES</span><span class="p">))</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">sample</span><span class="p">)</span> \
+             <span class="o">.</span><span class="n">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span>
+<span class="k">print</span> <span class="s">&quot;Pi is roughly </span><span class="si">%f</span><span class="s">&quot;</span> <span class="o">%</span> <span class="p">(</span><span class="mf">4.0</span> <span class="o">*</span> <span class="n">count</span> <span class="o">/</span> <span class="n">NUM_SAMPLES</span><span class="p">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-scala">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="n">count</span> <span class="k">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="o">(</span><span class="mi">1</span> <span class="n">to</span> <span class="nc">NUM_SAMPLES</span><span class="o">).</span><span class="n">map</span><span class="o">{</span><span class="n">i</span> <span class="k">=&gt;</span>
+  <span class="k">val</span> <span class="n">x</span> <span class="k">=</span> <span class="nc">Math</span><span class="o">.</span><span class="n">random</span><span class="o">()</span>
+  <span class="k">val</span> <span class="n">y</span> <span class="k">=</span> <span class="nc">Math</span><span class="o">.</span><span class="n">random</span><span class="o">()</span>
+  <span class="k">if</span> <span class="o">(</span><span class="n">x</span><span class="o">*</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="o">*</span><span class="n">y</span> <span class="o">&lt;</span> <span class="mi">1</span><span class="o">)</span> <span class="mi">1</span> <span class="k">else</span> <span class="mi">0</span>
+<span class="o">}.</span><span class="n">reduce</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span>
+<span class="n">println</span><span class="o">(</span><span class="s">&quot;Pi is roughly &quot;</span> <span class="o">+</span> <span class="mf">4.0</span> <span class="o">*</span> <span class="n">count</span> <span class="o">/</span> <span class="nc">NUM_SAMPLES</span><span class="o">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-java">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">List</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">l</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;(</span><span class="n">NUM_SAMPLES</span><span class="o">);</span>
+<span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NUM_SAMPLES</span><span class="o">;</span> <span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
+  <span class="n">l</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">i</span><span class="o">);</span>
+<span class="o">}</span>
+
+<span class="kt">long</span> <span class="n">count</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="na">parallelize</span><span class="o">(</span><span class="n">l</span><span class="o">).</span><span class="na">filter</span><span class="o">(</span><span class="k">new</span> <span class="n">Function</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">Boolean</span><span class="o">&gt;()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Boolean</span> <span class="nf">call</span><span class="o">(</span><span class="n">Integer</span> <span class="n">i</span><span class="o">)</span> <span class="o">{</span>
+    <span class="kt">double</span> <span class="n">x</span> <span class="o">=</span> <span class="n">Math</span><span class="o">.</span><span class="na">random</span><span class="o">();</span>
+    <span class="kt">double</span> <span class="n">y</span> <span class="o">=</span> <span class="n">Math</span><span class="o">.</span><span class="na">random</span><span class="o">();</span>
+    <span class="k">return</span> <span class="n">x</span><span class="o">*</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="o">*</span><span class="n">y</span> <span class="o">&lt;</span> <span class="mi">1</span><span class="o">;</span>
+  <span class="o">}</span>
+<span class="o">}).</span><span class="na">count</span><span class="o">();</span>
+<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">&quot;Pi is roughly &quot;</span> <span class="o">+</span> <span class="mf">4.0</span> <span class="o">*</span> <span class="n">count</span> <span class="o">/</span> <span class="n">NUM_SAMPLES</span><span class="o">);</span></code></pre></figure>
+
 </div>
 </div>
 </div>
@@ -266,19 +319,64 @@ Also, programs based on DataFrame API wi
 <div class="tab-content">
 <div class="tab-pane tab-pane-python active">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">textFile</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="p">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="p">)</span>
+
+<span class="c"># Creates a DataFrame having a single column named &quot;line&quot;</span>
+<span class="n">df</span> <span class="o">=</span> <span class="n">textFile</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">r</span><span class="p">:</span> <span class="n">Row</span><span class="p">(</span><span class="n">r</span><span class="p">))</span><span class="o">.</span><span class="n">toDF</span><span class="p">([</span><span class="s">&quot;line&quot;</span><span class="p">])</span>
+<span class="n">errors</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">col</span><span class="p">(</span><span class="s">&quot;line&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">like</span><span class="p">(</span><span class="s">&quot;</span><span class="si">%E</span><span class="s">RROR%&quot;</span><span class="p">))</span>
+<span class="c"># Counts all the errors</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
+<span class="c"># Counts errors mentioning MySQL</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">col</span><span class="p">(</span><span class="s">&quot;line&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">like</span><span class="p">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="p">))</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
+<span class="c"># Fetches the MySQL errors as an array of strings</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">filter</span><span class="p">(</span><span class="n">col</span><span class="p">(</span><span class="s">&quot;line&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">like</span><span class="p">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="p">))</span><span class="o">.</span><span class="n">collect</span><span class="p">()</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-scala">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="n">textFile</span> <span class="k">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">)</span>
+
+<span class="c1">// Creates a DataFrame having a single column named &quot;line&quot;</span>
+<span class="k">val</span> <span class="n">df</span> <span class="k">=</span> <span class="n">textFile</span><span class="o">.</span><span class="n">toDF</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">)</span>
+<span class="k">val</span> <span class="n">errors</span> <span class="k">=</span> <span class="n">df</span><span class="o">.</span><span class="n">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="n">like</span><span class="o">(</span><span class="s">&quot;%ERROR%&quot;</span><span class="o">))</span>
+<span class="c1">// Counts all the errors</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">count</span><span class="o">()</span>
+<span class="c1">// Counts errors mentioning MySQL</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="n">like</span><span class="o">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="o">)).</span><span class="n">count</span><span class="o">()</span>
+<span class="c1">// Fetches the MySQL errors as an array of strings</span>
+<span class="n">errors</span><span class="o">.</span><span class="n">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="n">like</span><span class="o">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="o">)).</span><span class="n">collect</span><span class="o">()</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-java">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// Creates a DataFrame having a single column named &quot;line&quot;</span>
+<span class="n">JavaRDD</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">textFile</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="na">textFile</span><span class="o">(</span><span class="s">&quot;hdfs://...&quot;</span><span class="o">);</span>
+<span class="n">JavaRDD</span><span class="o">&lt;</span><span class="n">Row</span><span class="o">&gt;</span> <span class="n">rowRDD</span> <span class="o">=</span> <span class="n">textFile</span><span class="o">.</span><span class="na">map</span><span class="o">(</span>
+  <span class="k">new</span> <span class="n">Function</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Row</span><span class="o">&gt;()</span> <span class="o">{</span>
+    <span class="kd">public</span> <span class="n">Row</span> <span class="nf">call</span><span class="o">(</span><span class="n">String</span> <span class="n">line</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Exception</span> <span class="o">{</span>
+      <span class="k">return</span> <span class="n">RowFactory</span><span class="o">.</span><span class="na">create</span><span class="o">(</span><span class="n">line</span><span class="o">);</span>
+    <span class="o">}</span>
+  <span class="o">});</span>
+<span class="n">List</span><span class="o">&lt;</span><span class="n">StructField</span><span class="o">&gt;</span> <span class="n">fields</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ArrayList</span><span class="o">&lt;</span><span class="n">StructField</span><span class="o">&gt;();</span>
+<span class="n">fields</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">DataTypes</span><span class="o">.</span><span class="na">createStructField</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">,</span> <span class="n">DataTypes</span><span class="o">.</span><span class="na">StringType</span><span class="o">,</span> <span class="kc">true</span><span class="o">));</span>
+<span class="n">StructType</span> <span class="n">schema</span> <span class="o">=</span> <span class="n">DataTypes</span><span class="o">.</span><span class="na">createStructType</span><span class="o">(</span><span class="n">fields</span><span class="o">);</span>
+<span class="n">DataFrame</span> <span class="n">df</span> <span class="o">=</span> <span class="n">sqlContext</span><span class="o">.</span><span class="na">createDataFrame</span><span class="o">(</span><span class="n">rowRDD</span><span class="o">,</span> <span class="n">schema</span><span class="o">);</span>
+
+<span class="n">DataFrame</span> <span class="n">errors</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="na">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="na">like</span><span class="o">(</span><span class="s">&quot;%ERROR%&quot;</span><span class="o">));</span>
+<span class="c1">// Counts all the errors</span>
+<span class="n">errors</span><span class="o">.</span><span class="na">count</span><span class="o">();</span>
+<span class="c1">// Counts errors mentioning MySQL</span>
+<span class="n">errors</span><span class="o">.</span><span class="na">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="na">like</span><span class="o">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="o">)).</span><span class="na">count</span><span class="o">();</span>
+<span class="c1">// Fetches the MySQL errors as an array of strings</span>
+<span class="n">errors</span><span class="o">.</span><span class="na">filter</span><span class="o">(</span><span class="n">col</span><span class="o">(</span><span class="s">&quot;line&quot;</span><span class="o">).</span><span class="na">like</span><span class="o">(</span><span class="s">&quot;%MySQL%&quot;</span><span class="o">)).</span><span class="na">collect</span><span class="o">();</span></code></pre></figure>
+
 </div>
 </div>
 </div>
@@ -300,19 +398,82 @@ A simple MySQL table "people" is used in
 <div class="tab-content">
 <div class="tab-pane tab-pane-python active">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c"># Creates a DataFrame based on a table named &quot;people&quot;</span>
+<span class="c"># stored in a MySQL database.</span>
+<span class="n">url</span> <span class="o">=</span> \
+  <span class="s">&quot;jdbc:mysql://yourIP:yourPort/test?user=yourUsername;password=yourPassword&quot;</span>
+<span class="n">df</span> <span class="o">=</span> <span class="n">sqlContext</span> \
+  <span class="o">.</span><span class="n">read</span> \
+  <span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="s">&quot;jdbc&quot;</span><span class="p">)</span> \
+  <span class="o">.</span><span class="n">option</span><span class="p">(</span><span class="s">&quot;url&quot;</span><span class="p">,</span> <span class="n">url</span><span class="p">)</span> \
+  <span class="o">.</span><span class="n">option</span><span class="p">(</span><span class="s">&quot;dbtable&quot;</span><span class="p">,</span> <span class="s">&quot;people&quot;</span><span class="p">)</span> \
+  <span class="o">.</span><span class="n">load</span><span class="p">()</span>
+
+<span class="c"># Looks the schema of this DataFrame.</span>
+<span class="n">df</span><span class="o">.</span><span class="n">printSchema</span><span class="p">()</span>
+
+<span class="c"># Counts people by age</span>
+<span class="n">countsByAge</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="n">groupBy</span><span class="p">(</span><span class="s">&quot;age&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">count</span><span class="p">()</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="n">show</span><span class="p">()</span>
+
+<span class="c"># Saves countsByAge to S3 in the JSON format.</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="n">write</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="s">&quot;json&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="s">&quot;s3a://...&quot;</span><span class="p">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-scala">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="c1">// Creates a DataFrame based on a table named &quot;people&quot;</span>
+<span class="c1">// stored in a MySQL database.</span>
+<span class="k">val</span> <span class="n">url</span> <span class="k">=</span>
+  <span class="s">&quot;jdbc:mysql://yourIP:yourPort/test?user=yourUsername;password=yourPassword&quot;</span>
+<span class="k">val</span> <span class="n">df</span> <span class="k">=</span> <span class="n">sqlContext</span>
+  <span class="o">.</span><span class="n">read</span>
+  <span class="o">.</span><span class="n">format</span><span class="o">(</span><span class="s">&quot;jdbc&quot;</span><span class="o">)</span>
+  <span class="o">.</span><span class="n">option</span><span class="o">(</span><span class="s">&quot;url&quot;</span><span class="o">,</span> <span class="n">url</span><span class="o">)</span>
+  <span class="o">.</span><span class="n">option</span><span class="o">(</span><span class="s">&quot;dbtable&quot;</span><span class="o">,</span> <span class="s">&quot;people&quot;</span><span class="o">)</span>
+  <span class="o">.</span><span class="n">load</span><span class="o">()</span>
+
+<span class="c1">// Looks the schema of this DataFrame.</span>
+<span class="n">df</span><span class="o">.</span><span class="n">printSchema</span><span class="o">()</span>
+
+<span class="c1">// Counts people by age</span>
+<span class="k">val</span> <span class="n">countsByAge</span> <span class="k">=</span> <span class="n">df</span><span class="o">.</span><span class="n">groupBy</span><span class="o">(</span><span class="s">&quot;age&quot;</span><span class="o">).</span><span class="n">count</span><span class="o">()</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="n">show</span><span class="o">()</span>
+
+<span class="c1">// Saves countsByAge to S3 in the JSON format.</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="n">write</span><span class="o">.</span><span class="n">format</span><span class="o">(</span><span class="s">&quot;json&quot;</span><span class="o">).</span><span class="n">save</span><span class="o">(</span><span class="s">&quot;s3a://...&quot;</span><span class="o">)</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-java">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// Creates a DataFrame based on a table named &quot;people&quot;</span>
+<span class="c1">// stored in a MySQL database.</span>
+<span class="n">String</span> <span class="n">url</span> <span class="o">=</span>
+  <span class="s">&quot;jdbc:mysql://yourIP:yourPort/test?user=yourUsername;password=yourPassword&quot;</span><span class="o">;</span>
+<span class="n">DataFrame</span> <span class="n">df</span> <span class="o">=</span> <span class="n">sqlContext</span>
+  <span class="o">.</span><span class="na">read</span><span class="o">()</span>
+  <span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">&quot;jdbc&quot;</span><span class="o">)</span>
+  <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="s">&quot;url&quot;</span><span class="o">,</span> <span class="n">url</span><span class="o">)</span>
+  <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="s">&quot;dbtable&quot;</span><span class="o">,</span> <span class="s">&quot;people&quot;</span><span class="o">)</span>
+  <span class="o">.</span><span class="na">load</span><span class="o">();</span>
+
+<span class="c1">// Looks the schema of this DataFrame.</span>
+<span class="n">df</span><span class="o">.</span><span class="na">printSchema</span><span class="o">();</span>
+
+<span class="c1">// Counts people by age</span>
+<span class="n">DataFrame</span> <span class="n">countsByAge</span> <span class="o">=</span> <span class="n">df</span><span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="s">&quot;age&quot;</span><span class="o">).</span><span class="na">count</span><span class="o">();</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="na">show</span><span class="o">();</span>
+
+<span class="c1">// Saves countsByAge to S3 in the JSON format.</span>
+<span class="n">countsByAge</span><span class="o">.</span><span class="na">write</span><span class="o">().</span><span class="na">format</span><span class="o">(</span><span class="s">&quot;json&quot;</span><span class="o">).</span><span class="na">save</span><span class="o">(</span><span class="s">&quot;s3a://...&quot;</span><span class="o">);</span></code></pre></figure>
+
 </div>
 </div>
 </div>
@@ -341,19 +502,71 @@ We learn to predict the labels from feat
 <div class="tab-content">
 <div class="tab-pane tab-pane-python active">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="c"># Every record of this DataFrame contains the label and</span>
+<span class="c"># features represented by a vector.</span>
+<span class="n">df</span> <span class="o">=</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">createDataFrame</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="p">[</span><span class="s">&quot;label&quot;</span><span class="p">,</span> <span class="s">&quot;features&quot;</span><span class="p">])</span>
+
+<span class="c"># Set parameters for the algorithm.</span>
+<span class="c"># Here, we limit the number of iterations to 10.</span>
+<span class="n">lr</span> <span class="o">=</span> <span class="n">LogisticRegression</span><span class="p">(</span><span class="n">maxIter</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>
+
+<span class="c"># Fit the model to the data.</span>
+<span class="n">model</span> <span class="o">=</span> <span class="n">lr</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
+
+<span class="c"># Given a dataset, predict each point&#39;s label, and show the results.</span>
+<span class="n">model</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">df</span><span class="p">)</span><span class="o">.</span><span class="n">show</span><span class="p">()</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-scala">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="c1">// Every record of this DataFrame contains the label and</span>
+<span class="c1">// features represented by a vector.</span>
+<span class="k">val</span> <span class="n">df</span> <span class="k">=</span> <span class="n">sqlContext</span><span class="o">.</span><span class="n">createDataFrame</span><span class="o">(</span><span class="n">data</span><span class="o">).</span><span class="n">toDF</span><span class="o">(</span><span class="s">&quot;label&quot;</span><span class="o">,</span> <span class="s">&quot;features&quot;</span><span class="o">)</span>
+
+<span class="c1">// Set parameters for the algorithm.</span>
+<span class="c1">// Here, we limit the number of iterations to 10.</span>
+<span class="k">val</span> <span class="n">lr</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">LogisticRegression</span><span class="o">().</span><span class="n">setMaxIter</span><span class="o">(</span><span class="mi">10</span><span class="o">)</span>
+
+<span class="c1">// Fit the model to the data.</span>
+<span class="k">val</span> <span class="n">model</span> <span class="k">=</span> <span class="n">lr</span><span class="o">.</span><span class="n">fit</span><span class="o">(</span><span class="n">df</span><span class="o">)</span>
+
+<span class="c1">// Inspect the model: get the feature weights.</span>
+<span class="k">val</span> <span class="n">weights</span> <span class="k">=</span> <span class="n">model</span><span class="o">.</span><span class="n">weights</span>
+
+<span class="c1">// Given a dataset, predict each point&#39;s label, and show the results.</span>
+<span class="n">model</span><span class="o">.</span><span class="n">transform</span><span class="o">(</span><span class="n">df</span><span class="o">).</span><span class="n">show</span><span class="o">()</span></code></pre></figure>
+
 </div>
 </div>
 
 <div class="tab-pane tab-pane-java">
 <div class="code code-tab">
-Liquid error: pygments
+
+<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// Every record of this DataFrame contains the label and</span>
+<span class="c1">// features represented by a vector.</span>
+<span class="n">StructType</span> <span class="n">schema</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">StructType</span><span class="o">(</span><span class="k">new</span> <span class="n">StructField</span><span class="o">[]{</span>
+  <span class="k">new</span> <span class="nf">StructField</span><span class="o">(</span><span class="s">&quot;label&quot;</span><span class="o">,</span> <span class="n">DataTypes</span><span class="o">.</span><span class="na">DoubleType</span><span class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span class="n">Metadata</span><span class="o">.</span><span class="na">empty</span><span class="o">()),</span>
+  <span class="k">new</span> <span class="nf">StructField</span><span class="o">(</span><span class="s">&quot;features&quot;</span><span class="o">,</span> <span class="k">new</span> <span class="nf">VectorUDT</span><span class="o">(),</span> <span class="kc">false</span><span class="o">,</span> <span class="n">Metadata</span><span class="o">.</span><span class="na">empty</span><span class="o">()),</span>
+<span class="o">});</span>
+<span class="n">DataFrame</span> <span class="n">df</span> <span class="o">=</span> <span class="n">jsql</span><span class="o">.</span><span class="na">createDataFrame</span><span class="o">(</span><span class="n">data</span><span class="o">,</span> <span class="n">schema</span><span class="o">);</span>
+
+<span class="c1">// Set parameters for the algorithm.</span>
+<span class="c1">// Here, we limit the number of iterations to 10.</span>
+<span class="n">LogisticRegression</span> <span class="n">lr</span> <span class="o">=</span> <span class="k">new</span> <span class="nf">LogisticRegression</span><span class="o">().</span><span class="na">setMaxIter</span><span class="o">(</span><span class="mi">10</span><span class="o">);</span>
+
+<span class="c1">// Fit the model to the data.</span>
+<span class="n">LogisticRegressionModel</span> <span class="n">model</span> <span class="o">=</span> <span class="n">lr</span><span class="o">.</span><span class="na">fit</span><span class="o">(</span><span class="n">df</span><span class="o">);</span>
+
+<span class="c1">// Inspect the model: get the feature weights.</span>
+<span class="n">Vector</span> <span class="n">weights</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="na">weights</span><span class="o">();</span>
+
+<span class="c1">// Given a dataset, predict each point&#39;s label, and show the results.</span>
+<span class="n">model</span><span class="o">.</span><span class="na">transform</span><span class="o">(</span><span class="n">df</span><span class="o">).</span><span class="na">show</span><span class="o">();</span></code></pre></figure>
+
 </div>
 </div>
 </div>




---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org


Mime
View raw message