crunch-commits mailing list archives

From build...@apache.org
Subject svn commit: r927240 - in /websites/staging/crunch/trunk/content: ./ user-guide.html
Date Wed, 29 Oct 2014 03:42:33 GMT
Author: buildbot
Date: Wed Oct 29 03:42:33 2014
New Revision: 927240

Log:
Staging update by buildbot for crunch

Modified:
    websites/staging/crunch/trunk/content/   (props changed)
    websites/staging/crunch/trunk/content/user-guide.html

Propchange: websites/staging/crunch/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Oct 29 03:42:33 2014
@@ -1 +1 @@
-1635034
+1635035

Modified: websites/staging/crunch/trunk/content/user-guide.html
==============================================================================
--- websites/staging/crunch/trunk/content/user-guide.html (original)
+++ websites/staging/crunch/trunk/content/user-guide.html Wed Oct 29 03:42:33 2014
@@ -649,30 +649,29 @@ includes both Avro generic and specific 
 
 
 <p>The <a href="apidocs/0.10.0/org/apache/crunch/types/avro/Avros.html">Avros</a>
class also has a <code>reflects</code> method for creating PTypes
-for POJOs using Avro's reflection-based serialization mechanism. There are a couple of restrictions
on the structure of
-the POJO:</p>
-<ol>
-<li>It must have a default, no-arg constructor.</li>
-<li>
-<p>All of its fields must be Avro primitive types or collection types that have Avro
equivalents, like <code>ArrayList</code> and
-<code>HashMap&lt;String, T&gt;</code>. You may also have arrays of Avro
primitive types.</p>
-<p>// Declare an inline data type and use it for Crunch serialization
-public static class UrlData {
-  // The fields don't have to be public, just doing this for the example.
-  double curPageRank;
-  String[] outboundUrls;</p>
-<p>// Remember: you must have a no-arg constructor. 
-  public UrlData() { this(0.0, new String[0]); }</p>
-<p>// The regular constructor
-  public UrlData(double pageRank, String[] outboundUrls) {
-    this.curPageRank = pageRank;
-    this.outboundUrls = outboundUrls;
-  }
-}</p>
-<p>PType<UrlData> urlDataType = Avros.reflects(UrlData.class);
-PTableType<String, UrlData> pageRankType = Avros.tableOf(Avros.strings(), urlDataType);</p>
-</li>
-</ol>
+for POJOs using Avro's reflection-based serialization mechanism. There are a couple of restrictions
on the structure of the POJO. First, it must have a default, no-arg constructor. Second, all
of its fields must be Avro primitive types or collection types that have Avro equivalents,
like <code>ArrayList</code> and <code>HashMap&lt;String, T&gt;</code>.
You may also have arrays of Avro primitive types.
+</p>
+<div class="codehilite"><pre><span class="c1">// Declare an inline data
type and use it for Crunch serialization</span>
+<span class="n">public</span> <span class="k">static</span> <span
class="k">class</span> <span class="n">UrlData</span> <span class="p">{</span>
+  <span class="c1">// The fields don&#39;t have to be public, just doing this for
the example.</span>
+  <span class="n">double</span> <span class="n">curPageRank</span><span
class="p">;</span>
+  <span class="n">String</span><span class="p">[]</span> <span
class="n">outboundUrls</span><span class="p">;</span>
+
+  <span class="c1">// Remember: you must have a no-arg constructor. </span>
+  <span class="n">public</span> <span class="n">UrlData</span><span
class="p">()</span> <span class="p">{</span> <span class="k">this</span><span
class="p">(</span><span class="mf">0.0</span><span class="p">,</span>
<span class="k">new</span> <span class="n">String</span><span class="p">[</span><span
class="mh">0</span><span class="p">]);</span> <span class="p">}</span>
+
+  <span class="c1">// The regular constructor</span>
+  <span class="n">public</span> <span class="n">UrlData</span><span
class="p">(</span><span class="n">double</span> <span class="n">pageRank</span><span
class="p">,</span> <span class="n">String</span><span class="p">[]</span>
<span class="n">outboundUrls</span><span class="p">)</span> <span
class="p">{</span>
+    <span class="k">this</span><span class="p">.</span><span class="n">curPageRank</span>
<span class="o">=</span> <span class="n">pageRank</span><span class="p">;</span>
+    <span class="k">this</span><span class="p">.</span><span class="n">outboundUrls</span>
<span class="o">=</span> <span class="n">outboundUrls</span><span
class="p">;</span>
+  <span class="p">}</span>
+<span class="p">}</span>
+
+<span class="n">PType</span><span class="o">&lt;</span><span
class="n">UrlData</span><span class="o">&gt;</span> <span class="n">urlDataType</span>
<span class="o">=</span> <span class="n">Avros</span><span class="p">.</span><span
class="n">reflects</span><span class="p">(</span><span class="n">UrlData</span><span
class="p">.</span><span class="k">class</span><span class="p">);</span>
+<span class="n">PTableType</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">UrlData</span><span
class="o">&gt;</span> <span class="n">pageRankType</span> <span
class="o">=</span> <span class="n">Avros</span><span class="p">.</span><span
class="n">tableOf</span><span class="p">(</span><span class="n">Avros</span><span
class="p">.</span><span class="n">strings</span><span class="p">(),</span>
<span class="n">urlDataType</span><span class="p">);</span>
+</pre></div>
+
+
 <p>Avro reflection is a great way to define intermediate types for your Crunch pipelines;
not only is your logic clear
 and easy to test, but the fact that the data is written out as Avro records means that you
can use tools like Hive and Pig
 to query intermediate results to aid in debugging pipeline failures.</p>
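
Note on the first restriction described in the changed paragraph (the POJO must have a default, no-arg constructor): it can be checked ahead of time with plain Java reflection. The sketch below is only an illustration, not part of Crunch or Avro. The class ReflectPojoCheck, the helper hasNoArgConstructor, and the BadData counter-example are hypothetical names introduced here; UrlData mirrors the class in the guide.

```java
import java.lang.reflect.Constructor;

public class ReflectPojoCheck {

    // Same shape as the UrlData example in the user guide.
    public static class UrlData {
        double curPageRank;
        String[] outboundUrls;

        // The no-arg constructor required for reflection-based serialization.
        public UrlData() { this(0.0, new String[0]); }

        public UrlData(double pageRank, String[] outboundUrls) {
            this.curPageRank = pageRank;
            this.outboundUrls = outboundUrls;
        }
    }

    // Hypothetical counter-example: only a one-argument constructor is declared.
    public static class BadData {
        final int value;

        public BadData(int value) { this.value = value; }
    }

    // Hypothetical helper (not a Crunch or Avro API): returns true if the
    // class declares a constructor that takes no parameters.
    public static boolean hasNoArgConstructor(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            if (c.getParameterCount() == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("UrlData: " + hasNoArgConstructor(UrlData.class));
        System.out.println("BadData: " + hasNoArgConstructor(BadData.class));
    }
}
```

The check covers only the constructor restriction; it does not validate that the field types have Avro equivalents.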


