beam-commits mailing list archives

From ieme...@apache.org
Subject [2/3] beam-site git commit: Regenerate website
Date Tue, 06 Jun 2017 07:38:48 GMT
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/f8d9fc15
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/f8d9fc15
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/f8d9fc15

Branch: refs/heads/asf-site
Commit: f8d9fc15b9c7c14f5229e529f8d2813a1c462adc
Parents: 8ff65fe
Author: Ismaël Mejía <iemejia@apache.org>
Authored: Tue Jun 6 09:33:30 2017 +0200
Committer: Ismaël Mejía <iemejia@apache.org>
Committed: Tue Jun 6 09:33:30 2017 +0200

----------------------------------------------------------------------
 .../documentation/io/built-in/hadoop/index.html | 31 ++++++++++++++++++++
 1 file changed, 31 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/f8d9fc15/content/documentation/io/built-in/hadoop/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/io/built-in/hadoop/index.html b/content/documentation/io/built-in/hadoop/index.html
index ed7ee0b..e323674 100644
--- a/content/documentation/io/built-in/hadoop/index.html
+++ b/content/documentation/io/built-in/hadoop/index.html
@@ -330,6 +330,37 @@
 
 <p>The <code class="highlighter-rouge">org.elasticsearch.hadoop.mr.EsInputFormat</code>’s
<code class="highlighter-rouge">EsInputFormat</code> key class is <code class="highlighter-rouge">org.apache.hadoop.io.Text</code>
<code class="highlighter-rouge">Text</code>, and its value class is <code class="highlighter-rouge">org.elasticsearch.hadoop.mr.LinkedMapWritable</code>
<code class="highlighter-rouge">LinkedMapWritable</code>. Both key and value classes
have Beam Coders.</p>
 
+<h3 id="hcatalog---hcatinputformat">HCatalog - HCatInputFormat</h3>
+
+<p>To read data using HCatalog, use <code class="highlighter-rouge">org.apache.hive.hcatalog.mapreduce.HCatInputFormat</code>,
which needs the following properties to be set:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span
class="n">Configuration</span> <span class="n">hcatConf</span> <span
class="o">=</span> <span class="k">new</span> <span class="n">Configuration</span><span
class="o">();</span>
+<span class="n">hcatConf</span><span class="o">.</span><span class="na">setClass</span><span
class="o">(</span><span class="s">"mapreduce.job.inputformat.class"</span><span
class="o">,</span> <span class="n">HCatInputFormat</span><span class="o">.</span><span
class="na">class</span><span class="o">,</span> <span class="n">InputFormat</span><span
class="o">.</span><span class="na">class</span><span class="o">);</span>
+<span class="n">hcatConf</span><span class="o">.</span><span class="na">setClass</span><span
class="o">(</span><span class="s">"key.class"</span><span class="o">,</span>
<span class="n">LongWritable</span><span class="o">.</span><span
class="na">class</span><span class="o">,</span> <span class="n">Object</span><span
class="o">.</span><span class="na">class</span><span class="o">);</span>
+<span class="n">hcatConf</span><span class="o">.</span><span class="na">setClass</span><span
class="o">(</span><span class="s">"value.class"</span><span class="o">,</span>
<span class="n">HCatRecord</span><span class="o">.</span><span
class="na">class</span><span class="o">,</span> <span class="n">Object</span><span
class="o">.</span><span class="na">class</span><span class="o">);</span>
+<span class="n">hcatConf</span><span class="o">.</span><span class="na">set</span><span
class="o">(</span><span class="s">"hive.metastore.uris"</span><span
class="o">,</span> <span class="s">"thrift://metastore-host:port"</span><span
class="o">);</span>
+
+<span class="n">org</span><span class="o">.</span><span class="na">apache</span><span
class="o">.</span><span class="na">hive</span><span class="o">.</span><span
class="na">hcatalog</span><span class="o">.</span><span class="na">mapreduce</span><span
class="o">.</span><span class="na">HCatInputFormat</span><span class="o">.</span><span
class="na">setInput</span><span class="o">(</span><span class="n">hcatConf</span><span
class="o">,</span> <span class="s">"my_database"</span><span class="o">,</span>
<span class="s">"my_table"</span><span class="o">,</span> <span
class="s">"my_filter"</span><span class="o">);</span>
+</code></pre>
+</div>
+
+<div class="language-py highlighter-rouge"><pre class="highlight"><code>
 <span class="c"># The Beam SDK for Python does not support Hadoop InputFormat IO.</span>
+</code></pre>
+</div>
+
+<p>Call the Read transform as follows:</p>
+
+<div class="language-java highlighter-rouge"><pre class="highlight"><code><span
class="n">PCollection</span><span class="o">&lt;</span><span class="n">KV</span><span
class="o">&lt;</span><span class="n">Long</span><span class="o">,</span>
<span class="n">HCatRecord</span><span class="o">&gt;&gt;</span>
<span class="n">hcatData</span> <span class="o">=</span>
+  <span class="n">p</span><span class="o">.</span><span class="na">apply</span><span
class="o">(</span><span class="s">"read"</span><span class="o">,</span>
+  <span class="n">HadoopInputFormatIO</span><span class="o">.&lt;</span><span
class="n">Long</span><span class="o">,</span> <span class="n">HCatRecord</span><span
class="o">&gt;</span><span class="n">read</span><span class="o">()</span>
+  <span class="o">.</span><span class="na">withConfiguration</span><span
class="o">(</span><span class="n">hcatConf</span><span class="o">));</span>
+</code></pre>
+</div>
+
+<div class="language-py highlighter-rouge"><pre class="highlight"><code>
 <span class="c"># The Beam SDK for Python does not support Hadoop InputFormat IO.</span>
+</code></pre>
+</div>
+
     </div>
     <footer class="footer">
   <div class="footer__contained">

