arrow-commits mailing list archives

From w...@apache.org
Subject [3/3] arrow-site git commit: Build website for 0.7.0
Date Tue, 19 Sep 2017 13:11:54 GMT
Build website for 0.7.0


Project: http://git-wip-us.apache.org/repos/asf/arrow-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow-site/commit/29105b5e
Tree: http://git-wip-us.apache.org/repos/asf/arrow-site/tree/29105b5e
Diff: http://git-wip-us.apache.org/repos/asf/arrow-site/diff/29105b5e

Branch: refs/heads/asf-site
Commit: 29105b5e17a396e96d0c6434e307d008ed3416c3
Parents: 5460ea7
Author: Wes McKinney <wes.mckinney@twosigma.com>
Authored: Tue Sep 19 09:11:46 2017 -0400
Committer: Wes McKinney <wes.mckinney@twosigma.com>
Committed: Tue Sep 19 09:11:46 2017 -0400

----------------------------------------------------------------------
 blog/2017/09/19/0.7.0-release/index.html | 298 ++++++++++++++++++++++++++
 blog/index.html                          | 191 +++++++++++++++++
 css/main.css                             |   2 +-
 docs/ipc.html                            |  37 ++--
 docs/memory_layout.html                  |  55 +++--
 docs/metadata.html                       |  27 ++-
 feed.xml                                 | 161 +++++++++++++-
 index.html                               |  27 ++-
 install/index.html                       |  37 +++-
 release/index.html                       |   1 +
 10 files changed, 770 insertions(+), 66 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow-site/blob/29105b5e/blog/2017/09/19/0.7.0-release/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/09/19/0.7.0-release/index.html b/blog/2017/09/19/0.7.0-release/index.html
new file mode 100644
index 0000000..e208aec
--- /dev/null
+++ b/blog/2017/09/19/0.7.0-release/index.html
@@ -0,0 +1,298 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head content must
come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing
List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Apache Arrow 0.7.0 Release
+      <a href="/blog/2017/09/19/0.7.0-release/" class="permalink" title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            19 Sep 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes
McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p>The Apache Arrow team is pleased to announce the 0.7.0 release. It includes
+<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.7.0"><strong>133
resolved JIRAs</strong></a>, with many new features and bug fixes across the various
+language implementations. The Arrow memory format has remained stable since the
+0.3.x release.</p>
+
+<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to
learn how to get the libraries for your
+platform. The <a href="http://arrow.apache.org/release/0.7.0.html">complete changelog</a>
is also available.</p>
+
+<p>We include some highlights from the release in this post.</p>
+
+<h2 id="new-pmc-member-kouhei-sutou">New PMC Member: Kouhei Sutou</h2>
+
+<p>Since the last release we have added <a href="https://github.com/kou">Kou</a>
to the Arrow Project Management
+Committee. He is also a PMC member for Apache Subversion and a major contributor to
+many other open source projects.</p>
+
+<p>As an active member of the Ruby community in Japan, Kou has been developing the
+GLib-based C bindings for Arrow with associated Ruby wrappers, to enable Ruby
+users to benefit from the work that’s happening in Apache Arrow.</p>
+
+<p>We are excited to be collaborating with the Ruby community on shared
+infrastructure for in-memory analytics and data science.</p>
+
+<h2 id="expanded-javascript-typescript-implementation">Expanded JavaScript (TypeScript)
Implementation</h2>
+
+<p><a href="https://github.com/trxcllnt">Paul Taylor</a> from the <a
href="https://github.com/netflix/falcor">Falcor</a> and <a href="http://reactivex.io">ReactiveX</a>
projects has worked to
+expand the JavaScript implementation (which is written in TypeScript), using
+the latest in modern JavaScript build and packaging technology. We are looking
+forward to building out the JS implementation and bringing it up to full
+functionality with the C++ and Java implementations.</p>
+
+<p>We are looking for more JavaScript developers to join the project and work
+together to make Arrow for JS work well with many kinds of front end use cases,
+like real time data visualization.</p>
+
+<h2 id="type-casting-for-c-and-python">Type casting for C++ and Python</h2>
+
+<p>As part of longer-term efforts to build an Arrow-native in-memory analytics
+library, we implemented a variety of type conversion functions. These functions
+are essential in ETL tasks when conforming one table schema to another. These
+are similar to the <code class="highlighter-rouge">astype</code> function in
NumPy.</p>
+
+<div class="language-python highlighter-rouge"><pre class="highlight"><code><span
class="n">In</span> <span class="p">[</span><span class="mi">17</span><span
class="p">]:</span> <span class="kn">import</span> <span class="nn">pyarrow</span>
<span class="kn">as</span> <span class="nn">pa</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">18</span><span
class="p">]:</span> <span class="n">arr</span> <span class="o">=</span>
<span class="n">pa</span><span class="o">.</span><span class="n">array</span><span
class="p">([</span><span class="bp">True</span><span class="p">,</span>
<span class="bp">False</span><span class="p">,</span> <span class="bp">None</span><span
class="p">,</span> <span class="bp">True</span><span class="p">])</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">19</span><span
class="p">]:</span> <span class="n">arr</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">19</span><span
class="p">]:</span>
+<span class="o">&lt;</span><span class="n">pyarrow</span><span
class="o">.</span><span class="n">lib</span><span class="o">.</span><span
class="n">BooleanArray</span> <span class="nb">object</span> <span
class="n">at</span> <span class="mh">0x7ff6fb069b88</span><span class="o">&gt;</span>
+<span class="p">[</span>
+  <span class="bp">True</span><span class="p">,</span>
+  <span class="bp">False</span><span class="p">,</span>
+  <span class="n">NA</span><span class="p">,</span>
+  <span class="bp">True</span>
+<span class="p">]</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">20</span><span
class="p">]:</span> <span class="n">arr</span><span class="o">.</span><span
class="n">cast</span><span class="p">(</span><span class="n">pa</span><span
class="o">.</span><span class="n">int32</span><span class="p">())</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">20</span><span
class="p">]:</span>
+<span class="o">&lt;</span><span class="n">pyarrow</span><span
class="o">.</span><span class="n">lib</span><span class="o">.</span><span
class="n">Int32Array</span> <span class="nb">object</span> <span class="n">at</span>
<span class="mh">0x7ff6fb0383b8</span><span class="o">&gt;</span>
+<span class="p">[</span>
+  <span class="mi">1</span><span class="p">,</span>
+  <span class="mi">0</span><span class="p">,</span>
+  <span class="n">NA</span><span class="p">,</span>
+  <span class="mi">1</span>
+<span class="p">]</span>
+</code></pre>
+</div>
+
+<p>Over time these will expand to support as many input-and-output type
+combinations as possible with optimized conversions.</p>
+
+<h2 id="new-arrow-gpu-cuda-extension-library-for-c">New Arrow GPU (CUDA) Extension
Library for C++</h2>
+
+<p>To help with GPU-related projects using Arrow, like the <a href="http://gpuopenanalytics.com/">GPU
Open Analytics
+Initiative</a>, we have started a C++ add-on library to simplify Arrow memory
+management on CUDA-enabled graphics cards. We would like to expand this to
+include a library of reusable CUDA kernel functions for GPU analytics on Arrow
+columnar memory.</p>
+
+<p>For example, we could write a record batch from CPU memory to GPU device memory
+like so (some error checking omitted):</p>
+
+<div class="language-c++ highlighter-rouge"><pre class="highlight"><code><span
class="cp">#include &lt;arrow/api.h&gt;
+#include &lt;arrow/gpu/cuda_api.h&gt;
+</span>
+<span class="k">using</span> <span class="k">namespace</span> <span
class="n">arrow</span><span class="p">;</span>
+
+<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span
class="o">*</span> <span class="n">manager</span><span class="p">;</span>
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">gpu</span><span class="o">::</span><span
class="n">CudaContext</span><span class="o">&gt;</span> <span
class="n">context</span><span class="p">;</span>
+
+<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span
class="o">::</span><span class="n">GetInstance</span><span class="p">(</span><span
class="o">&amp;</span><span class="n">manager</span><span class="p">)</span>
+<span class="n">manager_</span><span class="o">-&gt;</span><span
class="n">GetContext</span><span class="p">(</span><span class="n">kGpuNumber</span><span
class="p">,</span> <span class="o">&amp;</span><span class="n">context</span><span
class="p">);</span>
+
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">RecordBatch</span><span class="o">&gt;</span>
<span class="n">batch</span> <span class="o">=</span> <span class="n">GetCpuData</span><span
class="p">();</span>
+
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">gpu</span><span class="o">::</span><span
class="n">CudaBuffer</span><span class="o">&gt;</span> <span class="n">device_serialized</span><span
class="p">;</span>
+<span class="n">gpu</span><span class="o">::</span><span class="n">SerializeRecordBatch</span><span
class="p">(</span><span class="o">*</span><span class="n">batch</span><span
class="p">,</span> <span class="n">context</span><span class="p">.</span><span
class="n">get</span><span class="p">(),</span> <span class="o">&amp;</span><span
class="n">device_serialized</span><span class="p">);</span>
+</code></pre>
+</div>
+
+<p>We can then “read” the GPU record batch, but the returned <code class="highlighter-rouge">arrow::RecordBatch</code>
+internally will contain GPU device pointers that you can use for CUDA kernel
+calls:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>std::shared_ptr&lt;RecordBatch&gt; device_batch;
+gpu::ReadRecordBatch(batch-&gt;schema(), device_serialized,
+                     default_memory_pool(), &amp;device_batch);
+
+// Now run some CUDA kernels on device_batch
+</code></pre>
+</div>
+
+<h2 id="decimal-integration-tests">Decimal Integration Tests</h2>
+
+<p><a href="http://github.com/cpcloud">Phillip Cloud</a> has been working
on decimal support in C++ to enable Parquet
+read/write support in C++ and Python, and also end-to-end testing against the
+Arrow Java libraries.</p>
+
+<p>In the upcoming releases, we hope to complete the remaining data types that
+need end-to-end testing between Java and C++:</p>
+
+<ul>
+  <li>Fixed size lists (variable-size lists already implemented)</li>
+  <li>Fixed size binary</li>
+  <li>Unions</li>
+  <li>Maps</li>
+  <li>Time intervals</li>
+</ul>
+
+<h2 id="other-notable-python-changes">Other Notable Python Changes</h2>
+
+<p>Some highlights of Python development outside of bug fixes and general API
+improvements include:</p>
+
+<ul>
+  <li>Simplified <code class="highlighter-rouge">put</code> and <code
class="highlighter-rouge">get</code> for arbitrary Python objects in Plasma</li>
+  <li><a href="http://arrow.apache.org/docs/python/ipc.html">High-speed, memory
efficient object serialization</a>. This is important
+enough that we will likely write a dedicated blog post about it.</li>
+  <li>New <code class="highlighter-rouge">flavor='spark'</code> option
to <code class="highlighter-rouge">pyarrow.parquet.write_table</code> to enable
easy
+writing of Parquet files maximized for Spark compatibility</li>
+  <li><code class="highlighter-rouge">parquet.write_to_dataset</code> function
with support for partitioned writes</li>
+  <li>Improved support for Dask filesystems</li>
+  <li>Improved Python usability for IPC: read and write schemas and record batches
+more easily. See the <a href="http://arrow.apache.org/docs/python/api.html">API docs</a>
for more about these.</li>
+</ul>
+
+<h2 id="the-road-ahead">The Road Ahead</h2>
+
+<p>Upcoming Arrow releases will continue to expand the project to cover more use
+cases. In addition to completing end-to-end testing for all the major data
+types, some of us will be shifting attention to building Arrow-native in-memory
+analytics libraries.</p>
+
+<p>We are looking for more JavaScript, R, and other programming language
+developers to join the project and expand the available implementations and
+bindings to more languages.</p>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project
logo are either registered trademarks or trademarks of The Apache Software Foundation in the
United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/29105b5e/blog/index.html
----------------------------------------------------------------------
diff --git a/blog/index.html b/blog/index.html
index 7734649..27f27b0 100644
--- a/blog/index.html
+++ b/blog/index.html
@@ -110,6 +110,197 @@
     
   <div class="container">
     <h2>
+      Apache Arrow 0.7.0 Release
+      <a href="/blog/2017/09/19/0.7.0-release/" class="permalink" title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            19 Sep 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes
McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+    <!--
+
+-->
+
+<p>The Apache Arrow team is pleased to announce the 0.7.0 release. It includes
+<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.7.0"><strong>133
resolved JIRAs</strong></a>, with many new features and bug fixes across the various
+language implementations. The Arrow memory format has remained stable since the
+0.3.x release.</p>
+
+<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to
learn how to get the libraries for your
+platform. The <a href="http://arrow.apache.org/release/0.7.0.html">complete changelog</a>
is also available.</p>
+
+<p>We include some highlights from the release in this post.</p>
+
+<h2 id="new-pmc-member-kouhei-sutou">New PMC Member: Kouhei Sutou</h2>
+
+<p>Since the last release we have added <a href="https://github.com/kou">Kou</a>
to the Arrow Project Management
+Committee. He is also a PMC member for Apache Subversion and a major contributor to
+many other open source projects.</p>
+
+<p>As an active member of the Ruby community in Japan, Kou has been developing the
+GLib-based C bindings for Arrow with associated Ruby wrappers, to enable Ruby
+users to benefit from the work that’s happening in Apache Arrow.</p>
+
+<p>We are excited to be collaborating with the Ruby community on shared
+infrastructure for in-memory analytics and data science.</p>
+
+<h2 id="expanded-javascript-typescript-implementation">Expanded JavaScript (TypeScript)
Implementation</h2>
+
+<p><a href="https://github.com/trxcllnt">Paul Taylor</a> from the <a
href="https://github.com/netflix/falcor">Falcor</a> and <a href="http://reactivex.io">ReactiveX</a>
projects has worked to
+expand the JavaScript implementation (which is written in TypeScript), using
+the latest in modern JavaScript build and packaging technology. We are looking
+forward to building out the JS implementation and bringing it up to full
+functionality with the C++ and Java implementations.</p>
+
+<p>We are looking for more JavaScript developers to join the project and work
+together to make Arrow for JS work well with many kinds of front end use cases,
+like real time data visualization.</p>
+
+<h2 id="type-casting-for-c-and-python">Type casting for C++ and Python</h2>
+
+<p>As part of longer-term efforts to build an Arrow-native in-memory analytics
+library, we implemented a variety of type conversion functions. These functions
+are essential in ETL tasks when conforming one table schema to another. These
+are similar to the <code class="highlighter-rouge">astype</code> function in
NumPy.</p>
+
+<div class="language-python highlighter-rouge"><pre class="highlight"><code><span
class="n">In</span> <span class="p">[</span><span class="mi">17</span><span
class="p">]:</span> <span class="kn">import</span> <span class="nn">pyarrow</span>
<span class="kn">as</span> <span class="nn">pa</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">18</span><span
class="p">]:</span> <span class="n">arr</span> <span class="o">=</span>
<span class="n">pa</span><span class="o">.</span><span class="n">array</span><span
class="p">([</span><span class="bp">True</span><span class="p">,</span>
<span class="bp">False</span><span class="p">,</span> <span class="bp">None</span><span
class="p">,</span> <span class="bp">True</span><span class="p">])</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">19</span><span
class="p">]:</span> <span class="n">arr</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">19</span><span
class="p">]:</span>
+<span class="o">&lt;</span><span class="n">pyarrow</span><span
class="o">.</span><span class="n">lib</span><span class="o">.</span><span
class="n">BooleanArray</span> <span class="nb">object</span> <span
class="n">at</span> <span class="mh">0x7ff6fb069b88</span><span class="o">&gt;</span>
+<span class="p">[</span>
+  <span class="bp">True</span><span class="p">,</span>
+  <span class="bp">False</span><span class="p">,</span>
+  <span class="n">NA</span><span class="p">,</span>
+  <span class="bp">True</span>
+<span class="p">]</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">20</span><span
class="p">]:</span> <span class="n">arr</span><span class="o">.</span><span
class="n">cast</span><span class="p">(</span><span class="n">pa</span><span
class="o">.</span><span class="n">int32</span><span class="p">())</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">20</span><span
class="p">]:</span>
+<span class="o">&lt;</span><span class="n">pyarrow</span><span
class="o">.</span><span class="n">lib</span><span class="o">.</span><span
class="n">Int32Array</span> <span class="nb">object</span> <span class="n">at</span>
<span class="mh">0x7ff6fb0383b8</span><span class="o">&gt;</span>
+<span class="p">[</span>
+  <span class="mi">1</span><span class="p">,</span>
+  <span class="mi">0</span><span class="p">,</span>
+  <span class="n">NA</span><span class="p">,</span>
+  <span class="mi">1</span>
+<span class="p">]</span>
+</code></pre>
+</div>
+
+<p>Over time these will expand to support as many input-and-output type
+combinations as possible with optimized conversions.</p>
+
+<h2 id="new-arrow-gpu-cuda-extension-library-for-c">New Arrow GPU (CUDA) Extension
Library for C++</h2>
+
+<p>To help with GPU-related projects using Arrow, like the <a href="http://gpuopenanalytics.com/">GPU
Open Analytics
+Initiative</a>, we have started a C++ add-on library to simplify Arrow memory
+management on CUDA-enabled graphics cards. We would like to expand this to
+include a library of reusable CUDA kernel functions for GPU analytics on Arrow
+columnar memory.</p>
+
+<p>For example, we could write a record batch from CPU memory to GPU device memory
+like so (some error checking omitted):</p>
+
+<div class="language-c++ highlighter-rouge"><pre class="highlight"><code><span
class="cp">#include &lt;arrow/api.h&gt;
+#include &lt;arrow/gpu/cuda_api.h&gt;
+</span>
+<span class="k">using</span> <span class="k">namespace</span> <span
class="n">arrow</span><span class="p">;</span>
+
+<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span
class="o">*</span> <span class="n">manager</span><span class="p">;</span>
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">gpu</span><span class="o">::</span><span
class="n">CudaContext</span><span class="o">&gt;</span> <span
class="n">context</span><span class="p">;</span>
+
+<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span
class="o">::</span><span class="n">GetInstance</span><span class="p">(</span><span
class="o">&amp;</span><span class="n">manager</span><span class="p">)</span>
+<span class="n">manager_</span><span class="o">-&gt;</span><span
class="n">GetContext</span><span class="p">(</span><span class="n">kGpuNumber</span><span
class="p">,</span> <span class="o">&amp;</span><span class="n">context</span><span
class="p">);</span>
+
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">RecordBatch</span><span class="o">&gt;</span>
<span class="n">batch</span> <span class="o">=</span> <span class="n">GetCpuData</span><span
class="p">();</span>
+
+<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span
class="o">&lt;</span><span class="n">gpu</span><span class="o">::</span><span
class="n">CudaBuffer</span><span class="o">&gt;</span> <span class="n">device_serialized</span><span
class="p">;</span>
+<span class="n">gpu</span><span class="o">::</span><span class="n">SerializeRecordBatch</span><span
class="p">(</span><span class="o">*</span><span class="n">batch</span><span
class="p">,</span> <span class="n">context</span><span class="p">.</span><span
class="n">get</span><span class="p">(),</span> <span class="o">&amp;</span><span
class="n">device_serialized</span><span class="p">);</span>
+</code></pre>
+</div>
+
+<p>We can then “read” the GPU record batch, but the returned <code class="highlighter-rouge">arrow::RecordBatch</code>
+internally will contain GPU device pointers that you can use for CUDA kernel
+calls:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>std::shared_ptr&lt;RecordBatch&gt; device_batch;
+gpu::ReadRecordBatch(batch-&gt;schema(), device_serialized,
+                     default_memory_pool(), &amp;device_batch);
+
+// Now run some CUDA kernels on device_batch
+</code></pre>
+</div>
+
+<h2 id="decimal-integration-tests">Decimal Integration Tests</h2>
+
+<p><a href="http://github.com/cpcloud">Phillip Cloud</a> has been working
on decimal support in C++ to enable Parquet
+read/write support in C++ and Python, and also end-to-end testing against the
+Arrow Java libraries.</p>
+
+<p>In the upcoming releases, we hope to complete the remaining data types that
+need end-to-end testing between Java and C++:</p>
+
+<ul>
+  <li>Fixed size lists (variable-size lists already implemented)</li>
+  <li>Fixed size binary</li>
+  <li>Unions</li>
+  <li>Maps</li>
+  <li>Time intervals</li>
+</ul>
+
+<h2 id="other-notable-python-changes">Other Notable Python Changes</h2>
+
+<p>Some highlights of Python development outside of bug fixes and general API
+improvements include:</p>
+
+<ul>
+  <li>Simplified <code class="highlighter-rouge">put</code> and <code
class="highlighter-rouge">get</code> for arbitrary Python objects in Plasma</li>
+  <li><a href="http://arrow.apache.org/docs/python/ipc.html">High-speed, memory
efficient object serialization</a>. This is important
+enough that we will likely write a dedicated blog post about it.</li>
+  <li>New <code class="highlighter-rouge">flavor='spark'</code> option
to <code class="highlighter-rouge">pyarrow.parquet.write_table</code> to enable
easy
+writing of Parquet files maximized for Spark compatibility</li>
+  <li><code class="highlighter-rouge">parquet.write_to_dataset</code> function
with support for partitioned writes</li>
+  <li>Improved support for Dask filesystems</li>
+  <li>Improved Python usability for IPC: read and write schemas and record batches
+more easily. See the <a href="http://arrow.apache.org/docs/python/api.html">API docs</a>
for more about these.</li>
+</ul>
+
+<h2 id="the-road-ahead">The Road Ahead</h2>
+
+<p>Upcoming Arrow releases will continue to expand the project to cover more use
+cases. In addition to completing end-to-end testing for all the major data
+types, some of us will be shifting attention to building Arrow-native in-memory
+analytics libraries.</p>
+
+<p>We are looking for more JavaScript, R, and other programming language
+developers to join the project and expand the available implementations and
+bindings to more languages.</p>
+
+
+  </div>
+
+  
+
+  
+    
+  <div class="container">
+    <h2>
       Apache Arrow 0.6.0 Release
       <a href="/blog/2017/08/16/0.6.0-release/" class="permalink" title="Permalink">∞</a>
     </h2>

