drill-commits mailing list archives

From bridg...@apache.org
Subject drill-site git commit: Drill doc updates and edits
Date Tue, 11 Aug 2015 22:06:04 GMT
Repository: drill-site
Updated Branches:
  refs/heads/asf-site 0ed9349ff -> 997b4e86c


Drill doc updates and edits


Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/997b4e86
Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/997b4e86
Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/997b4e86

Branch: refs/heads/asf-site
Commit: 997b4e86c0328f9cd187f1f37c697da53e5fc234
Parents: 0ed9349
Author: Bridget Bevens <bbevens@maprtech.com>
Authored: Tue Aug 11 15:05:50 2015 -0700
Committer: Bridget Bevens <bbevens@maprtech.com>
Committed: Tue Aug 11 15:05:50 2015 -0700

----------------------------------------------------------------------
 docs/configuring-user-impersonation/index.html |   4 +-
 docs/core-modules/index.html                   |   8 +--
 docs/data-type-conversion/index.html           |   8 +--
 docs/hive-storage-plugin/index.html            |   2 +-
 docs/img/58.png                                | Bin 35404 -> 35310 bytes
 docs/img/DrillbitModules.png                   | Bin 54907 -> 54615 bytes
 docs/img/drill_imp_simple.PNG                  | Bin 0 -> 34681 bytes
 docs/parquet-format/index.html                 |  15 +++++
 docs/querying-parquet-files/index.html         |  12 ++--
 docs/supported-data-types/index.html           |  70 ++++++++------------
 feed.xml                                       |   4 +-
 11 files changed, 62 insertions(+), 61 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/configuring-user-impersonation/index.html
----------------------------------------------------------------------
diff --git a/docs/configuring-user-impersonation/index.html b/docs/configuring-user-impersonation/index.html
index bbfbc76..d8316fa 100644
--- a/docs/configuring-user-impersonation/index.html
+++ b/docs/configuring-user-impersonation/index.html
@@ -1088,9 +1088,9 @@ ALTER SYSTEM SET `new_view_default_permissions` = &#39;&lt;octal_code&gt;&#39;;
 
 <p>The following example depicts a scenario where the maximum hop number is set to
3, and Drill must impersonate three users to access data when Chad queries a view that Jane
created:</p>
 
-<p><img src="/docs/img/user_hops_no_join.PNG" alt=""></p>
+<p><img src="/docs/img/drill_imp_simple.PNG" alt=""></p>
 
-<p>In the previous example, Joe created V3 from the views that user Frank created.
In the following example, Joe created V3 by joining a view that Frank created with a view
that Bob created. </p>
+<p>In the previous example, Joe created V2 from the view that user Frank created. In
the following example, Joe created V3 by joining a view that Frank created with a view that
Bob created. </p>
 
 <p><img src="/docs/img/user_hops_joined_view.PNG" alt="">  </p>
 

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/core-modules/index.html
----------------------------------------------------------------------
diff --git a/docs/core-modules/index.html b/docs/core-modules/index.html
index 15bbba9..dea232b 100644
--- a/docs/core-modules/index.html
+++ b/docs/core-modules/index.html
@@ -1013,7 +1013,7 @@
 
 <ul>
 <li><p><strong>RPC end point</strong>: Drill exposes a low overhead
protobuf-based RPC protocol to communicate with the clients. Additionally, a C++ and Java
API layers are also available for the client applications to interact with Drill. Clients
can communicate to a specific Drillbit directly or go through a ZooKeeper quorum to discover
the available Drillbits before submitting queries. It is recommended that the clients always
go through ZooKeeper to shield clients from the intricacies of cluster management, such as
the addition or removal of nodes. </p></li>
-<li><p><strong>SQL parser</strong>: Drill uses Optiq, the open source
framework, to parse incoming queries. The output of the parser component is a language agnostic,
computer-friendly logical plan that represents the query. </p></li>
+<li><p><strong>SQL parser</strong>: Drill uses <a href="https://calcite.incubator.apache.org/">Calcite</a>,
the open source framework, to parse incoming queries. The output of the parser component is
a language agnostic, computer-friendly logical plan that represents the query. </p></li>
 <li><p><strong>Storage plugin interfaces</strong>: Drill serves as
a query layer on top of several data sources. Storage plugins in Drill represent the abstractions
that Drill uses to interact with the data sources. Storage plugins provide Drill with the
following information:</p>
 
 <ul>
@@ -1023,11 +1023,11 @@
 </ul>
 
 <p>In the context of Hadoop, Drill provides storage plugins for files and
-HBase/M7. Drill also integrates with Hive as a storage plugin since Hive
-provides a metadata abstraction layer on top of files, HBase/M7, and provides
+HBase. Drill also integrates with Hive as a storage plugin since Hive
+provides a metadata abstraction layer on top of files, HBase, and provides
 libraries to read data and operate on these sources (Serdes and UDFs).</p>
 
-<p>When users query files and HBase/M7 with Drill, they can do it directly or go
+<p>When users query files and HBase with Drill, they can do it directly or go
 through Hive if they have metadata defined there. Drill integration with Hive
 is only for metadata. Drill does not invoke the Hive execution engine for any
 requests.</p></li>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/data-type-conversion/index.html
----------------------------------------------------------------------
diff --git a/docs/data-type-conversion/index.html b/docs/data-type-conversion/index.html
index 59854b9..cf18601 100644
--- a/docs/data-type-conversion/index.html
+++ b/docs/data-type-conversion/index.html
@@ -1390,7 +1390,7 @@ FROM tmp.`json2parquet2`;
 </code></pre></div>
 <p><em>expression</em> is a byte array, such as {(byte)0xca, (byte)0xfe,
(byte)0xba, (byte)0xbe}</p>
 
-<p>This function returns a hexadecimal string, such as &quot;\xca\xfe\xba\xbe&quot;.
You can use this function with CONVERT_TO when you want to test the effects of a conversion.</p>
+<p>This function returns a hexadecimal string, such as <code>&quot;\xca\xfe\xba\xbe&quot;</code>.
You can use this function with CONVERT_TO when you want to test the effects of a conversion.</p>
 
 <h3 id="string_binary-examples">STRING_BINARY Examples</h3>
 <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT
@@ -1428,7 +1428,7 @@ FROM (VALUES (1));
 <h3 id="binary_string-syntax">BINARY_STRING Syntax</h3>
 <div class="highlight"><pre><code class="language-text" data-lang="text">BINARY_STRING(expression)
 </code></pre></div>
-<p><em>expression</em> is a hexadecimal string, such as &quot;\xca\xfe\xba\xbe&quot;.</p>
+<p><em>expression</em> is a hexadecimal string, such as <code>&quot;\xca\xfe\xba\xbe&quot;</code>.</p>
 
 <p>This function returns a byte array, such as {(byte)0xca, (byte)0xfe, (byte)0xba,
(byte)0xbe}. You can use this function with CONVERT_FROM for readable results.</p>
 
@@ -1441,11 +1441,11 @@ FROM (VALUES (1));
 </code></pre></div>
 <p><em>expression</em> is a byte array, such as {(byte)0xca, (byte)0xfe,
(byte)0xba, (byte)0xbe}.</p>
 
-<p>This function returns a hexadecimal-encoded string, such as &quot;\xca\xfe\xba\xbe&quot;.
You can use this function with CONVERT_TO for readable results.</p>
+<p>This function returns a hexadecimal-encoded string, such as <code>&quot;\xca\xfe\xba\xbe&quot;</code>.
You can use this function with CONVERT_TO for readable results.</p>
 
 <h3 id="binary_string-examples">BINARY_STRING Examples</h3>
 
-<p>Decode the hexadecimal string 000000C8 expressed in four octets \x00\x00\x00\xC8
into its big endian four-byte integer equivalent. </p>
+<p>Decode the hexadecimal string 000000C8 expressed in four octets <code>\x00\x00\x00\xC8</code>
into its big endian four-byte integer equivalent. </p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT
CONVERT_FROM(BINARY_STRING(&#39;\x00\x00\x00\xC8&#39;), &#39;INT_BE&#39;)
AS cnvrt
 FROM (VALUES (1));
 </code></pre></div>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/hive-storage-plugin/index.html
----------------------------------------------------------------------
diff --git a/docs/hive-storage-plugin/index.html b/docs/hive-storage-plugin/index.html
index 3214649..499348c 100644
--- a/docs/hive-storage-plugin/index.html
+++ b/docs/hive-storage-plugin/index.html
@@ -1093,7 +1093,7 @@ steps:</p>
     }
   }
 </code></pre></div></li>
-<li><p>Change the <code>&quot;fs.default.name&quot;</code>
attribute to specify the default location of files. The value needs to be a URI that is available
and capable of handling file system requests. For example, change the local file system URI
<code>&quot;file:///&quot;</code> to the HDFS URI: <code>hdfs://</code>,
or to the path on HDFS with a namenode: <code>hdfs://&lt;authority&gt;:&lt;port&gt;</code></p></li>
+<li><p>Change the <code>&quot;fs.default.name&quot;</code>
attribute to specify the default location of files. The value needs to be a URI that is available
and capable of handling file system requests. For example, change the local file system URI
<code>&quot;file:///&quot;</code> to the HDFS URI: <code>hdfs://</code>,
or to the path on HDFS with a namenode: <code>hdfs://&lt;hostname&gt;:&lt;port&gt;</code></p></li>
 <li><p>Click <strong>Enable</strong>.</p></li>
 </ol>
 

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/img/58.png
----------------------------------------------------------------------
diff --git a/docs/img/58.png b/docs/img/58.png
index b957927..3975d86 100644
Binary files a/docs/img/58.png and b/docs/img/58.png differ

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/img/DrillbitModules.png
----------------------------------------------------------------------
diff --git a/docs/img/DrillbitModules.png b/docs/img/DrillbitModules.png
index 2eb9904..4ebdee4 100644
Binary files a/docs/img/DrillbitModules.png and b/docs/img/DrillbitModules.png differ

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/img/drill_imp_simple.PNG
----------------------------------------------------------------------
diff --git a/docs/img/drill_imp_simple.PNG b/docs/img/drill_imp_simple.PNG
new file mode 100644
index 0000000..747e6f1
Binary files /dev/null and b/docs/img/drill_imp_simple.PNG differ

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/parquet-format/index.html
----------------------------------------------------------------------
diff --git a/docs/parquet-format/index.html b/docs/parquet-format/index.html
index b224240..0ba00c6 100644
--- a/docs/parquet-format/index.html
+++ b/docs/parquet-format/index.html
@@ -1028,6 +1028,21 @@
 
 <p>When a read of Parquet data occurs, Drill loads only the necessary columns of data,
which reduces I/O. Reading only a small piece of the Parquet data from a data file or table,
Drill can examine and analyze all values for a column across multiple files. You can create
a Drill table from one format and store the data in another format, including Parquet.</p>
 
+<!-- ## Caching Metadata
+
+For performant querying of a large number of files, Drill can take advantage of metadata,
such as the Hive metadata store, and includes the capability of generating a metadata cache
for performant querying of thousands of Parquet files. The metadata cache is not a central
caching system, but simply one or more files of metadata. Drill generates and saves a cache
of metadata in each directory in nested directories. You trigger the generation of metadata
caches by running the REFRESH TABLE METADATA command, as described in [Querying Parquet Files](/docs/querying-parquet-files/).
+
+After generating the metadata cache, Drill performs the following tasks during the planning
phase for a query on a directory of Parquet files:
+
+* Finds files.  
+* Recurses directories.  
+* Reads the footers of files to get information, such as row counts and HDFS block locations
for every file for Drill to assign work based on locality.  
+  When Drill reads the file, it attempts to execute the query on the node where the data
rests.  
+* Summarizes the information from the footers in a single metadata cache file.  
+* Stores the metadata cache file at each level that covers that particular level and all
lower levels.
+
+At execution time, Drill reads the actual files. At planning time, Drill reads only the metadata
file. -->
+
 <h2 id="writing-parquet-files">Writing Parquet Files</h2>
 
 <p>CREATE TABLE AS (CTAS) can use any data source provided by the storage plugin. To
write Parquet data using the CTAS command, set the session store.format option as shown in
the next section. Alternatively, configure the storage plugin to point to the directory containing
the Parquet files.</p>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/querying-parquet-files/index.html
----------------------------------------------------------------------
diff --git a/docs/querying-parquet-files/index.html b/docs/querying-parquet-files/index.html
index bb4dd9c..dd358ed 100644
--- a/docs/querying-parquet-files/index.html
+++ b/docs/querying-parquet-files/index.html
@@ -1007,17 +1007,19 @@
 
     <div class="int_text" align="left">
       
-        <!-- Drill 1.2 extends SQL for performant querying of Parquet files. By including
a command in a query to cache Parquet file metadata, you trigger the generation of metadata
files in the directory of Parquet files and its subdirectories. 
+        <!-- Drill extends SQL for performant querying of a large number, thousands or
more, of Parquet files. By including the following command in a query, you trigger the generation
of metadata files in the directory of Parquet files and its subdirectories:
 
-You need to include the command in only the first query of a file or directory. Subsequent
can queries return results quickly because Drill refers to the cached metadata. Drill updates
metadata automatically when the Parquet files change by comparing timestamps of data. 
+    REFRESH TABLE METADATA <path to table>
 
-To generate metadata, use the following command:
+You need to include the command in only the first query of a file or directory. Subsequent
queries return results quickly because Drill refers to the metadata saved in the cache, as
described in [Reading Parquet Files](/docs/parquet-format/#reading-parquet-files). 
+
+You can query nested directories from any level. For example, you can query a sub-sub-directory
of Parquet files because Drill stores a metadata cache of information at each level that covers
that particular level and all lower levels. 
 
 ## Example of Generating Parquet Metadata
 
- -->
+TBD (fill in when the feature is ready)
 
-<h2 id="sample-parquet-files">Sample Parquet Files</h2>
+## Sample Parquet Files -->
 
 <p>The Drill installation includes a <code>sample-data</code> directory
with Parquet files
 that you can query. Use SQL syntax to query the <code>region.parquet</code> and

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/docs/supported-data-types/index.html
----------------------------------------------------------------------
diff --git a/docs/supported-data-types/index.html b/docs/supported-data-types/index.html
index af365f2..da1e0c2 100644
--- a/docs/supported-data-types/index.html
+++ b/docs/supported-data-types/index.html
@@ -1007,7 +1007,7 @@
 
     <div class="int_text" align="left">
       
-        <p>Drill reads from and writes to data sources having a wide variety of types.
Drill uses data types at the RPC level that are not supported for query input, often implicitly
casting data. Drill supports the following SQL data types for query input:</p>
+        <p>Drill reads from and writes to data sources having a wide variety of types.
Drill uses data types at the RPC level that are not supported for query input, such as INTERVALDAY
and INTERVALYEAR types, often implicitly casting data. Drill supports the following SQL data
types for query input:</p>
 
 <table><thead>
 <tr>
@@ -1057,14 +1057,9 @@
 <td>2147483646</td>
 </tr>
 <tr>
-<td>INTERVALDAY</td>
-<td>A period of time in days, hours, minutes, and seconds only</td>
-<td>&#39;1 10:20:30.123&#39;</td>
-</tr>
-<tr>
-<td>INTERVALYEAR</td>
-<td>A period of time in years and months only</td>
-<td>&#39;1-2&#39; year to month</td>
+<td>INTERVAL</td>
+<td>A period of time in days, hours, minutes, and seconds only (INTERVALDAY) or in
years and months (INTERVALYEAR)</td>
+<td>&#39;1 10:20:30.123&#39; (INTERVALDAY) or &#39;1-2&#39; year to
month (INTERVALYEAR)</td>
 </tr>
 <tr>
 <td>SMALLINT**</td>
@@ -1172,7 +1167,7 @@ Implicitly casts all textual data to VARCHAR.</li>
 <p>The following list includes data types Drill uses in descending order of precedence.
Casting precedence shown in the following table applies to the implicit casting that Drill
performs. For example, Drill might implicitly cast data when a query includes a function or
filter on mismatched data types:</p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT
myBigInt FROM mytable WHERE myBigInt = 2.5;
 </code></pre></div>
-<p>As shown in the table, Drill can cast a NULL value, which has the lowest precedence,
to any other type; you can cast a SMALLINT (not supported in this release) value to INT. Drill
might deviate from these precedence rules for performance reasons. Under certain circumstances,
such as queries involving SUBSTR and CONCAT functions, Drill reverses the order of precedence
and allows a cast to VARCHAR from a type of higher precedence than VARCHAR, such as BIGINT.</p>
+<p>As shown in the table, Drill can cast a NULL value, which has the lowest precedence,
to any other type; you can cast a SMALLINT (not supported in this release) value to INT. Drill
might deviate from these precedence rules for performance reasons. Under certain circumstances,
such as queries involving SUBSTR and CONCAT functions, Drill reverses the order of precedence
and allows a cast to VARCHAR from a type of higher precedence than VARCHAR, such as BIGINT.
The INTERVALDAY and INTERVALYEAR types are internal types.</p>
 
 <h3 id="casting-precedence">Casting Precedence</h3>
 
@@ -1187,68 +1182,68 @@ Implicitly casts all textual data to VARCHAR.</li>
 <tr>
 <td>1</td>
 <td>INTERVALYEAR (highest)</td>
-<td>11</td>
-<td>INT</td>
+<td>12</td>
+<td>UINT2</td>
 </tr>
 <tr>
 <td>2</td>
 <td>INTERVALDAY</td>
-<td>12</td>
-<td>UINT2</td>
+<td>13</td>
+<td>SMALLINT*</td>
 </tr>
 <tr>
 <td>3</td>
 <td>TIMESTAMP</td>
-<td>13</td>
-<td>SMALLINT*</td>
+<td>14</td>
+<td>UINT1</td>
 </tr>
 <tr>
 <td>4</td>
 <td>DATE</td>
-<td>14</td>
-<td>UINT1</td>
+<td>15</td>
+<td>VAR16CHAR</td>
 </tr>
 <tr>
 <td>5</td>
 <td>TIME</td>
-<td>15</td>
-<td>VAR16CHAR</td>
+<td>16</td>
+<td>FIXED16CHAR</td>
 </tr>
 <tr>
 <td>6</td>
 <td>DOUBLE</td>
-<td>16</td>
-<td>FIXED16CHAR</td>
+<td>17</td>
+<td>VARCHAR</td>
 </tr>
 <tr>
 <td>7</td>
 <td>DECIMAL</td>
-<td>17</td>
-<td>VARCHAR</td>
+<td>18</td>
+<td>CHAR</td>
 </tr>
 <tr>
 <td>8</td>
 <td>UINT8</td>
-<td>18</td>
-<td>CHAR</td>
+<td>19</td>
+<td>VARBINARY</td>
 </tr>
 <tr>
 <td>9</td>
 <td>BIGINT</td>
-<td>19</td>
-<td>VARBINARY</td>
+<td>20</td>
+<td>FIXEDBINARY</td>
 </tr>
 <tr>
 <td>10</td>
 <td>UINT4</td>
-<td>20</td>
-<td>FIXEDBINARY</td>
+<td>21</td>
+<td>NULL (lowest)</td>
 </tr>
 <tr>
+<td>11</td>
+<td>INT</td>
 <td></td>
 <td></td>
-<td>21</td>
-<td>NULL (lowest)</td>
 </tr>
 </tbody></table>
 
@@ -1448,7 +1443,6 @@ Converts a string to TIMESTAMP.</li>
 <th>DATE</th>
 <th>TIME</th>
 <th>TIMESTAMP</th>
-<th>INTERVALDAY</th>
 <th>INTERVALYEAR</th>
 <th>INTERVALDAY</th>
 </tr>
@@ -1460,7 +1454,6 @@ Converts a string to TIMESTAMP.</li>
 <td></td>
 <td></td>
 <td></td>
-<td></td>
 </tr>
 <tr>
 <td>CHAR</td>
@@ -1469,7 +1462,6 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>Yes</td>
 <td>Yes</td>
-<td>Yes</td>
 </tr>
 <tr>
 <td>FIXEDBINARY*</td>
@@ -1478,7 +1470,6 @@ Converts a string to TIMESTAMP.</li>
 <td>No</td>
 <td>No</td>
 <td>No</td>
-<td>No</td>
 </tr>
 <tr>
 <td>VARCHAR</td>
@@ -1487,7 +1478,6 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>Yes</td>
 <td>Yes</td>
-<td>Yes</td>
 </tr>
 <tr>
 <td>VARBINARY*</td>
@@ -1496,7 +1486,6 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>No</td>
 <td>No</td>
-<td>No</td>
 </tr>
 <tr>
 <td>DATE</td>
@@ -1505,7 +1494,6 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>No</td>
 <td>No</td>
-<td>No</td>
 </tr>
 <tr>
 <td>TIME</td>
@@ -1514,7 +1502,6 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>No</td>
 <td>No</td>
-<td>No</td>
 </tr>
 <tr>
 <td>TIMESTAMP</td>
@@ -1523,14 +1510,12 @@ Converts a string to TIMESTAMP.</li>
 <td>Yes</td>
 <td>No</td>
 <td>No</td>
-<td>No</td>
 </tr>
 <tr>
 <td>INTERVALYEAR</td>
 <td>Yes</td>
 <td>No</td>
 <td>Yes</td>
-<td>Yes</td>
 <td>No</td>
 <td>Yes</td>
 </tr>
@@ -1540,7 +1525,6 @@ Converts a string to TIMESTAMP.</li>
 <td>No</td>
 <td>Yes</td>
 <td>Yes</td>
-<td>Yes</td>
 <td>No</td>
 </tr>
 </tbody></table>

http://git-wip-us.apache.org/repos/asf/drill-site/blob/997b4e86/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index 73d22bb..ad6fa9f 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 </description>
     <link>/</link>
     <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Fri, 07 Aug 2015 15:55:52 -0700</pubDate>
-    <lastBuildDate>Fri, 07 Aug 2015 15:55:52 -0700</lastBuildDate>
+    <pubDate>Tue, 11 Aug 2015 14:56:15 -0700</pubDate>
+    <lastBuildDate>Tue, 11 Aug 2015 14:56:15 -0700</lastBuildDate>
     <generator>Jekyll v2.5.2</generator>
     
       <item>

