drill-commits mailing list archives

From bridg...@apache.org
Subject [1/2] drill-site git commit: Drill edits for 1.1
Date Fri, 17 Jul 2015 01:22:58 GMT
Repository: drill-site
Updated Branches:
  refs/heads/asf-site 94a79d4ed -> d0ba86f3b


http://git-wip-us.apache.org/repos/asf/drill-site/blob/d0ba86f3/docs/text-files-csv-tsv-psv/index.html
----------------------------------------------------------------------
diff --git a/docs/text-files-csv-tsv-psv/index.html b/docs/text-files-csv-tsv-psv/index.html
index d7dc8c3..4c411e1 100644
--- a/docs/text-files-csv-tsv-psv/index.html
+++ b/docs/text-files-csv-tsv-psv/index.html
@@ -1030,72 +1030,18 @@ VARCHARS, rather than individual columns. While parquet supports and
Drill reads
 
 <h2 id="configuring-drill-to-read-text-files">Configuring Drill to Read Text Files</h2>
 
-<p>In the storage plugin configuration, you can set the following attributes that affect
how Drill reads CSV, TSV, PSV (comma-, tab-, pipe-separated) files.  </p>
+<p>In the storage plugin configuration, you <a href="/docs/plugin-configuration-basics/#list-of-attributes-and-definitions">set
the attributes</a> that affect how Drill reads CSV, TSV, PSV (comma-, tab-, pipe-separated)
files:  </p>
 
 <ul>
-<li>String lineDelimiter = &quot;\n&quot;;<br>
-One or more characters used to denote a new record. Allows reading files with windows line
endings.<br></li>
-<li>char fieldDelimiter = &#39;,&#39;;<br>
-A single character used to separate each value.<br></li>
-<li>char quote = &#39;&quot;&#39;;<br>
-A single character used to start/end a value enclosed in quotation marks.<br></li>
-<li>char escape = &#39;&quot;&#39;;<br>
-A single character used to escape a quotation mark inside of a value.<br></li>
-<li>char comment = &#39;#&#39;;<br>
-A single character used to denote a comment line.<br></li>
-<li>boolean skipFirstLine = false;<br>
-Set to true to avoid reading headers as data. </li>
+<li>comment<br></li>
+<li>escape<br></li>
+<li>delimiter<br></li>
+<li>quote<br></li>
+<li>skipFirstLine</li>
 </ul>
 
-<p>For more information about storage plugin configuration, see <a href="/docs/plugin-configuration-basics/#list-of-attributes-and-definitions">&quot;List
of Attributes and Definitions&quot;</a>.</p>
+<p>Before using these attributes, confirm that the <code>exec.storage.enable_new_text_reader</code>
option in <code>sys.options</code> is set to true (the default). </p>
 
-<p>You can deal with a mix of text files with and without headers either by creating
two separate format plugins or by creating two format plugins within the same storage plugin.
The former approach is typically easier than the latter.</p>
-
-<h3 id="creating-two-separate-format-plugins">Creating Two Separate Format Plugins</h3>
-
-<p>Format plugins are associated with a particular storage plugin. Storage plugins
define a root directory that Drill targets when using the storage plugin. You can define separate
storage plugins for different root directories, and define each of the format attributes to
match the files stored below that directory. All files can use the .csv extension, as shown
in the following example:</p>
-
-<p>Storage Plugin A</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
-  &quot;type&quot;: &quot;text&quot;,
-  &quot;extensions&quot;: [
-    &quot;csv&quot;
-  ],
-  &quot;delimiter&quot;: &quot;,&quot;
-},
-. . .
-</code></pre></div>
-<p>Storage Plugin B</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
-  &quot;type&quot;: &quot;text&quot;,
-  &quot;extensions&quot;: [
-    &quot;csv&quot;
-  ],
-  &quot;comment&quot;: &quot;&amp;&quot;,
-  &quot;skipFirstLine&quot;: true,
-  &quot;delimiter&quot;: &quot;,&quot;
-},
-</code></pre></div>
-<h3 id="creating-two-format-plugins-within-the-same-storage-plugin">Creating Two Format
Plugins within the Same Storage Plugin</h3>
-
-<p>Give a different extension to files with a header and to files without a header,
and use a storage plugin that looks something like the following example. This method requires
renaming some files to use the csv2 extension, as shown in the following example:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
-  &quot;type&quot;: &quot;text&quot;,
-  &quot;extensions&quot;: [
-    &quot;csv&quot;
-  ],
-  &quot;delimiter&quot;: &quot;,&quot;
-},
-&quot;csv_with_header&quot;: {
-  &quot;type&quot;: &quot;text&quot;,
-  &quot;extensions&quot;: [
-    &quot;csv2&quot;
-  ],
-  &quot;comment&quot;: &quot;&amp;&quot;,
-  &quot;skipFirstLine&quot;: true,
-  &quot;delimiter&quot;: &quot;,&quot;
-},
-</code></pre></div>
 <h2 id="examples-of-querying-text-files">Examples of Querying Text Files</h2>
 
 <p>The examples in this section show the results of querying CSV files that use and
do not use a header, include comments, and use an escape character:</p>
@@ -1167,6 +1113,57 @@ Set to true to avoid reading headers as data. </li>
 +------------------------+
 7 rows selected (0.111 seconds)
 </code></pre></div>
+<h2 id="strategies-for-using-attributes">Strategies for Using Attributes</h2>
+
+<p>The attributes, such as skipFirstLine, apply to all workspaces defined in a storage
plugin. A typical use case defines separate storage plugins for different root directories
to query the files stored below the directory. An alternative use case defines multiple formats
within the same storage plugin and names target files using different extensions to match
the formats.</p>
+
+<p>You can deal with a mix of text files with and without headers either by creating
two separate storage plugin configurations or by defining two formats within the same storage plugin configuration.
The former approach is typically easier than the latter.</p>
+
+<h3 id="creating-two-separate-storage-plugin-configurations">Creating Two Separate
Storage Plugin Configurations</h3>
+
+<p>A storage plugin configuration defines a root directory that Drill targets. You
can use a different configuration for each root directory that sets attributes to match the
files stored below that directory. All files can use the same extension, such as .csv, as
shown in the following example:</p>
+
+<p>Storage Plugin A</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;extensions&quot;: [
+    &quot;csv&quot;
+  ],
+  &quot;delimiter&quot;: &quot;,&quot;
+},
+. . .
+</code></pre></div>
+<p>Storage Plugin B</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;extensions&quot;: [
+    &quot;csv&quot;
+  ],
+  &quot;comment&quot;: &quot;&amp;&quot;,
+  &quot;skipFirstLine&quot;: true,
+  &quot;delimiter&quot;: &quot;,&quot;
+},
+</code></pre></div>
+<h3 id="creating-one-storage-plugin-configuration-to-handle-multiple-formats">Creating
One Storage Plugin Configuration to Handle Multiple Formats</h3>
+
+<p>You can use a different extension for files with and without a header, and use a
storage plugin that looks something like the following example. This method requires renaming
some files to use the csv2 extension.</p>
+<div class="highlight"><pre><code class="language-text" data-lang="text">&quot;csv&quot;:
{
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;extensions&quot;: [
+    &quot;csv&quot;
+  ],
+  &quot;delimiter&quot;: &quot;,&quot;
+},
+&quot;csv_with_header&quot;: {
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;extensions&quot;: [
+    &quot;csv2&quot;
+  ],
+  &quot;comment&quot;: &quot;&amp;&quot;,
+  &quot;skipFirstLine&quot;: true,
+  &quot;delimiter&quot;: &quot;,&quot;
+},
+</code></pre></div>
     
       
         <div class="doc-nav">
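
Note on the second strategy in the diff above (one storage plugin configuration, two formats): the following is a minimal Python sketch of how a file extension could select a format entry. It is an illustration only, not Drill's implementation. The FORMATS dictionary mirrors the "csv" and "csv_with_header" entries from the example, and the helper name format_for is hypothetical.

```python
import os

# Mirrors the two format entries shown in the diff; values are illustrative.
FORMATS = {
    "csv": {"type": "text", "extensions": ["csv"], "delimiter": ","},
    "csv_with_header": {
        "type": "text",
        "extensions": ["csv2"],
        "comment": "&",
        "skipFirstLine": True,
        "delimiter": ",",
    },
}


def format_for(filename, formats=FORMATS):
    """Return (name, config) of the first format listing the file's extension.

    Hypothetical helper: returns None when no format claims the extension.
    """
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    for name, config in formats.items():
        if ext in config.get("extensions", []):
            return name, config
    return None
```

Under these assumptions, a file renamed to orders.csv2 resolves to the entry that sets skipFirstLine, while orders.csv resolves to the plain entry, which is why this approach requires renaming the files that have headers.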

http://git-wip-us.apache.org/repos/asf/drill-site/blob/d0ba86f3/docs/workspaces/index.html
----------------------------------------------------------------------
diff --git a/docs/workspaces/index.html b/docs/workspaces/index.html
index 96e001f..b1bfe78 100644
--- a/docs/workspaces/index.html
+++ b/docs/workspaces/index.html
@@ -1003,33 +1003,31 @@
 
     <div class="int_text" align="left">
       
-        <p>When you register an instance of a file system data source, you can configure
-one or more workspaces for the instance. The workspace defines the  directory location of
files in a local or distributed file system. Drill searches the workspace to locate data when
+        <p>You can define one or more workspaces in a storage plugin configuration.
The workspace defines the directory location of files in a local or distributed file system.
Drill searches the workspace to locate data when
 you run a query. The <code>default</code>
 workspace points to the root of the file system. </p>
 
-<p>Configuring <code>workspaces</code> in the storage plugin definition
to include the file location simplifies the query, which is important when querying the same
data source repeatedly. After you configure a long path name in the workspaces location property,
instead of
+<p>Configuring <code>workspaces</code> to include a file location simplifies
the query, which is important when querying the same data source repeatedly. After you configure
a long path name in the workspaces location property, instead of
 using the full path to the data source, you use dot notation in the FROM
 clause.</p>
 
 <p><code>&lt;workspaces&gt;.`&lt;location&gt;</code>`</p>
 
-<p>To query the data source while you are <em>not</em> connected to
-that storage plugin, include the plugin name. This syntax assumes you did not issue a USE
statement to connect to a storage plugin that defines the
+<p>To query the data source while you are not <em>using</em> that storage
plugin, include the plugin name. This syntax assumes you did not issue a USE statement to
connect to a storage plugin that defines the
 location of the data:</p>
 
 <p><code>&lt;plugin&gt;.&lt;workspaces&gt;.`&lt;location&gt;</code>`</p>
 
 <h2 id="no-workspaces-for-hive-and-hbase">No Workspaces for Hive and HBase</h2>
 
-<p>You cannot create workspaces for
-<code>hive</code> and <code>hbase</code> storage plugins, though
Hive databases show up as workspaces in
+<p>You cannot configure workspaces for
+<code>hive</code> and <code>hbase</code>, though Hive databases show
up as workspaces in
 Drill. Each <code>hive</code> instance includes a <code>default</code>
workspace that points to the  Hive metastore. When you query
 files and tables in the <code>hive default</code> workspaces, you can omit the
 workspace name from the query.</p>
 
 <p>For example, you can issue a query on a Hive table in the <code>default workspace</code>
-using either of the following formats and get the same results:</p>
+using either of the following queries and get the same results:</p>
 
 <p><strong>Example</strong></p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT
* FROM hive.customers LIMIT 10;
@@ -1040,13 +1038,10 @@ SELECT * FROM hive.`default`.customers LIMIT 10;
   <p class="last">Default is a reserved word. You must enclose reserved words in back
ticks.  </p>
 </div>
 
-<p>Because HBase instances do not have workspaces, you can use the following
-format to query a table in HBase:</p>
+<p>Because the HBase storage plugin configuration does not have a workspace, you can
use the following
+query:</p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">SELECT
* FROM hbase.customers LIMIT 10;
 </code></pre></div>
-<p>After you register a data source as a storage plugin instance with Drill, and
-optionally configure workspaces, you can query the data source.</p>
-
     
       
         <div class="doc-nav">
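
Note on the dot notation described in the diff above: the following is a minimal Python sketch that assembles the two name forms, workspace plus back-ticked location, with the plugin name prepended when no USE statement is in effect. The helper names and the example plugin, workspace, and path values are hypothetical. The sketch does not back-tick reserved words such as default.

```python
def qualified_name(location, workspace, plugin=None):
    """Build <workspace>.`<location>`, prefixed with <plugin>. when given."""
    parts = [workspace, "`" + location + "`"]
    if plugin is not None:
        # No USE statement in effect, so the plugin name must lead the path.
        parts.insert(0, plugin)
    return ".".join(parts)


def select_all(location, workspace, plugin=None, limit=10):
    """Return a SELECT statement string against the qualified name."""
    name = qualified_name(location, workspace, plugin)
    return "SELECT * FROM {} LIMIT {};".format(name, limit)
```

For example, select_all("logs/2015/07", "tmp", plugin="dfs") yields a statement that names the plugin explicitly, while omitting plugin yields the shorter form that relies on a prior USE statement.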

http://git-wip-us.apache.org/repos/asf/drill-site/blob/d0ba86f3/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index 3e6c72f..8e3f4ea 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 </description>
     <link>/</link>
     <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Tue, 07 Jul 2015 18:15:20 -0700</pubDate>
-    <lastBuildDate>Tue, 07 Jul 2015 18:15:20 -0700</lastBuildDate>
+    <pubDate>Thu, 16 Jul 2015 18:16:54 -0700</pubDate>
+    <lastBuildDate>Thu, 16 Jul 2015 18:16:54 -0700</lastBuildDate>
     <generator>Jekyll v2.5.2</generator>
     
       <item>

