drill-commits mailing list archives

From bridg...@apache.org
Subject [02/12] drill git commit: DRILL-2316: Add hive, parquet, json ref docs, basics tutorial, and minor edits
Date Tue, 17 Mar 2015 21:02:43 GMT
http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/manage/002-start-stop.md
----------------------------------------------------------------------
diff --git a/_docs/manage/002-start-stop.md b/_docs/manage/002-start-stop.md
index 76a76f4..d37f840 100644
--- a/_docs/manage/002-start-stop.md
+++ b/_docs/manage/002-start-stop.md
@@ -28,7 +28,7 @@ can indicate the schema name when you invoke SQLLine.
 To start SQLLine, issue the appropriate command for your Drill installation
 type:
 
-<table ><tbody><tr><td valign="top"><strong>Drill Install Type</strong></td><td valign="top"><strong>Example</strong></td><td valign="top"><strong>Command</strong></td></tr><tr><td valign="top">Embedded</td><td valign="top">Drill installed locally (embedded mode);Hive with embedded metastore</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:<code>$ bin/sqlline -u jdbc:drill:zk=local -n admin -p admin </code><span> </span>Once you are in the prompt, you can issue<code> USE &lt;schema&gt; </code>or you can use absolute notation: <code>schema.table.column.</code>To connect to a schema directly, issue the command with the schema name:<code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=local -n admin -p admin</code></td></tr><tr><td valign="top">Distributed</td><td valign="top">Drill installed in distributed mode;Hive with remote metastore;HBase</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:<code>$ bin/sqlline -u jdbc:drill:zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code>Once you are in the prompt, you can issue<code> USE &lt;schema&gt; </code>or you can use absolute notation: <code>schema.table.column.</code>To connect to a schema directly, issue the command with the schema name:<code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code></td></tr></tbody></table>
+<table ><tbody><tr><td valign="top"><strong>Drill Install Type</strong></td><td valign="top"><strong>Example</strong></td><td valign="top"><strong>Command</strong></td></tr><tr><td valign="top">Embedded</td><td valign="top">Drill installed locally (embedded mode);Hive with embedded metastore</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:<code>$ bin/sqlline -u jdbc:drill:zk=local -n admin -p admin </code><span> </span>Once you are in the prompt, you can issue<code> USE &lt;schema&gt; </code>or you can use absolute notation: <code>schema.table.column.</code>To connect to a schema directly, issue the command with the schema name:<code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=local -n admin -p admin</code></td></tr><tr><td valign="top">Distributed</td><td valign="top">Drill installed in distributed mode;Hive with remote metastore;HBase</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:<code>$ bin/sqlline -u jdbc:drill:zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code>Once you are in the prompt, you can issue<code> USE &lt;schema&gt; </code>or you can use absolute notation: <code>schema.table.column.</code>To connect to a schema directly, issue the command with the schema name:<code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code></td></tr></tbody></table></div>
   
 When SQLLine starts, the system displays the following prompt:
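Note on the SQLLine connection strings in the table above: the URL combines an optional `schema=<database>` part with a `zk=` part that is either `local` (embedded mode) or a comma-separated list of `<host>:<port>` entries (distributed mode). A minimal Python sketch of that assembly follows. The helper names `drill_jdbc_url` and `sqlline_command` are hypothetical, and the `admin`/`admin` defaults are only the values used in the table.

```python
def drill_jdbc_url(zk_hosts=None, schema=None):
    """Build a Drill JDBC URL in the form used by the table's commands.

    zk_hosts: list of "host:port" strings for distributed mode,
              or None/empty for embedded mode (zk=local).
    schema:   optional schema (database) name to connect to directly.
    """
    zk = ",".join(zk_hosts) if zk_hosts else "local"
    parts = []
    if schema:
        parts.append(f"schema={schema}")
    parts.append(f"zk={zk}")
    return "jdbc:drill:" + ";".join(parts)


def sqlline_command(url, user="admin", password="admin"):
    """Return the argument list for `bin/sqlline -u <url> -n <user> -p <password>`."""
    return ["bin/sqlline", "-u", url, "-n", user, "-p", password]
```

Keeping the command as an argument list (for example, for `subprocess.run`) sidesteps shell quoting of the `;` separator in the URL, which an unquoted shell command line would treat as a command terminator.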
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/manage/003-ports.md
----------------------------------------------------------------------
diff --git a/_docs/manage/003-ports.md b/_docs/manage/003-ports.md
index df1d362..c72beff 100644
--- a/_docs/manage/003-ports.md
+++ b/_docs/manage/003-ports.md
@@ -5,5 +5,5 @@ parent: "Manage Drill"
 The following table provides a list of the ports that Drill uses, the port
 type, and a description of how Drill uses the port:
 
-<table ><tbody><tr><th >Port</th><th colspan="1" >Type</th><th >Description</th></tr><tr><td valign="top" >8047</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Needed for <span style="color: rgb(34,34,34);">the Drill Web UI.</span><span style="color: rgb(34,34,34);"> </span></td></tr><tr><td valign="top" >31010</td><td valign="top" colspan="1" >TCP</td><td valign="top" >User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the<br />cluster nodes. Also needed for the Drill Web UI.</td></tr><tr><td valign="top" >31011</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >31012</td><td valign="top" colspan="1" >TCP</td><td valign="top" colspan="1" >Data port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >46655</td><td valign="top" colspan="1" >UDP</td><td valign="top" colspan="1" >Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.</td></tr></tbody></table>
+<table ><tbody><tr><th >Port</th><th colspan="1" >Type</th><th >Description</th></tr><tr><td valign="top" >8047</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Needed for <span style="color: rgb(34,34,34);">the Drill Web UI.</span><span style="color: rgb(34,34,34);"> </span></td></tr><tr><td valign="top" >31010</td><td valign="top" colspan="1" >TCP</td><td valign="top" >User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the<br />cluster nodes. Also needed for the Drill Web UI.</td></tr><tr><td valign="top" >31011</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >31012</td><td valign="top" colspan="1" >TCP</td><td valign="top" colspan="1" >Data port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >46655</td><td valign="top" colspan="1" >UDP</td><td valign="top" colspan="1" >Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.</td></tr></tbody></table></div>
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/manage/conf/002-startup-opt.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/002-startup-opt.md b/_docs/manage/conf/002-startup-opt.md
index 3434401..9db8b45 100644
--- a/_docs/manage/conf/002-startup-opt.md
+++ b/_docs/manage/conf/002-startup-opt.md
@@ -46,5 +46,5 @@ override.conf` file located in Drill’s` /conf` directory.
 You may want to configure the following start-up options that control certain
 behaviors in Drill:
 
-<table ><tbody><tr><th >Option</th><th >Default Value</th><th >Description</th></tr><tr><td valign="top" >drill.exec.sys.store.provider</td><td valign="top" >ZooKeeper</td><td valign="top" >Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="/drill/docs/persistent-configuration-storage" rel="nofollow">Persistent Configuration Storage</a>.</td></tr><tr><td valign="top" >drill.exec.buffer.size</td><td valign="top" > </td><td valign="top" >Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</td></tr><tr><td valign="top" >drill.exec.sort.external.directoriesdrill.exec.sort.external.fs</td><td valign="top" > </td><td valign="top" >These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. <span style="line-height: 1.4285715;background-color: transparent;"> </span>Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. <span style="line-height: 1.4285715;background-color: transparent;"> </span>For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</td></tr><tr><td valign="top" colspan="1" >drill.exec.debug.error_on_leak</td><td valign="top" colspan="1" >True</td><td valign="top" colspan="1" >Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</td></tr><tr><td valign="top" colspan="1" >drill.exec.zk.connect</td><td valign="top" colspan="1" >localhost:2181</td><td valign="top" colspan="1" >Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</td></tr><tr><td valign="top" colspan="1" >drill.exec.cluster-id</td><td valign="top" colspan="1" >my_drillbit_cluster</td><td valign="top" colspan="1" >Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</td></tr></tbody></table>
+<table ><tbody><tr><th >Option</th><th >Default Value</th><th >Description</th></tr><tr><td valign="top" >drill.exec.sys.store.provider</td><td valign="top" >ZooKeeper</td><td valign="top" >Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="/drill/docs/persistent-configuration-storage" rel="nofollow">Persistent Configuration Storage</a>.</td></tr><tr><td valign="top" >drill.exec.buffer.size</td><td valign="top" > </td><td valign="top" >Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</td></tr><tr><td valign="top" >drill.exec.sort.external.directoriesdrill.exec.sort.external.fs</td><td valign="top" > </td><td valign="top" >These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. <span style="line-height: 1.4285715;background-color: transparent;"> </span>Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. <span style="line-height: 1.4285715;background-color: transparent;"> </span>For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</td></tr><tr><td valign="top" colspan="1" >drill.exec.debug.error_on_leak</td><td valign="top" colspan="1" >True</td><td valign="top" colspan="1" >Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</td></tr><tr><td valign="top" colspan="1" >drill.exec.zk.connect</td><td valign="top" colspan="1" >localhost:2181</td><td valign="top" colspan="1" >Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</td></tr><tr><td valign="top" colspan="1" >drill.exec.cluster-id</td><td valign="top" colspan="1" >my_drillbit_cluster</td><td valign="top" colspan="1" >Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</td></tr></tbody></table></div>
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/manage/conf/003-plan-exec.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/003-plan-exec.md b/_docs/manage/conf/003-plan-exec.md
index ea67e2d..56a1f69 100644
--- a/_docs/manage/conf/003-plan-exec.md
+++ b/_docs/manage/conf/003-plan-exec.md
@@ -28,8 +28,7 @@ at the system or session level:
 <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[number of active drillbits (typically one per node) 
 * number of cores per node
 * 0.7]]></script>
-<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p>
-<script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[1 * 4 * 0.7 = 3]]></script>
+<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p><script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[1 * 4 * 0.7 = 3]]></script>
 <p>When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" >planner.width.max_per_query</td><td valign="top" colspan="1" >1000</td><td valign="top" colspan="1" ><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p>
 <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((number of nodes * width.max_per_node), width.max_per_query)]]></script>
 <p>For example, on a 4-node cluster where <span><code>width.max_per_node</code> is set to 6 and </span><span><code>width.max_per_query</code> is set to 30:</span></p>
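Note on the width arithmetic in this hunk: both formulas can be checked with a short Python sketch. The function names are hypothetical. Rounding 2.8 to 3 with `round` is an assumption drawn from the `1 * 4 * 0.7 = 3` example; the text does not say how Drill itself rounds.

```python
def default_max_width_per_node(active_drillbits, cores_per_node):
    """Per-node width: drillbits * cores per node * 0.7.

    Rounding to the nearest integer is an assumption (see lead-in).
    """
    return round(active_drillbits * cores_per_node * 0.7)


def effective_max_width_per_query(nodes, max_per_node, max_per_query):
    """Actual per-query maximum: the smaller of the cluster-wide
    per-node capacity and the max_per_query setting."""
    return min(nodes * max_per_node, max_per_query)


# Single-node test system, 2 cores with hyper-threading (4 logical cores).
single_node = default_max_width_per_node(1, 4)

# 4-node cluster, width.max_per_node = 6, width.max_per_query = 30.
four_node = effective_max_width_per_query(4, 6, 30)
```

With the 4-node values, the per-node capacity (4 * 6 = 24) is the binding limit because it is smaller than the `width.max_per_query` setting of 30.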

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/manage/conf/004-persist-conf.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/004-persist-conf.md b/_docs/manage/conf/004-persist-conf.md
index b1deefa..12439a5 100644
--- a/_docs/manage/conf/004-persist-conf.md
+++ b/_docs/manage/conf/004-persist-conf.md
@@ -12,7 +12,7 @@ depends on the Drill installation mode.
 The following table provides the persistent storage mode for each of the Drill
 modes:
 
-<table ><tbody><tr><th >Mode</th><th >Description</th></tr><tr><td valign="top" >Embedded</td><td valign="top" >Drill stores persistent data in the local file system. <br />You cannot modify the PStore location for Drill in embedded mode.</td></tr><tr><td valign="top" >Distributed</td><td valign="top" >Drill stores persistent data in ZooKeeper, by default. <br />You can modify where ZooKeeper offloads data, <br />or you can change the persistent storage mode to HBase or MapR-DB.</td></tr></tbody></table>
+<table ><tbody><tr><th >Mode</th><th >Description</th></tr><tr><td valign="top" >Embedded</td><td valign="top" >Drill stores persistent data in the local file system. <br />You cannot modify the PStore location for Drill in embedded mode.</td></tr><tr><td valign="top" >Distributed</td><td valign="top" >Drill stores persistent data in ZooKeeper, by default. <br />You can modify where ZooKeeper offloads data, <br />or you can change the persistent storage mode to HBase or MapR-DB.</td></tr></tbody></table></div>
   
 **Note:** Switching between storage modes does not migrate configuration data.
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/001-get-started.md
----------------------------------------------------------------------
diff --git a/_docs/query/001-get-started.md b/_docs/query/001-get-started.md
new file mode 100644
index 0000000..92e924d
--- /dev/null
+++ b/_docs/query/001-get-started.md
@@ -0,0 +1,75 @@
+---
+title: "Getting Started Tutorial"
+parent: "Query Data"
+---
+
+## Goal
+
+This tutorial covers how to query a file and a directory on your local file
+system. Files and directories are like standard SQL tables to Drill. If you
+install Drill in [embedded
+mode](/drill/docs/installing-drill-in-embedded-mode), the
+installer registers and configures your file system as the `dfs` instance.
+You can query these types of files using the default `dfs` storage plugin:
+
+  * Plain text files, such as comma-separated values (CSV) or tab-separated values (TSV) files
+  * JSON files
+  * Parquet files
+
+In this tutorial, you query plain text files using the `dfs` storage plugin. You also create a custom storage
+plugin to simplify querying plain text files.
+
+## Prerequisites
+
+This tutorial assumes that you installed Drill in [embedded
+mode](/drill/docs/installing-drill-in-embedded-mode). The first few lessons of the tutorial
+use Google Ngram data files that you download from the internet. The
+compressed Google Ngram files are 8 MB and 58 MB. To expand the compressed files,
+you need an additional 448 MB of free disk space for this exercise.
+
+To get started, use the SQLLine command to start the Drill command line
+interface (CLI) on Linux, Mac OS X, or Windows.
+
+### Start Drill (Linux or Mac OS X)
+
+To [start Drill](/drill/docs/starting-stopping-drill) on Linux
+or Mac OS X, use the SQLLine command.
+
+  1. Open a terminal.
+  2. Navigate to the Drill installation directory.
+  
+     Example: `$ cd ~/apache-drill-<version>`
+  3. Issue the following command:
+  
+        $ bin/sqlline -u jdbc:drill:zk=local
+     The Drill prompt appears: `0: jdbc:drill:zk=local>`
+
+### Start Drill (Windows)
+
+To [start Drill](/drill/docs/starting-stopping-drill) on
+Windows, use the SQLLine command.
+
+  1. Open the `apache-drill-<version>` folder.
+  2. Open the `bin` folder, and double-click on the `sqlline.bat` file. The Windows command prompt opens.
+  3. At the `sqlline>` prompt, issue the following command, and then press **Enter**:
+  
+        !connect jdbc:drill:zk=local  
+     The following prompt appears: `0: jdbc:drill:zk=local>`
+
+### Stop Drill
+
+To stop Drill, issue the following command at the Drill prompt.
+
+        0: jdbc:drill:zk=local> !quit
+
+In some cases, such as stopping while a query is in progress, this command does not stop Drill, and you need to kill the Drill process. For example, on Mac OS X and Linux, follow
+these steps:
+
+  1. Press CTRL+Z to stop the query, and then start Drill again. If the startup message indicates success, skip the remaining steps. Otherwise, proceed to step 2.
+  2. Search for the Drill process ID.
+  
+        $ ps auwx | grep drill
+  3. Kill the process using the process number in the grep output. For example:
+
+        $ sudo kill -9 2674
+
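Note on the "Stop Drill" steps above: picking the process number out of `ps auwx | grep drill` output by eye is error-prone, partly because the `grep` process itself matches. A minimal Python sketch of the same lookup follows. The function name `find_drill_pids` is hypothetical, and the parsing assumes the usual 11-column `ps auwx` layout with the PID in the second column and the command last.

```python
def find_drill_pids(ps_output):
    """Return PIDs of processes whose command line mentions "drill".

    ps_output: full text output of `ps auwx` (header line included or not).
    """
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unexpected
        command = fields[10]
        # Skip the grep process that the shell pipeline would also match.
        if "drill" in command.lower() and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids
```

The returned PIDs are the values to pass to `kill`. Trying a plain `kill <pid>` before `kill -9` gives the process a chance to exit cleanly.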

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/001-query-fs.md
----------------------------------------------------------------------
diff --git a/_docs/query/001-query-fs.md b/_docs/query/001-query-fs.md
deleted file mode 100644
index ca488fb..0000000
--- a/_docs/query/001-query-fs.md
+++ /dev/null
@@ -1,35 +0,0 @@
----
-title: "Querying a File System"
-parent: "Query Data"
----
-Files and directories are like standard SQL tables to Drill. You can specify a
-file system "database" as a prefix in queries when you refer to objects across
-databases. In Drill, a file system database consists of a storage plugin name
-followed by an optional workspace name, for example <storage
-plugin>.<workspace> or hdfs.logs.
-
-The following example shows a query on a file system database in a Hadoop
-distributed file system:
-
-       SELECT * FROM hdfs.logs.`AppServerLogs/20104/Jan/01/part0001.txt`;
-
-The default `dfs` storage plugin instance registered with Drill has a
-`default` workspace. If you query data in the `default` workspace, you do not
-need to include the workspace in the query. Refer to
-[Workspaces](/drill/docs/workspaces) for
-more information.
-
-Drill supports the following file types:
-
-  * Plain text files, including:
-    * Comma-separated values (CSV, type: text)
-    * Tab-separated values (TSV, type: text)
-    * Pipe-separated values (PSV, type: text)
-  * Structured data files:
-    * JSON (type: json)
-    * Parquet (type: parquet)
-
-The extensions for these file types must match the configuration settings for
-your registered storage plugins. For example, PSV files may be defined with a
-`.tbl` extension, while CSV files are defined with a `.csv` extension.
-

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/002-query-fs.md
----------------------------------------------------------------------
diff --git a/_docs/query/002-query-fs.md b/_docs/query/002-query-fs.md
new file mode 100644
index 0000000..ca488fb
--- /dev/null
+++ b/_docs/query/002-query-fs.md
@@ -0,0 +1,35 @@
+---
+title: "Querying a File System"
+parent: "Query Data"
+---
+Files and directories are like standard SQL tables to Drill. You can specify a
+file system "database" as a prefix in queries when you refer to objects across
+databases. In Drill, a file system database consists of a storage plugin name
+followed by an optional workspace name, for example
+`<storage plugin>.<workspace>` or `hdfs.logs`.
+
+The following example shows a query on a file system database in a Hadoop
+distributed file system:
+
+       SELECT * FROM hdfs.logs.`AppServerLogs/20104/Jan/01/part0001.txt`;
+
+The default `dfs` storage plugin instance registered with Drill has a
+`default` workspace. If you query data in the `default` workspace, you do not
+need to include the workspace in the query. Refer to
+[Workspaces](/drill/docs/workspaces) for
+more information.
+
+Drill supports the following file types:
+
+  * Plain text files, including:
+    * Comma-separated values (CSV, type: text)
+    * Tab-separated values (TSV, type: text)
+    * Pipe-separated values (PSV, type: text)
+  * Structured data files:
+    * JSON (type: json)
+    * Parquet (type: parquet)
+
+The extensions for these file types must match the configuration settings for
+your registered storage plugins. For example, PSV files may be defined with a
+`.tbl` extension, while CSV files are defined with a `.csv` extension.
+
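Note on the naming scheme in this file: a file system "database" is a storage plugin name plus an optional workspace, and the file or directory path follows in backticks. A minimal Python sketch of composing such a table reference follows. The helper name `fs_table` is hypothetical, and rejecting backticks in the path is a simplification made for this sketch, not a documented Drill rule.

```python
def fs_table(plugin, path, workspace=None):
    """Return a FROM-clause reference such as hdfs.logs.`dir/file.txt`.

    plugin:    storage plugin name, for example "dfs" or "hdfs".
    path:      file or directory path, quoted with backticks in the result.
    workspace: optional workspace name; omitted for the default workspace.
    """
    if "`" in path:
        raise ValueError("path must not contain a backtick")
    prefix = f"{plugin}.{workspace}" if workspace else plugin
    return f"{prefix}.`{path}`"


# Reproduces the example query from the text above.
query = "SELECT * FROM " + fs_table(
    "hdfs", "AppServerLogs/20104/Jan/01/part0001.txt", workspace="logs"
) + ";"
```

Omitting `workspace` yields a reference of the form ``dfs.`<path>` ``, which matches the statement that the `default` workspace does not need to appear in the query.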

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/002-query-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/query/002-query-hbase.md b/_docs/query/002-query-hbase.md
deleted file mode 100644
index d2a33d5..0000000
--- a/_docs/query/002-query-hbase.md
+++ /dev/null
@@ -1,151 +0,0 @@
----
-title: "Querying HBase"
-parent: "Query Data"
----
-This is a simple exercise that provides steps for creating a “students” table
-and a “clicks” table in HBase that you can query with Drill.
-
-To create the HBase tables and query them with Drill, complete the following
-steps:
-
-  1. Issue the following command to start the HBase shell:
-  
-        hbase shell
-  2. Issue the following commands to create a ‘students’ table and a ‘clicks’ table with column families in HBase:
-    
-        echo "create 'students','account','address'" | hbase shell
-    
-        echo "create 'clicks','clickinfo','iteminfo'" | hbase shell
-  3. Issue the following command with the provided data to create a `testdata.txt` file:
-
-        cat > testdata.txt
-
-     **Sample Data**
-
-        put 'students','student1','account:name','Alice'
-        put 'students','student1','address:street','123 Ballmer Av'
-        put 'students','student1','address:zipcode','12345'
-        put 'students','student1','address:state','CA'
-        put 'students','student2','account:name','Bob'
-        put 'students','student2','address:street','1 Infinite Loop'
-        put 'students','student2','address:zipcode','12345'
-        put 'students','student2','address:state','CA'
-        put 'students','student3','account:name','Frank'
-        put 'students','student3','address:street','435 Walker Ct'
-        put 'students','student3','address:zipcode','12345'
-        put 'students','student3','address:state','CA'
-        put 'students','student4','account:name','Mary'
-        put 'students','student4','address:street','56 Southern Pkwy'
-        put 'students','student4','address:zipcode','12345'
-        put 'students','student4','address:state','CA'
-        put 'clicks','click1','clickinfo:studentid','student1'
-        put 'clicks','click1','clickinfo:url','http://www.google.com'
-        put 'clicks','click1','clickinfo:time','2014-01-01 12:01:01.0001'
-        put 'clicks','click1','iteminfo:itemtype','image'
-        put 'clicks','click1','iteminfo:quantity','1'
-        put 'clicks','click2','clickinfo:studentid','student1'
-        put 'clicks','click2','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click2','clickinfo:time','2014-01-01 01:01:01.0001'
-        put 'clicks','click2','iteminfo:itemtype','image'
-        put 'clicks','click2','iteminfo:quantity','1'
-        put 'clicks','click3','clickinfo:studentid','student2'
-        put 'clicks','click3','clickinfo:url','http://www.google.com'
-        put 'clicks','click3','clickinfo:time','2014-01-01 01:02:01.0001'
-        put 'clicks','click3','iteminfo:itemtype','text'
-        put 'clicks','click3','iteminfo:quantity','2'
-        put 'clicks','click4','clickinfo:studentid','student2'
-        put 'clicks','click4','clickinfo:url','http://www.ask.com'
-        put 'clicks','click4','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click4','iteminfo:itemtype','text'
-        put 'clicks','click4','iteminfo:quantity','5'
-        put 'clicks','click5','clickinfo:studentid','student2'
-        put 'clicks','click5','clickinfo:url','http://www.reuters.com'
-        put 'clicks','click5','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click5','iteminfo:itemtype','text'
-        put 'clicks','click5','iteminfo:quantity','100'
-        put 'clicks','click6','clickinfo:studentid','student3'
-        put 'clicks','click6','clickinfo:url','http://www.google.com'
-        put 'clicks','click6','clickinfo:time','2013-02-01 12:01:01.0001'
-        put 'clicks','click6','iteminfo:itemtype','image'
-        put 'clicks','click6','iteminfo:quantity','1'
-        put 'clicks','click7','clickinfo:studentid','student3'
-        put 'clicks','click7','clickinfo:url','http://www.ask.com'
-        put 'clicks','click7','clickinfo:time','2013-02-01 12:45:01.0001'
-        put 'clicks','click7','iteminfo:itemtype','image'
-        put 'clicks','click7','iteminfo:quantity','10'
-        put 'clicks','click8','clickinfo:studentid','student4'
-        put 'clicks','click8','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click8','clickinfo:time','2013-02-01 22:01:01.0001'
-        put 'clicks','click8','iteminfo:itemtype','image'
-        put 'clicks','click8','iteminfo:quantity','1'
-        put 'clicks','click9','clickinfo:studentid','student4'
-        put 'clicks','click9','clickinfo:url','http://www.amazon.com'
-        put 'clicks','click9','clickinfo:time','2013-02-01 22:01:01.0001'
-        put 'clicks','click9','iteminfo:itemtype','image'
-        put 'clicks','click9','iteminfo:quantity','10'
-
-  4. Issue the following command to verify that the data is in the `testdata.txt` file:  
-    
-         cat testdata.txt | hbase shell
-  5. Issue `exit` to leave the `hbase shell`.
-  6. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
-  7. Use Drill to issue the following SQL queries on the “students” and “clicks” tables:  
-  
-     1. Issue the following query to see the data in the “students” table:  
-
-            SELECT * FROM hbase.`students`;
-        The query returns binary results:
-        
-            Query finished, fetching results ...
-            +-------------+-------------+-------------+-------------+-------------+
-            | id          | name        | state       | street      | zipcode     |
-            +-------------+-------------+-------------+-------------+-------------+
-            | [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 | [B@3e08d131 |
-            | [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 | [B@1d11c878 |
-            | [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca | [B@147db06f |
-            | [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b | [B@2ec515d6 |
-
-        Since Drill does not require metadata, you must use the SQL `CAST` function in
-some queries to get readable query results.
-
-     2. Issue the following query, that includes the `CAST` function, to see the data in the “`students`” table:
-
-            SELECT CAST(students.row_key as VarChar(20)),
-            CAST(students.account.name as VarChar(20)), CAST(students.address.state as
-            VarChar(20)), CAST(students.address.street as VarChar(20)), CAST
-            (students.address.zipcode as VarChar(20)) FROM hbase.students;
-
-        **Note:** Use the following format when you query a column in an HBase table:
-          
-             tablename.columnfamilyname.columnname
-            
-        For more information about column families, refer to [5.6. Column
-Family](http://hbase.apache.org/book/columnfamily.html).
-
-        The query returns the data:
-
-            Query finished, fetching results ...
-            +-----------+-------+-------+------------------+---------+
-            | studentid | name  | state | street           | zipcode |
-            +-----------+-------+-------+------------------+---------+
-            | student1  | Alice | CA    | 123 Ballmer Av   | 12345   |
-            | student2  | Bob   | CA    | 1 Infinite Loop  | 12345   |
-            | student3  | Frank | CA    | 435 Walker Ct    | 12345   |
-            | student4  | Mary  | CA    | 56 Southern Pkwy | 12345   |
-            +-----------+-------+-------+------------------+---------+
-
-     3. Issue the following query on the “clicks” table to find out which students clicked on google.com:
-        
-              SELECT CAST(clicks.clickinfo.studentid as VarChar(200)), CAST(clicks.clickinfo.url as VarChar(200)) FROM hbase.`clicks` WHERE URL LIKE '%google%';  
-
-        The query returns the data:
-        
-            Query finished, fetching results ...`
-        
-            +---------+-----------+-------------------------------+-----------------------+----------+----------+
-            | clickid | studentid | time                          | url                   | itemtype | quantity |
-            +---------+-----------+-------------------------------+-----------------------+----------+----------+
-            | click1  | student1  | 2014-01-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
-            | click3  | student2  | 2014-01-01 01:02:01.000100000 | http://www.google.com | text     | 2        |
-            | click6  | student3  | 2013-02-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
-            +---------+-----------+-------------------------------+-----------------------+----------+----------+
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/003-query-complex.md
----------------------------------------------------------------------
diff --git a/_docs/query/003-query-complex.md b/_docs/query/003-query-complex.md
deleted file mode 100644
index 537d7b4..0000000
--- a/_docs/query/003-query-complex.md
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title: "Querying Complex Data"
-parent: "Query Data"
----
-Apache Drill queries do not require prior knowledge of the actual data you are
-trying to access, regardless of its source system or its schema and data
-types. The sweet spot for Apache Drill is a SQL query workload against
-"complex data": data made up of various types of records and fields, rather
-than data in a recognizable relational form (discrete rows and columns). Drill
-is capable of discovering the form of the data when you submit the query.
-Nested data formats such as JSON (JavaScript Object Notation) files and
-Parquet files are not only _accessible_: Drill provides special operators and
-functions that you can use to _drill down _into these files and ask
-interesting analytic questions.
-
-These operators and functions include:
-
-  * References to nested data values
-  * Access to repeating values in arrays and arrays within arrays (array indexes)
-
-The SQL query developer needs to know the data well enough to write queries
-that identify values of interest in the target file. For example, the writer
-needs to know what a record consists of, and its data types, in order to
-reliably request the right "columns" in the select list. Although these data
-values do not manifest themselves as columns in the source file, Drill will
-return them in the result set as if they had the predictable form of columns
-in a table. Drill also optimizes queries by treating the data as "columnar"
-rather than reading and analyzing complete records. (Drill uses similar
-parallel execution and optimization capabilities to commercial columnar MPP
-databases.)
-
-Given a basic knowledge of the input file, the developer needs to know how to
-use the SQL extensions that Drill provides and how to use them to "reach into"
-the nested data. The following examples show how to write both simple queries
-against JSON files and interesting queries that unpack the nested data. The
-examples show how to use the Drill extensions in the context of standard SQL
-SELECT statements. For the most part, the extensions use standard JavaScript
-notation for referencing data elements in a hierarchy.
-
-### Before You Begin
-
-The examples in this section operate on JSON data files. In order to write
-your own queries, you need to be aware of the basic data types in these files:
-
-  * string (all data inside double quotes), such as `"0001"` or `"Cake"`
-  * numeric types: integers, decimals, and floats, such as `0.55` or `10`
-  * null values
-  * boolean values: true, false
-
-Check that you have the following configuration setting for JSON files in the
-Drill Web UI (`dfs` storage plugin configuration):
-
-    "json" : {
-      "type" : "json"
-    }
-

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/003-query-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/query/003-query-hbase.md b/_docs/query/003-query-hbase.md
new file mode 100644
index 0000000..d2a33d5
--- /dev/null
+++ b/_docs/query/003-query-hbase.md
@@ -0,0 +1,151 @@
+---
+title: "Querying HBase"
+parent: "Query Data"
+---
+This is a simple exercise that provides steps for creating a “students” table
+and a “clicks” table in HBase that you can query with Drill.
+
+To create the HBase tables and query them with Drill, complete the following
+steps:
+
+  1. Issue the following command to start the HBase shell:
+  
+        hbase shell
+  2. Issue the following commands to create a ‘students’ table and a ‘clicks’ table with column families in HBase:
+    
+        echo "create 'students','account','address'" | hbase shell
+    
+        echo "create 'clicks','clickinfo','iteminfo'" | hbase shell
+  3. Issue the following command with the provided data to create a `testdata.txt` file:
+
+        cat > testdata.txt
+
+     **Sample Data**
+
+        put 'students','student1','account:name','Alice'
+        put 'students','student1','address:street','123 Ballmer Av'
+        put 'students','student1','address:zipcode','12345'
+        put 'students','student1','address:state','CA'
+        put 'students','student2','account:name','Bob'
+        put 'students','student2','address:street','1 Infinite Loop'
+        put 'students','student2','address:zipcode','12345'
+        put 'students','student2','address:state','CA'
+        put 'students','student3','account:name','Frank'
+        put 'students','student3','address:street','435 Walker Ct'
+        put 'students','student3','address:zipcode','12345'
+        put 'students','student3','address:state','CA'
+        put 'students','student4','account:name','Mary'
+        put 'students','student4','address:street','56 Southern Pkwy'
+        put 'students','student4','address:zipcode','12345'
+        put 'students','student4','address:state','CA'
+        put 'clicks','click1','clickinfo:studentid','student1'
+        put 'clicks','click1','clickinfo:url','http://www.google.com'
+        put 'clicks','click1','clickinfo:time','2014-01-01 12:01:01.0001'
+        put 'clicks','click1','iteminfo:itemtype','image'
+        put 'clicks','click1','iteminfo:quantity','1'
+        put 'clicks','click2','clickinfo:studentid','student1'
+        put 'clicks','click2','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click2','clickinfo:time','2014-01-01 01:01:01.0001'
+        put 'clicks','click2','iteminfo:itemtype','image'
+        put 'clicks','click2','iteminfo:quantity','1'
+        put 'clicks','click3','clickinfo:studentid','student2'
+        put 'clicks','click3','clickinfo:url','http://www.google.com'
+        put 'clicks','click3','clickinfo:time','2014-01-01 01:02:01.0001'
+        put 'clicks','click3','iteminfo:itemtype','text'
+        put 'clicks','click3','iteminfo:quantity','2'
+        put 'clicks','click4','clickinfo:studentid','student2'
+        put 'clicks','click4','clickinfo:url','http://www.ask.com'
+        put 'clicks','click4','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click4','iteminfo:itemtype','text'
+        put 'clicks','click4','iteminfo:quantity','5'
+        put 'clicks','click5','clickinfo:studentid','student2'
+        put 'clicks','click5','clickinfo:url','http://www.reuters.com'
+        put 'clicks','click5','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click5','iteminfo:itemtype','text'
+        put 'clicks','click5','iteminfo:quantity','100'
+        put 'clicks','click6','clickinfo:studentid','student3'
+        put 'clicks','click6','clickinfo:url','http://www.google.com'
+        put 'clicks','click6','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click6','iteminfo:itemtype','image'
+        put 'clicks','click6','iteminfo:quantity','1'
+        put 'clicks','click7','clickinfo:studentid','student3'
+        put 'clicks','click7','clickinfo:url','http://www.ask.com'
+        put 'clicks','click7','clickinfo:time','2013-02-01 12:45:01.0001'
+        put 'clicks','click7','iteminfo:itemtype','image'
+        put 'clicks','click7','iteminfo:quantity','10'
+        put 'clicks','click8','clickinfo:studentid','student4'
+        put 'clicks','click8','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click8','clickinfo:time','2013-02-01 22:01:01.0001'
+        put 'clicks','click8','iteminfo:itemtype','image'
+        put 'clicks','click8','iteminfo:quantity','1'
+        put 'clicks','click9','clickinfo:studentid','student4'
+        put 'clicks','click9','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click9','clickinfo:time','2013-02-01 22:01:01.0001'
+        put 'clicks','click9','iteminfo:itemtype','image'
+        put 'clicks','click9','iteminfo:quantity','10'
+
+  4. Issue the following command to load the data from the `testdata.txt` file into HBase:  
+    
+         cat testdata.txt | hbase shell
+  5. Issue `exit` to leave the `hbase shell`.
+  6. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
+  7. Use Drill to issue the following SQL queries on the “students” and “clicks” tables:  
+  
+     1. Issue the following query to see the data in the “students” table:  
+
+            SELECT * FROM hbase.`students`;
+        The query returns binary results:
+        
+            Query finished, fetching results ...
+            +-------------+-------------+-------------+-------------+-------------+
+            | id          | name        | state       | street      | zipcode     |
+            +-------------+-------------+-------------+-------------+-------------+
+            | [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 | [B@3e08d131 |
+            | [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 | [B@1d11c878 |
+            | [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca | [B@147db06f |
+            | [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b | [B@2ec515d6 |
+
+        Because HBase stores values as byte arrays and Drill does not require metadata,
+you must use the SQL `CAST` function in some queries to get readable results.
+
+     2. Issue the following query, which includes the `CAST` function, to see the data in the “students” table:
+
+            SELECT CAST(students.row_key AS VarChar(20)) AS studentid,
+            CAST(students.account.name AS VarChar(20)) AS name,
+            CAST(students.address.state AS VarChar(20)) AS state, CAST(students.address.street AS VarChar(20)) AS street,
+            CAST(students.address.zipcode AS VarChar(20)) AS zipcode FROM hbase.students;
+
+        **Note:** Use the following format when you query a column in an HBase table:
+          
+             tablename.columnfamilyname.columnname
+            
+        For more information about column families, refer to [5.6. Column
+Family](http://hbase.apache.org/book/columnfamily.html).
+
+        The query returns the data:
+
+            Query finished, fetching results ...
+            +-----------+-------+-------+------------------+---------+
+            | studentid | name  | state | street           | zipcode |
+            +-----------+-------+-------+------------------+---------+
+            | student1  | Alice | CA    | 123 Ballmer Av   | 12345   |
+            | student2  | Bob   | CA    | 1 Infinite Loop  | 12345   |
+            | student3  | Frank | CA    | 435 Walker Ct    | 12345   |
+            | student4  | Mary  | CA    | 56 Southern Pkwy | 12345   |
+            +-----------+-------+-------+------------------+---------+
+
+     3. Issue the following query on the “clicks” table to find out which students clicked on google.com:
+        
+              SELECT CAST(clicks.clickinfo.studentid AS VarChar(200)), CAST(clicks.clickinfo.url AS VarChar(200)) FROM hbase.`clicks` WHERE CAST(clicks.clickinfo.url AS VarChar(200)) LIKE '%google%';
+
+        The query returns the data:
+        
+            Query finished, fetching results ...
+        
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
+            | clickid | studentid | time                          | url                   | itemtype | quantity |
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
+            | click1  | student1  | 2014-01-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
+            | click3  | student2  | 2014-01-01 01:02:01.000100000 | http://www.google.com | text     | 2        |
+            | click6  | student3  | 2013-02-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
\ No newline at end of file

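Inline note on the `CAST` step in `003-query-hbase.md`: values such as `[B@1ee37126` are how Java prints a byte array, which is why the uncast query output is unreadable. The sketch below is only a rough Python analogy of what casting to `VarChar` achieves, not Drill or HBase code. The `raw_row` dictionary and the `decode_row` helper are hypothetical names introduced for illustration, and UTF-8 is assumed as the encoding.

```python
# Hypothetical in-memory stand-in for one HBase row (illustrative only).
# HBase stores the row key and every cell value as raw bytes.
raw_row = {
    "row_key": b"student1",
    "account:name": b"Alice",
    "address:state": b"CA",
}


def decode_row(row, encoding="utf-8"):
    """Return a copy of the row with every byte-string value decoded to text."""
    return {column: value.decode(encoding) for column, value in row.items()}


decoded = decode_row(raw_row)
print(decoded["account:name"])  # prints Alice
```

The point of the analogy is that nothing in the stored data says how the bytes should be interpreted, so the query has to state the target type explicitly for each column.
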
http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/004-query-complex.md
----------------------------------------------------------------------
diff --git a/_docs/query/004-query-complex.md b/_docs/query/004-query-complex.md
new file mode 100644
index 0000000..537d7b4
--- /dev/null
+++ b/_docs/query/004-query-complex.md
@@ -0,0 +1,56 @@
+---
+title: "Querying Complex Data"
+parent: "Query Data"
+---
+Apache Drill queries do not require prior knowledge of the actual data you are
+trying to access, regardless of its source system or its schema and data
+types. The sweet spot for Apache Drill is a SQL query workload against
+"complex data": data made up of various types of records and fields, rather
+than data in a recognizable relational form (discrete rows and columns). Drill
+is capable of discovering the form of the data when you submit the query.
+Nested data formats such as JSON (JavaScript Object Notation) files and
+Parquet files are not only _accessible_: Drill provides special operators and
+functions that you can use to _drill down_ into these files and ask
+interesting analytic questions.
+
+These operators and functions include:
+
+  * References to nested data values
+  * Access to repeating values in arrays and arrays within arrays (array indexes)
+
+The SQL query developer needs to know the data well enough to write queries
+that identify values of interest in the target file. For example, the writer
+needs to know what a record consists of, and its data types, in order to
+reliably request the right "columns" in the select list. Although these data
+values do not manifest themselves as columns in the source file, Drill will
+return them in the result set as if they had the predictable form of columns
+in a table. Drill also optimizes queries by treating the data as "columnar"
+rather than reading and analyzing complete records. (Drill uses similar
+parallel execution and optimization capabilities to commercial columnar MPP
+databases.)
+
+Given a basic knowledge of the input file, the developer needs to know how to
+use the SQL extensions that Drill provides to "reach into"
+the nested data. The following examples show how to write both simple queries
+against JSON files and interesting queries that unpack the nested data. The
+examples show how to use the Drill extensions in the context of standard SQL
+SELECT statements. For the most part, the extensions use standard JavaScript
+notation for referencing data elements in a hierarchy.
+
+### Before You Begin
+
+The examples in this section operate on JSON data files. In order to write
+your own queries, you need to be aware of the basic data types in these files:
+
+  * string (all data inside double quotes), such as `"0001"` or `"Cake"`
+  * numeric types: integers, decimals, and floats, such as `0.55` or `10`
+  * null values
+  * boolean values: true, false
+
+Check that you have the following configuration setting for JSON files in the
+Drill Web UI (`dfs` storage plugin configuration):
+
+    "json" : {
+      "type" : "json"
+    }
+

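Inline note on `004-query-complex.md`: the two capabilities listed there, references to nested values and array indexes, can be pictured as walking a path through a parsed JSON record. The following is a minimal Python sketch of that idea under stated assumptions, not Drill's implementation or syntax. The `SAMPLE` record, its field names, and the `resolve` helper are all hypothetical and chosen only to echo the `"0001"` and `"Cake"` values mentioned in the doc.

```python
import json

# Hypothetical sample record; the field names are illustrative only.
SAMPLE = """
{"id": "0001", "type": "Cake",
 "topping": [{"id": "5001", "type": "None"},
             {"id": "5002", "type": "Glazed"}]}
"""


def resolve(value, path):
    """Walk a parsed JSON value along a path of keys (str) and array indexes (int)."""
    for step in path:
        value = value[step]
    return value


record = json.loads(SAMPLE)
print(resolve(record, ["type"]))                # top-level field: prints Cake
print(resolve(record, ["topping", 1, "type"]))  # second array element: prints Glazed
```

A path that mixes keys and integer indexes mirrors the requirement stated in the doc: the query writer has to know the shape of a record well enough to name the value of interest.
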
http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/004-query-hive.md
----------------------------------------------------------------------
diff --git a/_docs/query/004-query-hive.md b/_docs/query/004-query-hive.md
deleted file mode 100644
index 903c7c6..0000000
--- a/_docs/query/004-query-hive.md
+++ /dev/null
@@ -1,45 +0,0 @@
----
-title: "Querying Hive"
-parent: "Query Data"
----
-This is a simple exercise that provides steps for creating a Hive table and
-inserting data that you can query using Drill. Before you perform the steps,
-download the [customers.csv](http://doc.mapr.com/download/attachments/22906623
-/customers.csv?api=v2) file.
-
-To create a Hive table and query it with Drill, complete the following steps:
-
-  1. Issue the following command to start the Hive shell:
-  
-        hive
-  2. Issue the following command from the Hive shell create a table schema:
-  
-        hive> create table customers(FirstName string, LastName string, Company string, Address string, City string, County string, State string, Zip string, Phone string, Fax string, Email string, Web string) row format delimited fields terminated by ',' stored as textfile;
-  3. Issue the following command to load the customer data into the customers table:  
-
-        hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;`
-  4. Issue `quit` or `exit` to leave the Hive shell.
-  5. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
-  6. Issue the following query to Drill to get the first and last names of the first ten customers in the Hive table:  
-
-        0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.`customers` limit 10;`
-
-     The query returns the following results:
-     
-        +------------+------------+
-        | firstname  |  lastname  |
-        +------------+------------+
-        | Essie      | Vaill      |
-        | Cruz       | Roudabush  |
-        | Billie     | Tinnes     |
-        | Zackary    | Mockus     |
-        | Rosemarie  | Fifield    |
-        | Bernard    | Laboy      |
-        | Sue        | Haakinson  |
-        | Valerie    | Pou        |
-        | Lashawn    | Hasty      |
-        | Marianne   | Earman     |
-        +------------+------------+
-        10 rows selected (1.5 seconds)
-        0: jdbc:drill:schema=hiveremote>
-

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/005-query-hive.md
----------------------------------------------------------------------
diff --git a/_docs/query/005-query-hive.md b/_docs/query/005-query-hive.md
new file mode 100644
index 0000000..01be576
--- /dev/null
+++ b/_docs/query/005-query-hive.md
@@ -0,0 +1,45 @@
+---
+title: "Querying Hive"
+parent: "Query Data"
+---
+This is a simple exercise that provides steps for creating a Hive table and
+inserting data that you can query using Drill. Before you perform the steps,
+download the
+[customers.csv](http://doc.mapr.com/download/attachments/22906623/customers.csv?api=v2) file.
+
+To create a Hive table and query it with Drill, complete the following steps:
+
+  1. Issue the following command to start the Hive shell:
+  
+        hive
+  2. Issue the following command from the Hive shell to create a table schema:
+  
+        hive> create table customers(FirstName string, LastName string, Company string, Address string, City string, County string, State string, Zip string, Phone string, Fax string, Email string, Web string) row format delimited fields terminated by ',' stored as textfile;
+  3. Issue the following command to load the customer data into the customers table:  
+
+        hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;
+  4. Issue `quit` or `exit` to leave the Hive shell.
+  5. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
+  6. Issue the following query to Drill to get the first and last names of the first ten customers in the Hive table:  
+
+        0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.`customers` limit 10;
+
+     The query returns the following results:
+     
+        +------------+------------+
+        | firstname  |  lastname  |
+        +------------+------------+
+        | Essie      | Vaill      |
+        | Cruz       | Roudabush  |
+        | Billie     | Tinnes     |
+        | Zackary    | Mockus     |
+        | Rosemarie  | Fifield    |
+        | Bernard    | Laboy      |
+        | Sue        | Haakinson  |
+        | Valerie    | Pou        |
+        | Lashawn    | Hasty      |
+        | Marianne   | Earman     |
+        +------------+------------+
+        10 rows selected (1.5 seconds)
+        0: jdbc:drill:schema=hiveremote>
+

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/005-query-info-skema.md
----------------------------------------------------------------------
diff --git a/_docs/query/005-query-info-skema.md b/_docs/query/005-query-info-skema.md
deleted file mode 100644
index 1ad0008..0000000
--- a/_docs/query/005-query-info-skema.md
+++ /dev/null
@@ -1,109 +0,0 @@
----
-title: "Querying the INFORMATION SCHEMA"
-parent: "Query Data"
----
-When you are using Drill to connect to multiple data sources, you need a
-simple mechanism to discover what each data source contains. The information
-schema is an ANSI standard set of metadata tables that you can query to return
-information about all of your Drill data sources (or schemas). Data sources
-may be databases or file systems; they are all known as "schemas" in this
-context. You can query the following INFORMATION_SCHEMA tables:
-
-  * SCHEMATA
-  * CATALOGS
-  * TABLES
-  * COLUMNS 
-  * VIEWS
-
-## SCHEMATA
-
-The SCHEMATA table contains the CATALOG_NAME and SCHEMA_NAME columns. To allow
-maximum flexibility inside BI tools, the only catalog that Drill supports is
-`DRILL`.
-
-    0: jdbc:drill:zk=local> select CATALOG_NAME, SCHEMA_NAME as all_my_data_sources from INFORMATION_SCHEMA.SCHEMATA order by SCHEMA_NAME;
-    +--------------+---------------------+
-    | CATALOG_NAME | all_my_data_sources |
-    +--------------+---------------------+
-    | DRILL        | INFORMATION_SCHEMA  |
-    | DRILL        | cp.default          |
-    | DRILL        | dfs.default         |
-    | DRILL        | dfs.root            |
-    | DRILL        | dfs.tmp             |
-    | DRILL        | HiveTest.SalesDB    |
-    | DRILL        | maprfs.logs         |
-    | DRILL        | sys                 |
-    +--------------+---------------------+
-
-The INFORMATION_SCHEMA name and associated keywords are case-sensitive. You
-can also return a list of schemas by running the SHOW DATABASES command:
-
-    0: jdbc:drill:zk=local> show databases;
-    +-------------+
-    | SCHEMA_NAME |
-    +-------------+
-    | dfs.default |
-    | dfs.root    |
-    | dfs.tmp     |
-    ...
-
-## CATALOGS
-
-The CATALOGS table returns only one row, with the hardcoded DRILL catalog name
-and description.
-
-## TABLES
-
-The TABLES table returns the table name and type for each table or view in
-your databases. (Type means TABLE or VIEW.) Note that Drill does not return
-files available for querying in file-based data sources. Instead, use SHOW
-FILES to explore these data sources.
-
-## COLUMNS
-
-The COLUMNS table returns the column name and other metadata (such as the data
-type) for each column in each table or view.
-
-## VIEWS
-
-The VIEWS table returns the name and definition for each view in your
-databases. Note that file schemas are the canonical repository for views in
-Drill. Depending on how you create a view, the may only be displayed in Drill
-after it has been used.
-
-## Useful Queries
-
-Run an ``INFORMATION_SCHEMA.`TABLES` ``query to view all of the tables and views
-within a database. TABLES is a reserved word in Drill and requires back ticks
-(`).
-
-For example, the following query identifies all of the tables and views that
-Drill can access:
-
-    SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
-    FROM INFORMATION_SCHEMA.`TABLES`
-    ORDER BY TABLE_NAME DESC;
-    ----------------------------------------------------------------
-    TABLE_SCHEMA             TABLE_NAME            TABLE_TYPE
-    ----------------------------------------------------------------
-    HiveTest.CustomersDB     Customers             TABLE
-    HiveTest.SalesDB         Orders                TABLE
-    HiveTest.SalesDB         OrderLines            TABLE
-    HiveTest.SalesDB         USOrders              VIEW
-    dfs.default              CustomerSocialProfile VIEW
-    ----------------------------------------------------------------
-
-**Note:** Currently, Drill only supports querying Drill views; Hive views are not yet supported.
-
-You can run a similar query to identify columns in tables and the data types
-of those columns:
-
-    SELECT COLUMN_NAME, DATA_TYPE 
-    FROM INFORMATION_SCHEMA.COLUMNS 
-    WHERE TABLE_NAME = 'Orders' AND TABLE_SCHEMA = 'HiveTest.SalesDB' AND COLUMN_NAME LIKE '%Total';
-    +-------------+------------+
-    | COLUMN_NAME | DATA_TYPE  |
-    +-------------+------------+
-    | OrderTotal  | Decimal    |
-    +-------------+------------+
-

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/006-query-info-skema.md
----------------------------------------------------------------------
diff --git a/_docs/query/006-query-info-skema.md b/_docs/query/006-query-info-skema.md
new file mode 100644
index 0000000..1ad0008
--- /dev/null
+++ b/_docs/query/006-query-info-skema.md
@@ -0,0 +1,109 @@
+---
+title: "Querying the INFORMATION SCHEMA"
+parent: "Query Data"
+---
+When you are using Drill to connect to multiple data sources, you need a
+simple mechanism to discover what each data source contains. The information
+schema is an ANSI standard set of metadata tables that you can query to return
+information about all of your Drill data sources (or schemas). Data sources
+may be databases or file systems; they are all known as "schemas" in this
+context. You can query the following INFORMATION_SCHEMA tables:
+
+  * SCHEMATA
+  * CATALOGS
+  * TABLES
+  * COLUMNS 
+  * VIEWS
+
+## SCHEMATA
+
+The SCHEMATA table contains the CATALOG_NAME and SCHEMA_NAME columns. To allow
+maximum flexibility inside BI tools, the only catalog that Drill supports is
+`DRILL`.
+
+    0: jdbc:drill:zk=local> select CATALOG_NAME, SCHEMA_NAME as all_my_data_sources from INFORMATION_SCHEMA.SCHEMATA order by SCHEMA_NAME;
+    +--------------+---------------------+
+    | CATALOG_NAME | all_my_data_sources |
+    +--------------+---------------------+
+    | DRILL        | INFORMATION_SCHEMA  |
+    | DRILL        | cp.default          |
+    | DRILL        | dfs.default         |
+    | DRILL        | dfs.root            |
+    | DRILL        | dfs.tmp             |
+    | DRILL        | HiveTest.SalesDB    |
+    | DRILL        | maprfs.logs         |
+    | DRILL        | sys                 |
+    +--------------+---------------------+
+
+The INFORMATION_SCHEMA name and associated keywords are case-sensitive. You
+can also return a list of schemas by running the SHOW DATABASES command:
+
+    0: jdbc:drill:zk=local> show databases;
+    +-------------+
+    | SCHEMA_NAME |
+    +-------------+
+    | dfs.default |
+    | dfs.root    |
+    | dfs.tmp     |
+    ...
+
+## CATALOGS
+
+The CATALOGS table returns only one row, with the hardcoded DRILL catalog name
+and description.
+
+## TABLES
+
+The TABLES table returns the table name and type for each table or view in
+your databases. (Type means TABLE or VIEW.) Note that Drill does not return
+files available for querying in file-based data sources. Instead, use SHOW
+FILES to explore these data sources.
+
+## COLUMNS
+
+The COLUMNS table returns the column name and other metadata (such as the data
+type) for each column in each table or view.
+
+## VIEWS
+
+The VIEWS table returns the name and definition for each view in your
+databases. Note that file schemas are the canonical repository for views in
+Drill. Depending on how you create a view, the view may only be displayed in Drill
+after it has been used.
+
+## Useful Queries
+
+Run an ``INFORMATION_SCHEMA.`TABLES` `` query to view all of the tables and views
+within a database. TABLES is a reserved word in Drill and requires back ticks
+(`).
+
+For example, the following query identifies all of the tables and views that
+Drill can access:
+
+    SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
+    FROM INFORMATION_SCHEMA.`TABLES`
+    ORDER BY TABLE_NAME DESC;
+    ----------------------------------------------------------------
+    TABLE_SCHEMA             TABLE_NAME            TABLE_TYPE
+    ----------------------------------------------------------------
+    HiveTest.CustomersDB     Customers             TABLE
+    HiveTest.SalesDB         Orders                TABLE
+    HiveTest.SalesDB         OrderLines            TABLE
+    HiveTest.SalesDB         USOrders              VIEW
+    dfs.default              CustomerSocialProfile VIEW
+    ----------------------------------------------------------------
+
+**Note:** Currently, Drill only supports querying Drill views; Hive views are not yet supported.
+
+You can run a similar query to identify columns in tables and the data types
+of those columns:
+
+    SELECT COLUMN_NAME, DATA_TYPE 
+    FROM INFORMATION_SCHEMA.COLUMNS 
+    WHERE TABLE_NAME = 'Orders' AND TABLE_SCHEMA = 'HiveTest.SalesDB' AND COLUMN_NAME LIKE '%Total';
+    +-------------+------------+
+    | COLUMN_NAME | DATA_TYPE  |
+    +-------------+------------+
+    | OrderTotal  | Decimal    |
+    +-------------+------------+
+

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/006-query-sys-tbl.md
----------------------------------------------------------------------
diff --git a/_docs/query/006-query-sys-tbl.md b/_docs/query/006-query-sys-tbl.md
deleted file mode 100644
index 9b853ec..0000000
--- a/_docs/query/006-query-sys-tbl.md
+++ /dev/null
@@ -1,159 +0,0 @@
----
-title: "Querying System Tables"
-parent: "Query Data"
----
-Drill has a sys database that contains system tables. You can query the system
-tables for information about Drill, including Drill ports, the Drill version
-running on the system, and available Drill options. View the databases in
-Drill to identify the sys database, and then use the sys database to view
-system tables that you can query.
-
-## View Drill Databases
-
-Issue the `SHOW DATABASES` command to view Drill databases.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> show databases;
-    +-------------+
-    | SCHEMA_NAME |
-    +-------------+
-    | M7          |
-    | hive.default|
-    | dfs.default |
-    | dfs.root    |
-    | dfs.views   |
-    | dfs.tmp     |
-    | dfs.tpcds   |
-    | sys         |
-    | cp.default  |
-    | hbase       |
-    | INFORMATION_SCHEMA |
-    +-------------+
-    11 rows selected (0.162 seconds)
-
-Drill returns `sys` in the database results.
-
-## Use the Sys Database
-
-Issue the `USE` command to select the sys database for subsequent SQL
-requests.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> use sys;
-    +------------+--------------------------------+
-    |   ok     |  summary                         |
-    +------------+--------------------------------+
-    | true     | Default schema changed to 'sys'  |
-    +------------+--------------------------------+
-    1 row selected (0.101 seconds)
-
-## View Tables
-
-Issue the `SHOW TABLES` command to view the tables in the sys database.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> show tables;
-    +--------------+------------+
-    | TABLE_SCHEMA | TABLE_NAME |
-    +--------------+------------+
-    | sys          | drillbits  |
-    | sys          | version    |
-    | sys          | options    |
-    +--------------+------------+
-    3 rows selected (0.934 seconds)
-    0: jdbc:drill:zk=10.10.100.113:5181>
-
-## Query System Tables
-
-Query the drillbits, version, and options tables in the sys database.
-
-###Query the drillbits table.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from drillbits;
-    +------------------+------------+--------------+------------+---------+
-    |   host            | user_port | control_port | data_port  |  current|
-    +-------------------+------------+--------------+------------+--------+
-    | qa-node115.qa.lab | 31010     | 31011        | 31012      | true    |
-    | qa-node114.qa.lab | 31010     | 31011        | 31012      | false   |
-    | qa-node116.qa.lab | 31010     | 31011        | 31012      | false   |
-    +------------+------------+--------------+------------+---------------+
-    3 rows selected (0.146 seconds)
-
-  * host   
-The name of the node running the Drillbit service.
-  * user-port  
-The user port address, used between nodes in a cluster for connecting to
-external clients and for the Drill Web UI.  
-  * control_port  
-The control port address, used between nodes for multi-node installation of
-Apache Drill.
-  * data_port  
-The data port address, used between nodes for multi-node installation of
-Apache Drill.
-  * current  
-True means the Drillbit is connected to the session or client running the
-query. This Drillbit is the Foreman for the current session.  
-
-### Query the version table.
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from version;
-    +------------+----------------+-------------+-------------+------------+
-    | commit_id  | commit_message | commit_time | build_email | build_time |
-    +------------+----------------+-------------+-------------+------------+
-    | 108d29fce3d8465d619d45db5f6f433ca3d97619 | DRILL-1635: Additional fix for validation exceptions. | 14.11.2014 @ 02:32:47 UTC | Unknown    | 14.11.2014 @ 03:56:07 UTC |
-    +------------+----------------+-------------+-------------+------------+
-    1 row selected (0.144 seconds)
-  * commit_id  
-The github id of the release you are running. For example, <https://github.com
-/apache/drill/commit/e3ab2c1760ad34bda80141e2c3108f7eda7c9104>
-  * commit_message  
-The message explaining the change.
-  * commit_time  
-The date and time of the change.
-  * build_email  
-The email address of the person who made the change, which is unknown in this
-example.
-  * build_time  
-The time that the release was built.
-
-### Query the options table.
-
-Drill provides system, session, and boot options that you can query.
-
-The following example shows a query on the system options:
-
-    0: jdbc:drill:zk=10.10.100.113:5181> select * from options where type='SYSTEM' limit 10;
-    +------------+------------+------------+------------+------------+------------+------------+
-    |    name   |   kind    |   type    |  num_val   | string_val |  bool_val  | float_val  |
-    +------------+------------+------------+------------+------------+------------+------------+
-    | exec.max_hash_table_size | LONG       | SYSTEM    | 1073741824 | null     | null      | null      |
-    | planner.memory.max_query_memory_per_node | LONG       | SYSTEM    | 2048       | null     | null      | null      |
-    | planner.join.row_count_estimate_factor | DOUBLE   | SYSTEM    | null      | null      | null      | 1.0       |
-    | planner.affinity_factor | DOUBLE  | SYSTEM    | null      | null      | null       | 1.2      |
-    | exec.errors.verbose | BOOLEAN | SYSTEM    | null      | null      | false      | null     |
-    | planner.disable_exchanges | BOOLEAN   | SYSTEM    | null      | null      | false      | null     |
-    | exec.java_compiler_debug | BOOLEAN    | SYSTEM    | null      | null      | true      | null      |
-    | exec.min_hash_table_size | LONG       | SYSTEM    | 65536     | null      | null      | null       |
-    | exec.java_compiler_janino_maxsize | LONG       | SYSTEM   | 262144    | null      | null      | null      |
-    | planner.enable_mergejoin | BOOLEAN    | SYSTEM    | null      | null      | true      | null       |
-    +------------+------------+------------+------------+------------+------------+------------+
-    10 rows selected (0.334 seconds)  
-  * name  
-The name of the option.
-  * kind  
-The data type of the option value.
-  * type  
-The type of options in the output: system, session, or boot.
-  * num_val  
-The default value, which is of the long or int data type; otherwise, null.
-  * string_val  
-The default value, which is a string; otherwise, null.
-  * bool_val  
-The default value, which is true or false; otherwise, null.
-  * float_val  
-The default value, which is of the double, float, or long double data type;
-otherwise, null.
-
-For information about how to configure Drill system and session options, see[
-Planning and Execution Options](/drill/docs/planning-and-execution-options).
-
-For information about how to configure Drill start-up options, see[ Start-Up
-Options](/drill/docs/start-up-options).
-

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/007-query-sys-tbl.md
----------------------------------------------------------------------
diff --git a/_docs/query/007-query-sys-tbl.md b/_docs/query/007-query-sys-tbl.md
new file mode 100644
index 0000000..9b853ec
--- /dev/null
+++ b/_docs/query/007-query-sys-tbl.md
@@ -0,0 +1,159 @@
+---
+title: "Querying System Tables"
+parent: "Query Data"
+---
+Drill has a sys database that contains system tables. You can query the system
+tables for information about Drill, including Drill ports, the Drill version
+running on the system, and available Drill options. View the databases in
+Drill to identify the sys database, and then use the sys database to view
+system tables that you can query.
+
+## View Drill Databases
+
+Issue the `SHOW DATABASES` command to view Drill databases.
+
+    0: jdbc:drill:zk=10.10.100.113:5181> show databases;
+    +-------------+
+    | SCHEMA_NAME |
+    +-------------+
+    | M7          |
+    | hive.default|
+    | dfs.default |
+    | dfs.root    |
+    | dfs.views   |
+    | dfs.tmp     |
+    | dfs.tpcds   |
+    | sys         |
+    | cp.default  |
+    | hbase       |
+    | INFORMATION_SCHEMA |
+    +-------------+
+    11 rows selected (0.162 seconds)
+
+Drill returns `sys` in the database results.
+
+## Use the Sys Database
+
+Issue the `USE` command to select the sys database for subsequent SQL
+requests.
+
+    0: jdbc:drill:zk=10.10.100.113:5181> use sys;
+    +------------+--------------------------------+
+    |   ok     |  summary                         |
+    +------------+--------------------------------+
+    | true     | Default schema changed to 'sys'  |
+    +------------+--------------------------------+
+    1 row selected (0.101 seconds)
+
+## View Tables
+
+Issue the `SHOW TABLES` command to view the tables in the sys database.
+
+    0: jdbc:drill:zk=10.10.100.113:5181> show tables;
+    +--------------+------------+
+    | TABLE_SCHEMA | TABLE_NAME |
+    +--------------+------------+
+    | sys          | drillbits  |
+    | sys          | version    |
+    | sys          | options    |
+    +--------------+------------+
+    3 rows selected (0.934 seconds)
+    0: jdbc:drill:zk=10.10.100.113:5181>
+
+## Query System Tables
+
+Query the drillbits, version, and options tables in the sys database.
+
+### Query the drillbits table.
+
+    0: jdbc:drill:zk=10.10.100.113:5181> select * from drillbits;
+    +-------------------+-----------+--------------+------------+---------+
+    | host              | user_port | control_port | data_port  | current |
+    +-------------------+-----------+--------------+------------+---------+
+    | qa-node115.qa.lab | 31010     | 31011        | 31012      | true    |
+    | qa-node114.qa.lab | 31010     | 31011        | 31012      | false   |
+    | qa-node116.qa.lab | 31010     | 31011        | 31012      | false   |
+    +-------------------+-----------+--------------+------------+---------+
+    3 rows selected (0.146 seconds)
+
+  * host   
+The name of the node running the Drillbit service.
+  * user_port  
+The user port address, used between nodes in a cluster for connecting to
+external clients and for the Drill Web UI.  
+  * control_port  
+The control port address, used between nodes for multi-node installation of
+Apache Drill.
+  * data_port  
+The data port address, used between nodes for multi-node installation of
+Apache Drill.
+  * current  
+True means the Drillbit is connected to the session or client running the
+query. This Drillbit is the Foreman for the current session.  
+
+### Query the version table.
+
+    0: jdbc:drill:zk=10.10.100.113:5181> select * from version;
+    +------------+----------------+-------------+-------------+------------+
+    | commit_id  | commit_message | commit_time | build_email | build_time |
+    +------------+----------------+-------------+-------------+------------+
+    | 108d29fce3d8465d619d45db5f6f433ca3d97619 | DRILL-1635: Additional fix for validation exceptions. | 14.11.2014 @ 02:32:47 UTC | Unknown    | 14.11.2014 @ 03:56:07 UTC |
+    +------------+----------------+-------------+-------------+------------+
+    1 row selected (0.144 seconds)
+  * commit_id  
+The GitHub commit ID of the release you are running. For example,
+<https://github.com/apache/drill/commit/e3ab2c1760ad34bda80141e2c3108f7eda7c9104>
+  * commit_message  
+The message explaining the change.
+  * commit_time  
+The date and time of the change.
+  * build_email  
+The email address of the person who made the change, which is unknown in this
+example.
+  * build_time  
+The time that the release was built.
+
+### Query the options table.
+
+Drill provides system, session, and boot options that you can query.
+
+The following example shows a query on the system options:
+
+    0: jdbc:drill:zk=10.10.100.113:5181> select * from options where type='SYSTEM' limit 10;
+    +------------+------------+------------+------------+------------+------------+------------+
+    |    name   |   kind    |   type    |  num_val   | string_val |  bool_val  | float_val  |
+    +------------+------------+------------+------------+------------+------------+------------+
+    | exec.max_hash_table_size | LONG       | SYSTEM    | 1073741824 | null     | null      | null      |
+    | planner.memory.max_query_memory_per_node | LONG       | SYSTEM    | 2048       | null     | null      | null      |
+    | planner.join.row_count_estimate_factor | DOUBLE   | SYSTEM    | null      | null      | null      | 1.0       |
+    | planner.affinity_factor | DOUBLE  | SYSTEM    | null      | null      | null       | 1.2      |
+    | exec.errors.verbose | BOOLEAN | SYSTEM    | null      | null      | false      | null     |
+    | planner.disable_exchanges | BOOLEAN   | SYSTEM    | null      | null      | false      | null     |
+    | exec.java_compiler_debug | BOOLEAN    | SYSTEM    | null      | null      | true      | null      |
+    | exec.min_hash_table_size | LONG       | SYSTEM    | 65536     | null      | null      | null       |
+    | exec.java_compiler_janino_maxsize | LONG       | SYSTEM   | 262144    | null      | null      | null      |
+    | planner.enable_mergejoin | BOOLEAN    | SYSTEM    | null      | null      | true      | null       |
+    +------------+------------+------------+------------+------------+------------+------------+
+    10 rows selected (0.334 seconds)  
+  * name  
+The name of the option.
+  * kind  
+The data type of the option value.
+  * type  
+The type of options in the output: system, session, or boot.
+  * num_val  
+The default value, which is of the long or int data type; otherwise, null.
+  * string_val  
+The default value, which is a string; otherwise, null.
+  * bool_val  
+The default value, which is true or false; otherwise, null.
+  * float_val  
+The default value, which is of the double, float, or long double data type;
+otherwise, null.
+
+For information about how to configure Drill system and session options, see
+[Planning and Execution Options](/drill/docs/planning-and-execution-options).
+
+For information about how to configure Drill start-up options, see
+[Start-Up Options](/drill/docs/start-up-options).
+
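
In the options table described above, each row carries its value in one of the four `*_val` columns, and `kind` says which one. A minimal Python sketch of reading a row that way follows, assuming the rows have already been fetched as dictionaries. The `option_value` helper and the dictionary layout are hypothetical, not a Drill API, and the `STRING` mapping is an assumption because no string-valued row appears in the sample output.

```python
# Hypothetical mapping from the kind column to the column that holds the value.
KIND_TO_COLUMN = {
    "LONG": "num_val",
    "DOUBLE": "float_val",
    "BOOLEAN": "bool_val",
    "STRING": "string_val",  # assumed; not shown in the sample output
}


def option_value(row):
    """Return the populated value of one options row, chosen by its kind."""
    return row[KIND_TO_COLUMN[row["kind"]]]


# Two rows copied from the sample output, written as dictionaries.
rows = [
    {"name": "exec.max_hash_table_size", "kind": "LONG", "type": "SYSTEM",
     "num_val": 1073741824, "string_val": None, "bool_val": None,
     "float_val": None},
    {"name": "planner.affinity_factor", "kind": "DOUBLE", "type": "SYSTEM",
     "num_val": None, "string_val": None, "bool_val": None,
     "float_val": 1.2},
]

for row in rows:
    print(row["name"], "=", option_value(row))
```

Looking the column up by `kind` avoids scanning all four value columns for the one that is not null.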

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/get-started/001-lesson1-connect.md
----------------------------------------------------------------------
diff --git a/_docs/query/get-started/001-lesson1-connect.md b/_docs/query/get-started/001-lesson1-connect.md
new file mode 100644
index 0000000..c4619c3
--- /dev/null
+++ b/_docs/query/get-started/001-lesson1-connect.md
@@ -0,0 +1,88 @@
+---
+title: "Lesson 1: Connect to Data Sources"
+parent: "Getting Started Tutorial"
+---
+This lesson shows how to connect to default data sources that Drill installs
+and configures through storage plugins. You learn how to list the storage
+plugins as you would list databases in SQL.
+
+## List the Storage Plugins
+
+To list the default storage plugins, use the SHOW DATABASES command.
+
+  1. Issue the SHOW DATABASES command.
+    
+        0: jdbc:drill:zk=local> SHOW DATABASES;  
+     The output lists the storage plugins, which you use as SQL databases, in
+`<database>.<workspace>` format.
+
+        +-------------+
+        | SCHEMA_NAME |
+        +-------------+
+        | dfs.default |
+        | dfs.root    |
+        | dfs.tmp     |
+        | cp.default  |
+        | sys         |
+        | INFORMATION_SCHEMA |
+        +-------------+
+        6 rows selected (0.977 seconds)
+
+  2. Take a look at the list of storage plugins and workspaces that Drill recognizes.
+
+* `dfs` is the storage plugin for connecting to the [file system](/drill/docs/querying-a-file-system) data source on your machine.
+* `cp` is a storage plugin for connecting to a JAR data source used with MapR.
+* `sys` is a storage plugin for connecting to Drill [system tables](/drill/docs/querying-system-tables).
+* [INFORMATION_SCHEMA](/drill/docs/querying-the-information-schema) is a storage plugin for connecting to an ANSI standard set of metadata tables.
+
+## List Tables
+
+You choose a storage plugin using the USE command. The output shows the status
+and description of the operation. After connecting to a data source, you can
+list available tables.
+
+  1. Select the `sys` storage plugin.
+  
+          USE sys;
+          +------------+------------+
+          |     ok     |  summary   |
+          +------------+------------+
+          | true       | Default schema changed to 'sys' |
+          +------------+------------+
+          1 row selected (0.034 seconds) 
+  2. List the tables in `sys`.
+  
+          0: jdbc:drill:zk=local> SHOW TABLES;  
+          
+          +--------------+------------+
+          | TABLE_SCHEMA | TABLE_NAME |
+          +--------------+------------+
+          | sys          | drillbits  |
+          | sys          | version    |
+          | sys          | options    |
+          +--------------+------------+
+  3. Select the INFORMATION_SCHEMA storage plugin.
+  
+          0: jdbc:drill:zk=local> USE INFORMATION_SCHEMA;
+ 
+          +------------+------------+
+          |     ok     |  summary   |
+          +------------+------------+
+          | true       | Default schema changed to 'INFORMATION_SCHEMA' |
+          +------------+------------+
+          1 row selected (0.023 seconds)
+  4. List the tables in INFORMATION_SCHEMA.
+
+          0: jdbc:drill:zk=local> SHOW TABLES;  
+ 
+          +--------------+------------+
+          | TABLE_SCHEMA | TABLE_NAME |
+          +--------------+------------+
+          | INFORMATION_SCHEMA | VIEWS      |
+          | INFORMATION_SCHEMA | COLUMNS    |
+          | INFORMATION_SCHEMA | TABLES     |
+          | INFORMATION_SCHEMA | CATALOGS   |
+          | INFORMATION_SCHEMA | SCHEMATA   |
+          +--------------+------------+
+          5 rows selected (0.082 seconds)
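
The schema names listed in this lesson follow the `<database>.<workspace>` format, and some, such as `sys`, have no workspace part. A minimal Python sketch of grouping the names by storage plugin follows, assuming the first dot separates the plugin from the workspace. The `split_schema` helper is hypothetical and is not part of Drill.

```python
def split_schema(schema_name):
    """Split 'dfs.tmp' into ('dfs', 'tmp'); a name without a dot has no workspace."""
    plugin, _, workspace = schema_name.partition(".")
    return plugin, workspace or None


# Schema names copied from the SHOW DATABASES output in this lesson.
schemas = ["dfs.default", "dfs.root", "dfs.tmp", "cp.default", "sys",
           "INFORMATION_SCHEMA"]

# Group workspaces under their storage plugin.
plugins = {}
for name in schemas:
    plugin, workspace = split_schema(name)
    plugins.setdefault(plugin, [])
    if workspace is not None:
        plugins[plugin].append(workspace)

for plugin, workspaces in plugins.items():
    print(plugin, workspaces)
```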

http://git-wip-us.apache.org/repos/asf/drill/blob/2a34ac89/_docs/query/get-started/002-lesson2-download.md
----------------------------------------------------------------------
diff --git a/_docs/query/get-started/002-lesson2-download.md b/_docs/query/get-started/002-lesson2-download.md
new file mode 100644
index 0000000..c936ab4
--- /dev/null
+++ b/_docs/query/get-started/002-lesson2-download.md
@@ -0,0 +1,103 @@
+---
+title: "Lesson 2: Query Plain Text"
+parent: "Getting Started Tutorial"
+---
+This lesson shows you how to query a plain text file. Drill handles plain text
+files and directories like standard SQL tables and can infer knowledge about
+the schema of the data. No setup is required. For example, you do not need to
+perform extract, transform, and load (ETL) operations on the data source.
+Exercises in the tutorial demonstrate the general guidelines for querying a
+plain text file:
+
+  * Use a storage plugin that defines the file format, such as comma-separated (CSV) or tab-separated values (TSV), of the data in the plain text file.
+  * In the SELECT statement, use the `COLUMNS[n]` syntax in lieu of column names, which do not exist in a plain text file. The first column is column `0`.
+  * In the FROM clause, use the path to the plain text file instead of using a table name. Enclose the path and file name in backticks. 
+
+## Prerequisites
+
+This lesson uses a tab-separated values (TSV) file that you download from a
+Google internet site. The data in the file consists of phrases from books that
+Google scans and generates for its [Google Books Ngram
+Viewer](http://storage.googleapis.com/books/ngrams/books/datasetsv2.html). You
+use the data to find the relative frequencies of Ngrams.
+
+## About the Data
+
+Each line in the TSV file has the following structure:
+
+`ngram TAB year TAB match_count TAB volume_count NEWLINE`
+
+For example, lines 1722089 and 1722090 in the file contain this data:
+
+<table ><tbody><tr><th >ngram</th><th >year</th><th colspan="1" >match_count</th><th >volume_count</th></tr><tr><td ><p class="p1">Zoological Journal of the Linnean</p></td><td >2007</td><td colspan="1" >284</td><td >101</td></tr><tr><td colspan="1" ><p class="p1">Zoological Journal of the Linnean</p></td><td colspan="1" >2008</td><td colspan="1" >257</td><td colspan="1" >87</td></tr></tbody></table> 
+  
+In 2007, "Zoological Journal of the Linnean" occurred 284 times overall in 101
+distinct books of the Google sample.
+
+## Download and Set Up the Data
+
+After downloading the file, you use the `dfs` storage plugin, and then select
+data from the file as you would a table. In the SELECT statement, enclose the
+path and name of the file in backticks.
+
+  1. Download the compressed Google Ngram data from this location:  
+    
+     http://storage.googleapis.com/books/ngrams/books/googlebooks-eng-all-5gram-20120701-zo.gz
+
+  2. Unzip the file.  
+     A file named googlebooks-eng-all-5gram-20120701-zo appears.
+
+  3. Change the file name to add a `.tsv` extension.  
+The Drill `dfs` storage plugin definition includes a TSV format that requires
+a file to have this extension.
+
+## Query the Data
+
+Get data about "Zoological Journal of the Linnean" that appears more than 250
+times a year in the books that Google scans.
+
+  1. Switch back to using the `dfs` storage plugin.
+  
+          USE dfs;
+
+  2. Issue a SELECT statement to get the first three columns in the file. In the FROM clause of the example, substitute your path to the TSV file. In the WHERE clause, enclose the string literal "Zoological Journal of the Linnean" in single quotation marks. Limit the output to 10 rows.
+     
+         SELECT COLUMNS[0], COLUMNS[1], COLUMNS[2]
+         FROM `/Users/drilluser/Downloads/googlebooks-eng-all-5gram-20120701-zo.tsv`
+         WHERE ((columns[0] = 'Zoological Journal of the Linnean')
+           AND (columns[2] > 250)) LIMIT 10;
+           
+     The output is:
+     
+         +------------+------------+------------+
+         |   EXPR$0   |   EXPR$1   |   EXPR$2   |
+         +------------+------------+------------+
+         | Zoological Journal of the Linnean | 1993       | 297        |
+         | Zoological Journal of the Linnean | 1997       | 255        |
+         | Zoological Journal of the Linnean | 2003       | 254        |
+         | Zoological Journal of the Linnean | 2007       | 284        |
+         | Zoological Journal of the Linnean | 2008       | 257        |
+         +------------+------------+------------+
+         5 rows selected (1.599 seconds)
+
+  3. Repeat the query using aliases to replace the column headers, such as EXPR$0, with user-friendly column headers: Ngram, Publication_Date, and Frequency. In the FROM clause of the example, substitute your path to the TSV file. 
+  
+         SELECT COLUMNS[0] AS Ngram,
+                COLUMNS[1] AS Publication_Date,
+                COLUMNS[2] AS Frequency
+         FROM `/Users/drilluser/Downloads/googlebooks-eng-all-5gram-20120701-zo.tsv`
+         WHERE ((columns[0] = 'Zoological Journal of the Linnean')
+             AND (columns[2] > 250)) LIMIT 10;
+
+     The improved output is:
+
+         +------------+------------------+------------+
+         |   Ngram    | Publication_Date | Frequency  |
+         +------------+------------------+------------+
+         | Zoological Journal of the Linnean | 1993             | 297        |
+         | Zoological Journal of the Linnean | 1997             | 255        |
+         | Zoological Journal of the Linnean | 2003             | 254        |
+         | Zoological Journal of the Linnean | 2007             | 284        |
+         | Zoological Journal of the Linnean | 2008             | 257        |
+         +------------+------------------+------------+
+         5 rows selected (1.628 seconds)
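
The `COLUMNS[n]` query in this lesson treats each tab-separated field as a positional column and filters on the first and third fields. A minimal Python sketch of the same selection follows, run against the two sample lines quoted in the "About the Data" section rather than the full download. The `query` helper is hypothetical, and the explicit `int()` conversion is this sketch's own choice for comparing the text field with a number.

```python
import csv
import io

# The two sample lines from "About the Data", tab-separated.
SAMPLE = (
    "Zoological Journal of the Linnean\t2007\t284\t101\n"
    "Zoological Journal of the Linnean\t2008\t257\t87\n"
)


def query(lines, ngram, min_count):
    """Select fields 0 to 2 where field 0 equals ngram and field 2 > min_count."""
    result = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for columns in reader:
        # Every field arrives as text, so convert the count before comparing.
        if columns[0] == ngram and int(columns[2]) > min_count:
            result.append((columns[0], columns[1], columns[2]))
    return result


rows = query(io.StringIO(SAMPLE), "Zoological Journal of the Linnean", 250)
for row in rows:
    print(row)
```

Both sample lines pass the 250 threshold used in the lesson; raising `min_count` to 260 would leave only the 2007 line.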

