drill-commits mailing list archives

From bridg...@apache.org
Subject [11/13] drill git commit: DRILL-2315: Confluence conversion plus fixes
Date Thu, 26 Feb 2015 00:31:15 GMT
http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/design/005-value.md
----------------------------------------------------------------------
diff --git a/_docs/design/005-value.md b/_docs/design/005-value.md
new file mode 100644
index 0000000..828376a
--- /dev/null
+++ b/_docs/design/005-value.md
@@ -0,0 +1,163 @@
+---
+title: "Value Vectors"
+parent: "Design Docs"
+---
+This document defines the data structures required for passing sequences of
+columnar data between [Operators](https://docs.google.com/a/maprtech.com/document/d/1zaxkcrK9mYyfpGwX1kAV80z0PCi8abefL45zOzb97dI/edit#bookmark=id.iip15ful18mm).
+
+## Goals
+
+### Support Operators Written in Multiple Languages
+
+ValueVectors should support operators written in C/C++/Assembly. To support
+this, the underlying ByteBuffer will not require modification when passed
+through the JNI interface. The ValueVector will be considered immutable once
+constructed. Endianness has not yet been considered.
+
+### Access
+
+Reading a random element from a ValueVector must be a constant time operation.
+To accommodate this, elements are identified by their offset from the start of
+the buffer. Repeated, nullable and variable width ValueVectors utilize an
+additional fixed width value vector to index each element. Write access is not
+supported once the ValueVector has been constructed by the RecordBatch.
+
+### Efficient Subsets of Value Vectors
+
+When an operator returns a subset of values from a ValueVector, it should
+reuse the original ValueVector. To accomplish this, a level of indirection is
+introduced to skip over certain values in the vector. This level of
+indirection is a sequence of offsets which reference an offset in the original
+ValueVector and the count of subsequent values which are to be included in the
+subset.
+
+### Pooled Allocation
+
+ValueVectors utilize one or more buffers under the covers. These buffers will
+be drawn from a pool. Value vectors are themselves created and destroyed as a
+schema changes during the course of record iteration.
+
+### Homogeneous Value Types
+
+Each value in a Value Vector is of the same type. The [Record Batch](https://docs.google.com/a/maprtech.com/document/d/1zaxkcrK9mYyfpGwX1kAV80z0PCi8abefL45zOzb97dI/edit#bookmark=kix.s2xuoqnr8obe) implementation is responsible for
+creating a new Value Vector any time there is a change in schema.
+
+## Definitions
+
+### Data Types
+
+The canonical source for value type definitions is the [Drill
+Datatypes](http://bit.ly/15JO9bC) document. The individual types are listed
+under the ‘Basic Data Types’ tab, while the value vector types can be found
+under the ‘Value Vectors’ tab.
+
+### Operators
+
+An operator is responsible for transforming a stream of fields. It operates on
+Record Batches or constant values.
+
+### Record Batch
+
+A set of field values for some range of records. The batch may be composed of
+Value Vectors, in which case each batch consists of exactly one schema.
+
+### Value Vector
+
+The value vector is composed of one or more contiguous buffers; one which
+stores a sequence of values, and zero or more which store any metadata
+associated with the ValueVector.
+
+## Data Structure
+
+A ValueVector stores values in a ByteBuf, which is a contiguous region of
+memory. Additional levels of indirection are used to support variable value
+widths, nullable values, repeated values and selection vectors. These levels
+of indirection are primarily lookup tables which consist of one or more fixed
+width ValueVectors which may be combined (e.g. for nullable, variable width
+values). A fixed width ValueVector of non-nullable, non-repeatable values does
+not require an indirect lookup; elements can be accessed directly by
+multiplying position by stride.
+
+### Fixed Width Values
+
+Fixed width ValueVectors simply contain a packed sequence of values. Random
+access is supported by reading element n at ByteBuf[0] + n * Stride,
+where n is 0-based. The following illustrates the underlying buffer of
+INT4 values [1 .. 6]:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value1.png)
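
The constant-time access described above can be sketched in plain Java. The class and method names here are illustrative, not Drill's actual ValueVector API, which wraps a Netty ByteBuf:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FixedWidthAccess {
    static final int STRIDE = 4; // INT4 values are 4 bytes wide

    // Constant-time read of element n: buffer start + n * stride.
    static int getInt4(ByteBuffer buf, int n) {
        return buf.getInt(n * STRIDE);
    }

    // Pack the values [1 .. 6] as in the illustration above.
    static ByteBuffer packedOneToSix() {
        ByteBuffer buf = ByteBuffer.allocate(6 * STRIDE).order(ByteOrder.LITTLE_ENDIAN);
        for (int v = 1; v <= 6; v++) {
            buf.putInt(v);
        }
        return buf;
    }
}
```

No lookup table is involved; the read is a single address computation, which is why the non-nullable, non-repeated fixed width case is the fastest.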
+
+### Nullable Values
+
+Nullable values are represented by a vector of bit values. Each bit in the
+vector corresponds to an element in the ValueVector. If the bit is not set,
+the value is NULL. Otherwise the value is retrieved from the underlying
+buffer. The following illustrates a NullableValueVector of INT4 values 2, 3
+and 6:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value2.png)
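
A minimal sketch of that null-bit check, with the bit vector modeled as a boolean array (illustrative names, not the actual NullableValueVector API):

```java
class NullableInt4 {
    final boolean[] bits;  // one entry per element; false means NULL
    final int[] values;    // underlying fixed width buffer

    NullableInt4(boolean[] bits, int[] values) {
        this.bits = bits;
        this.values = values;
    }

    // Check the bit first; only read the value buffer when it is set.
    Integer get(int n) {
        return bits[n] ? values[n] : null;
    }
}
```

For the figure above, the positions holding 2, 3 and 6 have their bits set; every other position reads as NULL regardless of what the value buffer contains.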
+  
+### Repeated Values
+
+A repeated ValueVector is used for elements which can contain multiple values
+(e.g. a JSON array). A table of offset and count pairs is used to represent
+each repeated element in the ValueVector. A count of zero means the element
+has no values (note the offset field is unused in this case). The following
+illustrates three fields; one with two values, one with no values, and one
+with a single value:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value3.png)
+
+ValueVector Representation of the equivalent JSON:
+
+x:[1, 2]
+
+x:[ ]
+
+x:[3]
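
The offset/count lookup for the JSON above can be sketched as follows (illustrative layout, not Drill's actual repeated vector API):

```java
import java.util.Arrays;

class RepeatedInt4 {
    final int[] offsets; // start of each element's run in the value buffer
    final int[] counts;  // values per element; 0 means empty (offset unused)
    final int[] values;  // packed underlying values

    RepeatedInt4(int[] offsets, int[] counts, int[] values) {
        this.offsets = offsets;
        this.counts = counts;
        this.values = values;
    }

    // Materialize the run of values for element n.
    int[] get(int n) {
        return Arrays.copyOfRange(values, offsets[n], offsets[n] + counts[n]);
    }
}
```

The three JSON fields map to offsets {0, 0, 2} and counts {2, 0, 1} over the packed values {1, 2, 3}; the middle element's offset is never read because its count is zero.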
+
+### Variable Width Values
+
+Variable width values are stored contiguously in a ByteBuf. Each element is
+represented by an entry in a fixed width ValueVector of offsets. The length of
+an entry is the difference between its offset and the offset of the following
+entry. Because of this, the offset table always contains one more entry than
+there are elements, with the last entry pointing to the end of the buffer.
+
+![drill query flow]({{ site.baseurl }}/docs/img/value4.png)  
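
The offset arithmetic can be sketched as follows; the n+1 offset entries make the length of entry n simply offsets[n+1] - offsets[n] (illustrative, not the actual VarCharVector API):

```java
import java.nio.charset.StandardCharsets;

class VarWidthVector {
    final byte[] data;   // values stored contiguously
    final int[] offsets; // one more entry than elements; last marks buffer end

    VarWidthVector(String[] strings) {
        offsets = new int[strings.length + 1];
        byte[][] encoded = new byte[strings.length][];
        int total = 0;
        for (int i = 0; i < strings.length; i++) {
            encoded[i] = strings[i].getBytes(StandardCharsets.UTF_8);
            offsets[i] = total;
            total += encoded[i].length;
        }
        offsets[strings.length] = total; // points to the end of the buffer
        data = new byte[total];
        for (int i = 0; i < strings.length; i++) {
            System.arraycopy(encoded[i], 0, data, offsets[i], encoded[i].length);
        }
    }

    String get(int n) {
        // Length is deduced from the following offset entry.
        return new String(data, offsets[n], offsets[n + 1] - offsets[n],
                StandardCharsets.UTF_8);
    }
}
```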
+
+### Repeated Map Vectors
+
+A repeated map vector contains one or more maps (akin to an array of objects
+in JSON). The values of each field in the map are stored contiguously within a
+ByteBuf. To access a specific record, a lookup table of count and offset pairs
+is used. This lookup table points to the first repeated field in each column,
+while the count indicates the maximum number of elements for the column. The
+following example illustrates a RepeatedMap with two records; one with two
+objects, and one with a single object:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value5.png)
+
+ValueVector representation of the equivalent JSON:
+
+x: [ {name:'Sam', age:1}, {name:'Max', age:2} ]
+
+x: [ {name:'Joe', age:3} ]
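
A sketch of that layout for the two records above: each map field is stored as a contiguous per-column vector, and one count/offset table is shared across the record's columns (illustrative names, not Drill's RepeatedMapVector API):

```java
class RepeatedMapSketch {
    // Lookup table: record n's maps start at offsets[n] and span counts[n] entries.
    final int[] offsets = {0, 2};
    final int[] counts  = {2, 1};

    // Per-field columns, stored contiguously across all records.
    final String[] name = {"Sam", "Max", "Joe"};
    final int[]    age  = {1, 2, 3};

    // Field values of the j-th map inside record n.
    String nameAt(int n, int j) { return name[offsets[n] + j]; }
    int    ageAt(int n, int j)  { return age[offsets[n] + j]; }
}
```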
+
+### Selection Vectors
+
+A Selection Vector represents a subset of a ValueVector. It is implemented
+with a list of offsets which identify each element in the ValueVector to be
+included in the SelectionVector. In the case of a fixed width ValueVector, the
+offsets reference the underlying ByteBuf. In the case of a nullable, repeated
+or variable width ValueVector, the offset references the corresponding lookup
+table. The following illustrates a SelectionVector of INT4 (fixed width)
+values 2, 3 and 5 from the original vector of [1 .. 6]:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value6.png)
+
+The following illustrates the same ValueVector with nullable fields:
+
+![drill query flow]({{ site.baseurl }}/docs/img/value7.png)
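
The subset mechanics for the fixed width case can be sketched as one array of positions read through a level of indirection (illustrative, not the actual SelectionVector classes):

```java
class SelectionSketch {
    final int[] underlying; // original fixed width ValueVector
    final int[] selection;  // positions of the elements to include

    SelectionSketch(int[] underlying, int[] selection) {
        this.underlying = underlying;
        this.selection = selection;
    }

    // Element i of the subset: one extra lookup, original buffer untouched.
    int get(int i) {
        return underlying[selection[i]];
    }
}
```

Selecting values 2, 3 and 5 from [1 .. 6] stores only the positions {1, 2, 4}; the original vector is reused rather than copied, which is the point of the indirection.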
+
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/dev-custom-fcn/001-dev-simple.md
----------------------------------------------------------------------
diff --git a/_docs/dev-custom-fcn/001-dev-simple.md b/_docs/dev-custom-fcn/001-dev-simple.md
new file mode 100644
index 0000000..ebf3831
--- /dev/null
+++ b/_docs/dev-custom-fcn/001-dev-simple.md
@@ -0,0 +1,50 @@
+---
+title: "Develop a Simple Function"
+parent: "Develop Custom Functions"
+---
+Create a class within a Java package that implements Drill’s simple function
+interface, and include the required information for the function type. Your
+function must use data types that Drill supports, such as int or BigInt. For a
+list of supported data types, refer to the [SQL Reference](/drill/docs/sql-reference).
+
+Complete the following steps to develop a simple function using Drill’s simple
+function interface:
+
+  1. Create a Maven project and add the following dependency:
+  
+		<dependency>
+		<groupId>org.apache.drill.exec</groupId>
+		<artifactId>drill-java-exec</artifactId>
+		<version>1.0.0-m2-incubating-SNAPSHOT</version>
+		</dependency>
+
+  2. Create a class that implements the `DrillSimpleFunc` interface and identify the scope as `FunctionScope.SIMPLE`.
+
+	**Example**
+	
+		@FunctionTemplate(name = "myaddints", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL)
+		  public static class IntIntAdd implements DrillSimpleFunc {
+
+  3. Provide the variables used in the code in the `Param` and `Output` bit holders.
+
+	**Example**
+	
+		@Param IntHolder in1;
+		@Param IntHolder in2;
+		@Output IntHolder out;
+
+  4. Add the code that performs operations for the function in the `eval()` method.
+
+	**Example**
+	
+		public void setup(RecordBatch b) {
+		}
+		public void eval() {
+		 out.value = (int) (in1.value + in2.value);
+		}
+
+  5. Use the maven-source-plugin to build the sources and classes JAR files. Verify that an empty `drill-module.conf` is included in the resources folder of the JARs.  
+Drill searches this module during classpath scanning. If the file is not
+included in the resources folder, you can add it to the JAR file or add it to
+`etc/drill/conf`.
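
For step 5, a minimal `pom.xml` fragment wiring in the maven-source-plugin might look like the following. This is a sketch; the execution id and goal binding shown here are common defaults, not taken from the Drill build:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
      <executions>
        <execution>
          <id>attach-sources</id>
          <goals>
            <goal>jar-no-fork</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With this in place, `mvn package` produces both the classes JAR and a `-sources` JAR in `target/`.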
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/dev-custom-fcn/002-dev-aggregate.md
----------------------------------------------------------------------
diff --git a/_docs/dev-custom-fcn/002-dev-aggregate.md b/_docs/dev-custom-fcn/002-dev-aggregate.md
new file mode 100644
index 0000000..d1a3cfb
--- /dev/null
+++ b/_docs/dev-custom-fcn/002-dev-aggregate.md
@@ -0,0 +1,55 @@
+---
+title: "Developing an Aggregate Function"
+parent: "Develop Custom Functions"
+---
+Create a class within a Java package that implements Drill’s aggregate
+function interface, and include the required information for the function.
+Your function must use data types that Drill supports, such as int or BigInt.
+For a list of supported data types, refer to the [SQL Reference](/drill/docs/sql-reference).
+
+Complete the following steps to create an aggregate function:
+
+  1. Create a Maven project and add the following dependency:
+  
+		<dependency>
+		<groupId>org.apache.drill.exec</groupId>
+		<artifactId>drill-java-exec</artifactId>
+		<version>1.0.0-m2-incubating-SNAPSHOT</version>
+		</dependency>
+  2. Create a class that implements the `DrillAggFunc` interface and identify the scope as `FunctionTemplate.FunctionScope.POINT_AGGREGATE`.
+
+	**Example**
+	
+		@FunctionTemplate(name = "count", scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+		public static class BitCount implements DrillAggFunc{
+  3. Provide the variables used in the code in the `Param`, `Workspace`, and `Output` bit holders.
+
+	**Example**
+	
+		@Param BitHolder in;
+		@Workspace BitHolder value;
+		@Output BitHolder out;
+  4. Include the `setup()`, `add()`, `output()`, and `reset()` methods.
+	
+	**Example**
+	
+		public void setup(RecordBatch b) {
+		  value = new BitHolder();
+		  value.value = 0;
+		}
+		
+		@Override
+		public void add() {
+		  value.value++;
+		}
+		
+		@Override
+		public void output() {
+		  out.value = value.value;
+		}
+		
+		@Override
+		public void reset() {
+		  value.value = 0;
+		}
+  5. Use the maven-source-plugin to build the sources and classes JAR files. Verify that an empty `drill-module.conf` is included in the resources folder of the JARs.  
+Drill searches this module during classpath scanning. If the file is not
+included in the resources folder, you can add it to the JAR file or add it to
+`etc/drill/conf`.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/dev-custom-fcn/003-add-custom.md
----------------------------------------------------------------------
diff --git a/_docs/dev-custom-fcn/003-add-custom.md b/_docs/dev-custom-fcn/003-add-custom.md
new file mode 100644
index 0000000..1858e44
--- /dev/null
+++ b/_docs/dev-custom-fcn/003-add-custom.md
@@ -0,0 +1,26 @@
+---
+title: "Adding Custom Functions to Drill"
+parent: "Develop Custom Functions"
+---
+After you develop your custom function and generate the sources and classes
+JAR files, add both JAR files to the Drill classpath, and include the name of
+the package that contains the classes in the main Drill configuration file.
+Restart the Drillbit on each node to refresh the configuration.
+
+To add a custom function to Drill, complete the following steps:
+
+  1. Add the sources JAR file and the classes JAR file for the custom function to the Drill classpath on all nodes running a Drillbit. To add the JAR files, copy them to `<drill installation directory>/jars/3rdparty`.
+  2. On all nodes running a Drillbit, add the name of the package that contains the classes to the main Drill configuration file in the following location:
+  
+        <drill installation directory>/conf/drill-override.conf
+	To add the package, add the package name to
+	`drill.logical.function.package+=`. Separate package names with a comma.
+	
+    **Example**
+		
+		drill.logical.function.package+= ["org.apache.drill.exec.expr.fn.impl","org.apache.drill.udfs"]
+  3. On each Drill node in the cluster, navigate to the Drill installation directory, and issue the following command to restart the Drillbit:
+  
+        <drill installation directory>/bin/drillbit.sh restart
+
+     Now you can issue queries with your custom functions to Drill.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/dev-custom-fcn/004-use-custom.md
----------------------------------------------------------------------
diff --git a/_docs/dev-custom-fcn/004-use-custom.md b/_docs/dev-custom-fcn/004-use-custom.md
new file mode 100644
index 0000000..6a0245a
--- /dev/null
+++ b/_docs/dev-custom-fcn/004-use-custom.md
@@ -0,0 +1,55 @@
+---
+title: "Using Custom Functions in Queries"
+parent: "Develop Custom Functions"
+---
+When you issue a query with a custom function to Drill, Drill searches the
+classpath for the function that matches the request in the query. Once Drill
+locates the function for the request, Drill processes the query and applies
+the function during processing.
+
+Your Drill installation includes sample files in the Drill classpath. One
+sample file, `employee.json`, contains some fictitious employee data that you
+can query with a custom function.
+
+## Simple Function Example
+
+This example uses the `myaddints` simple function in a query on the
+`employee.json` file.
+
+If you issue the following query to Drill, you can see all of the employee
+data within the `employee.json` file:
+
+    0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`;
+
+The query returns the following results:
+
+	| employee_id | full_name    | first_name | last_name | position_id | position_title          | store_id | department_id | birth_da |
+	+-------------+--------------+------------+-----------+-------------+-------------------------+----------+---------------+----------+
+	| 1101        | Steve Eurich | Steve      | Eurich    | 16          | Store Temporary Checker | 12       | 16            |
+	| 1102        | Mary Pierson | Mary       | Pierson   | 16          | Store Temporary Checker | 12       | 16            |
+	| 1103        | Leo Jones    | Leo        | Jones     | 16          | Store Temporary Checker | 12       | 16            |
+	…
+
+Since the `position_id` and `store_id` columns contain integers, you can issue
+a query with the `myaddints` custom function on these columns to add the
+integers in the columns.
+
+The following query tells Drill to apply the `myaddints` function to the
+`position_id` and `store_id` columns in the `employee.json` file:
+
+    0: jdbc:drill:zk=local> SELECT myaddints(CAST(position_id AS int),CAST(store_id AS int)) FROM cp.`employee.json`;
+
+Since JSON files do not store information about data types, you must apply the
+`CAST` function in the query to tell Drill that the columns contain integer
+values.
+
+The query returns the following results:
+
+	+------------+
+	|   EXPR$0   |
+	+------------+
+	| 28         |
+	| 28         |
+	| 36         |
+	+------------+
+	…
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/dev-custom-fcn/005-cust-interface.md
----------------------------------------------------------------------
diff --git a/_docs/dev-custom-fcn/005-cust-interface.md b/_docs/dev-custom-fcn/005-cust-interface.md
new file mode 100644
index 0000000..35af922
--- /dev/null
+++ b/_docs/dev-custom-fcn/005-cust-interface.md
@@ -0,0 +1,8 @@
+---
+title: "Custom Function Interfaces"
+parent: "Develop Custom Functions"
+---
+Implement the Drill interface appropriate for the type of function that you
+want to develop. Each interface provides a set of required holders where you
+input data types that your function uses and required methods that Drill calls
+to perform your function’s operations.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/develop/001-compile.md
----------------------------------------------------------------------
diff --git a/_docs/develop/001-compile.md b/_docs/develop/001-compile.md
new file mode 100644
index 0000000..2cf6ac9
--- /dev/null
+++ b/_docs/develop/001-compile.md
@@ -0,0 +1,37 @@
+---
+title: "Compiling Drill from Source"
+parent: "Develop Drill"
+---
+## Prerequisites
+
+  * Maven 3.0.4 or later
+  * Oracle JDK 7 or later
+
+Run the following commands to verify that you have the correct versions of
+Maven and JDK installed:
+
+    java -version
+    mvn -version
+
+## 1\. Clone the Repository
+
+    git clone https://git-wip-us.apache.org/repos/asf/incubator-drill.git
+
+## 2\. Compile the Code
+
+    cd incubator-drill
+    mvn clean install -DskipTests
+
+## 3\. Explode the Tarball in the Installation Directory
+
+    mkdir ~/compiled-drill
+    tar xvzf distribution/target/*.tar.gz --strip=1 -C ~/compiled-drill
+
+Now that you have Drill installed, you can connect to Drill and query sample
+data or you can connect Drill to your data sources.
+
+  * To connect Drill to your data sources, refer to [Connect to Data Sources](/drill/docs/connect-to-data-sources) for instructions.
+  * To connect to Drill and query sample data, refer to the following topics:
+    * [Start Drill](/drill/docs/starting-stopping-drill) (for Drill installed in embedded mode)
+    * [Query Data](/drill/docs/query-data)
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/develop/002-setup.md
----------------------------------------------------------------------
diff --git a/_docs/develop/002-setup.md b/_docs/develop/002-setup.md
new file mode 100644
index 0000000..19fb554
--- /dev/null
+++ b/_docs/develop/002-setup.md
@@ -0,0 +1,5 @@
+---
+title: "Setting Up Your Development Environment"
+parent: "Develop Drill"
+---
+TBD
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/develop/003-patch-tool.md
----------------------------------------------------------------------
diff --git a/_docs/develop/003-patch-tool.md b/_docs/develop/003-patch-tool.md
new file mode 100644
index 0000000..3ef3fe5
--- /dev/null
+++ b/_docs/develop/003-patch-tool.md
@@ -0,0 +1,160 @@
+---
+title: "Drill Patch Review Tool"
+parent: "Develop Drill"
+---
+  * Drill JIRA and Reviewboard script
+    * 1\. Setup
+    * 2\. Usage
+    * 3\. Upload patch
+    * 4\. Update patch
+  * JIRA command line tool
+    * 1\. Download the JIRA command line package
+    * 2\. Configure JIRA username and password
+  * Reviewboard
+    * 1\. Install the post-review tool
+    * 2\. Configure Stuff
+  * FAQ
+    * When I run the script, it throws the following error and exits
+    * When I run the script, it throws the following error and exits
+
+### Drill JIRA and Reviewboard script
+
+#### 1\. Setup
+
+  1. Follow instructions [here](/drill/docs/drill-patch-review-tool#jira-command-line-tool) to set up the jira-python package
+  2. Follow instructions [here](/drill/docs/drill-patch-review-tool#reviewboard) to set up the reviewboard python tools
+  3. Install the argparse module 
+  
+        On Linux -> sudo yum install python-argparse
+        On Mac -> sudo easy_install argparse
+
+#### 2\. Usage
+
+	nnarkhed-mn: nnarkhed$ python drill-patch-review.py --help
+	usage: drill-patch-review.py [-h] -b BRANCH -j JIRA [-s SUMMARY]
+	                             [-d DESCRIPTION] [-r REVIEWBOARD] [-t TESTING]
+	                             [-v VERSION] [-db] -rbu REVIEWBOARDUSER -rbp REVIEWBOARDPASSWORD
+	 
+	Drill patch review tool
+	 
+	optional arguments:
+	  -h, --help            show this help message and exit
+	  -b BRANCH, --branch BRANCH
+	                        Tracking branch to create diff against
+	  -j JIRA, --jira JIRA  JIRA corresponding to the reviewboard
+	  -s SUMMARY, --summary SUMMARY
+	                        Summary for the reviewboard
+	  -d DESCRIPTION, --description DESCRIPTION
+	                        Description for reviewboard
+	  -r REVIEWBOARD, --rb REVIEWBOARD
+	                        Review board that needs to be updated
+	  -t TESTING, --testing-done TESTING
+	                        Text for the Testing Done section of the reviewboard
+	  -v VERSION, --version VERSION
+	                        Version of the patch
+	  -db, --debug          Enable debug mode
+	  -rbu, --reviewboard-user Reviewboard user name
+	  -rbp, --reviewboard-password Reviewboard password
+
+#### 3\. Upload patch
+
+  1. Specify the branch against which the patch should be created (-b)
+  2. Specify the corresponding JIRA (-j)
+  3. Specify an **optional** summary (-s) and description (-d) for the reviewboard
+
+Example:
+
+    python drill-patch-review.py -b origin/master -j DRILL-241 -rbu tnachen -rbp password
+
+#### 4\. Update patch
+
+  1. Specify the branch against which the patch should be created (-b)
+  2. Specify the corresponding JIRA (--jira)
+  3. Specify the rb to be updated (-r)
+  4. Specify an **optional** summary (-s) and description (-d) for the reviewboard, if you want to update it
+  5. Specify an **optional** version of the patch. This will be appended to the jira to create a file named JIRA-<version>.patch. The purpose is to be able to upload multiple patches to the JIRA. This has no bearing on the reviewboard update.
+
+Example:
+
+    python drill-patch-review.py -b origin/master -j DRILL-241 -r 14081 -rbu tnachen -rbp password
+
+### JIRA command line tool
+
+#### 1\. Download the JIRA command line package
+
+Install the jira-python package.
+
+    sudo easy_install jira-python
+
+#### 2\. Configure JIRA username and password
+
+Include a jira.ini file in your $HOME directory that contains your Apache JIRA
+username and password.
+
+	nnarkhed-mn:~ nnarkhed$ cat ~/jira.ini
+	user=nehanarkhede
+	password=***********
+
+### Reviewboard
+
+This is a quick tutorial on using [Review Board](https://reviews.apache.org)
+with Drill.
+
+#### 1\. Install the post-review tool
+
+If you are on RHEL, Fedora or CentOS, follow these steps:
+
+	sudo yum install python-setuptools
+	sudo easy_install -U RBTools
+
+If you are on Mac, follow these steps:
+
+	sudo easy_install -U setuptools
+	sudo easy_install -U RBTools
+
+For other platforms, follow the [instructions](http://www.reviewboard.org/docs/manual/dev/users/tools/post-review/) to
+set up the post-review tool.
+
+#### 2\. Configure Stuff
+
+Then you need to configure a few things to make it work.
+
+First set the review board url to use. You can do this from in git:
+
+    git config reviewboard.url https://reviews.apache.org
+
+If you checked out using the git wip http url, that confusingly won't work
+with review board, so you need to configure an override to use the non-http
+url. You can do this by adding a config file like this:
+
+	jkreps$ cat ~/.reviewboardrc
+	REPOSITORY = 'git://git.apache.org/incubator-drill.git'
+	TARGET_GROUPS = 'drill-git'
+	GUESS_FIELDS = True
+
+
+
+### FAQ
+
+#### When I run the script, it throws the following error and exits
+
+    nnarkhed$python drill-patch-review.py -b trunk -j DRILL-241
+    There don't seem to be any diffs
+
+There are two reasons for this:
+
+  * The code is not checked into your local branch
+  * The -b branch is not pointing to the remote branch. In the example above, "trunk" is specified as the branch, which is the local branch. The correct value for the -b (--branch) option is the remote branch. "git branch -r" gives the list of the remote branch names.
+
+#### When I run the script, it throws the following error and exits
+
+    Error uploading diff
+
+Your review request still exists, but the diff is not attached.
+
+One of the most common root causes of this error is that the git remote
+branches are not up-to-date. Since the script already updates the remote
+branches, the error is probably due to some other problem. You can run the
+script with the --debug option, which makes post-review run in debug mode and
+list the root cause of the issue.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/001-arch.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/001-arch.md b/_docs/drill-docs/001-arch.md
deleted file mode 100644
index e4b26fc..0000000
--- a/_docs/drill-docs/001-arch.md
+++ /dev/null
@@ -1,58 +0,0 @@
----
-title: "Architectural Overview"
-parent: "Apache Drill Documentation"
----
-Apache Drill is a low latency distributed query engine for large-scale
-datasets, including structured and semi-structured/nested data. Inspired by
-Google’s Dremel, Drill is designed to scale to several thousands of nodes and
-query petabytes of data at interactive speeds that BI/Analytics environments
-require.
-
-### High-Level Architecture
-
-Drill includes a distributed execution environment, purpose built for large-
-scale data processing. At the core of Apache Drill is the ‘Drillbit’ service,
-which is responsible for accepting requests from the client, processing the
-queries, and returning results to the client.
-
-A Drillbit service can be installed and run on all of the required nodes in a
-Hadoop cluster to form a distributed cluster environment. When a Drillbit runs
-on each data node in the cluster, Drill can maximize data locality during
-query execution without moving data over the network or between nodes. Drill
-uses ZooKeeper to maintain cluster membership and health-check information.
-
-Though Drill works in a Hadoop cluster environment, Drill is not tied to
-Hadoop and can run in any distributed cluster environment. The only
-prerequisite for Drill is ZooKeeper.
-
-### Query Flow in Drill
-
-The following image represents the flow of a Drill query:
-
-![](../img/queryFlow.PNG?version=1&modificationDate=1400017845000&api=v2)  
-
-The flow of a Drill query typically involves the following steps:
-
-  1. The Drill client issues a query. Any Drillbit in the cluster can accept queries from clients. There is no master-slave concept.
-  2. The Drillbit then parses the query, optimizes it, and generates an optimized distributed query plan for fast and efficient execution.
-  3. The Drillbit that accepts the query becomes the driving Drillbit node for the request. It gets a list of available Drillbit nodes in the cluster from ZooKeeper. The driving Drillbit determines the appropriate nodes to execute various query plan fragments to maximize data locality.
-  4. The Drillbit schedules the execution of query fragments on individual nodes according to the execution plan.
-  5. The individual nodes finish their execution and return data to the driving Drillbit.
-  6. The driving Drillbit returns results back to the client.
-
-### Drill Clients
-
-You can access Drill through the following interfaces:
-
-  * Drill shell (SQLLine)
-  * Drill Web UI
-  * ODBC 
-  * JDBC
-  * C++ API
-
-Click on either of the following links to continue reading about Drill's
-architecture:
-
-  * [Core Modules within a Drillbit](/confluence/display/DRILL/Core+Modules+within+a+Drillbit)
-  * [Architectural Highlights](/confluence/display/DRILL/Architectural+Highlights)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/002-tutorial.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/002-tutorial.md b/_docs/drill-docs/002-tutorial.md
deleted file mode 100644
index 597f994..0000000
--- a/_docs/drill-docs/002-tutorial.md
+++ /dev/null
@@ -1,58 +0,0 @@
----
-title: "Apache Drill Tutorial"
-parent: "Apache Drill Documentation"
----
-This tutorial uses the MapR Sandbox, which is a Hadoop environment pre-
-configured with Apache Drill.
-
-To complete the tutorial on the MapR Sandbox with Apache Drill, work through
-the following pages in order:
-
-  * [Installing the Apache Drill Sandbox](/confluence/display/DRILL/Installing+the+Apache+Drill+Sandbox)
-  * [Getting to Know the Drill Setup](/confluence/display/DRILL/Getting+to+Know+the+Drill+Setup)
-  * [Lesson 1: Learn About the Data Set](/confluence/display/DRILL/Lesson+1%3A+Learn+About+the+Data+Set)
-  * [Lesson 2: Run Queries with ANSI SQL](/confluence/display/DRILL/Lesson+2%3A+Run+Queries+with+ANSI+SQL)
-  * [Lesson 3: Run Queries on Complex Data Types](/confluence/display/DRILL/Lesson+3%3A+Run+Queries+on+Complex+Data+Types)
-  * [Summary](/confluence/display/DRILL/Summary)
-
-# About Apache Drill
-
-Drill is an Apache open-source SQL query engine for Big Data exploration.
-Drill is designed from the ground up to support high-performance analysis on
-the semi-structured and rapidly evolving data coming from modern Big Data
-applications, while still providing the familiarity and ecosystem of ANSI SQL,
-the industry-standard query language. Drill provides plug-and-play integration
-with existing Apache Hive and Apache HBase deployments. Apache Drill 0.5 offers
-the following key features:
-
-  * Low-latency SQL queries
-
-  * Dynamic queries on self-describing data in files (such as JSON, Parquet, text) and MapR-DB/HBase tables, without requiring metadata definitions in the Hive metastore.
-
-  * ANSI SQL
-
-  * Nested data support
-
-  * Integration with Apache Hive (queries on Hive tables and views, support for all Hive file formats and Hive UDFs)
-
-  * BI/SQL tool integration using standard JDBC/ODBC drivers
-
-# MapR Sandbox with Apache Drill
-
-MapR includes Apache Drill as part of the Hadoop distribution. The MapR
-Sandbox with Apache Drill is a fully functional single-node cluster that can
-be used to get an overview on Apache Drill in a Hadoop environment. Business
-and technical analysts, product managers, and developers can use the sandbox
-environment to get a feel for the power and capabilities of Apache Drill by
-performing various types of queries. Once you get a flavor for the technology,
-refer to the [Apache Drill web site](http://incubator.apache.org/drill/) and
-[Apache Drill documentation](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+Wiki)
-for more details.
-
-Note that Hadoop is not a prerequisite for Drill and users can start ramping
-up with Drill by running SQL queries directly on the local file system. Refer
-to [Apache Drill in 10 minutes](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes)
-for an introduction to using Drill in local (embedded) mode.
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/003-yelp.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/003-yelp.md b/_docs/drill-docs/003-yelp.md
deleted file mode 100644
index b9339ed..0000000
--- a/_docs/drill-docs/003-yelp.md
+++ /dev/null
@@ -1,402 +0,0 @@
----
-title: "Analyzing Yelp JSON Data with Apache Drill"
-parent: "Apache Drill Documentation"
----
-[Apache Drill](https://www.mapr.com/products/apache-drill) is one of the
-fastest growing open source projects, with the community making rapid progress
-with monthly releases. What sets Drill apart is its agility and flexibility.
-Along with meeting the table stakes for SQL-on-Hadoop, which is to achieve low
-latency performance at scale, Drill allows users to analyze the data without
-any ETL or up-front schema definitions. The data could be in any file format
-such as text, JSON, or Parquet. Data could have simple types such as string,
-integer, dates, or more complex multi-structured data, such as nested maps and
-arrays. Data can exist in any file system, local or distributed, such as HDFS,
-[MapR FS](https://www.mapr.com/blog/comparing-mapr-fs-and-hdfs-nfs-and-
-snapshots), or S3. Drill’s “no schema” approach enables you to get value from
-your data in just a few minutes.
-
-Let’s quickly walk through the steps required to install Drill and run it
-against the Yelp data set. The publicly available data set used for this
-example is downloadable from [Yelp](http://www.yelp.com/dataset_challenge)
-(business reviews) and is in JSON format.
-
-## Installing and Starting Drill
-
-### Step 1: Download Apache Drill onto your local machine
-
-[http://incubator.apache.org/drill/download/](http://incubator.apache.org/drill/download/)
-
-You can also [deploy Drill in clustered mode](https://cwiki.apache.org/conflue
-nce/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment) if you
-want to scale your environment.
-
-### Step 2: Extract the Drill tar file
-
-`tar -xvf apache-drill-0.6.0-incubating.tar`
-
-### Step 3: Launch sqlline, a JDBC application that ships with Drill
-
-`bin/sqlline -u jdbc:drill:zk=local`
-
-That’s it! You are now ready to explore the data.
-
-Let’s try out some SQL examples to see how Drill makes raw data analysis
-extremely easy.
-
-**Note**: You need to substitute your local path to the Yelp data set in the FROM clause of each query you run.
-
-## Querying Data with Drill
-
-### **1\. View the contents of the Yelp business data**
-
-`0: jdbc:drill:zk=local> !set maxwidth 10000`
-
-``0: jdbc:drill:zk=local> select * from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-limit 1;``
-
-    +-------------+--------------+------------+------------+------------+------------+--------------+------------+------------+------------+------------+------------+------------+------------+---------------+
-    | business_id | full_address |   hours    |    open    | categories |    city    | review_count |    name    | longitude  |   state    |   stars    |  latitude  | attributes |    type    | neighborhoods |
-    +-------------+--------------+------------+------------+------------+------------+--------------+------------+------------+------------+------------+------------+------------+------------+---------------+
-    | vcNAWiLM4dR7D2nwwJ7nCA | 4840 E Indian School Rd
-    Ste 101
-    Phoenix, AZ 85018 | {"Tuesday":{"close":"17:00","open":"08:00"},"Friday":{"close":"17:00","open":"08:00"},"Monday":{"close":"17:00","open":"08:00"},"Wednesday":{"close":"17:00","open":"08:00"},"Thursday":{"close":"17:00","open":"08:00"},"Sunday":{},"Saturday":{}} | true              | ["Doctors","Health & Medical"] | Phoenix  | 7                   | Eric Goldberg, MD | -111.983758 | AZ       | 3.5                | 33.499313  | {"By Appointment Only":true,"Good For":{},"Ambience":{},"Parking":{},"Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} | business   | []                  |
-    +-------------+--------------+------------+------------+------------+------------+--------------+------------+------------+------------+------------+------------+------------+------------+---------------+
-
-**Note:** You can directly query self-describing files such as JSON, Parquet, and text. There is no need to create metadata definitions in the Hive metastore.
-
-### **2\. Explore the business data set further**
-
-#### Total reviews in the data set
-
-``0: jdbc:drill:zk=local> select sum(review_count) as totalreviews from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-;``
-
-    +--------------+
-    | totalreviews |
-    +--------------+
-    | 1236445      |
-    +--------------+
-
-#### Top states and cities in total number of reviews
-
-``0: jdbc:drill:zk=local> select state, city, count(*) totalreviews from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-group by state, city order by count(*) desc limit 10;``
-
-    +------------+------------+--------------+
-    |   state    |    city    | totalreviews |
-    +------------+------------+--------------+
-    | NV         | Las Vegas  | 12021        |
-    | AZ         | Phoenix    | 7499         |
-    | AZ         | Scottsdale | 3605         |
-    | EDH        | Edinburgh  | 2804         |
-    | AZ         | Mesa       | 2041         |
-    | AZ         | Tempe      | 2025         |
-    | NV         | Henderson  | 1914         |
-    | AZ         | Chandler   | 1637         |
-    | WI         | Madison    | 1630         |
-    | AZ         | Glendale   | 1196         |
-    +------------+------------+--------------+
-
-#### **Average number of reviews per business star rating**
-
-``0: jdbc:drill:zk=local> select stars,trunc(avg(review_count)) reviewsavg from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-group by stars order by stars desc;``
-
-    +------------+------------+
-    |   stars    | reviewsavg |
-    +------------+------------+
-    | 5.0        | 8.0        |
-    | 4.5        | 28.0       |
-    | 4.0        | 48.0       |
-    | 3.5        | 35.0       |
-    | 3.0        | 26.0       |
-    | 2.5        | 16.0       |
-    | 2.0        | 11.0       |
-    | 1.5        | 9.0        |
-    | 1.0        | 4.0        |
-    +------------+------------+
-
-#### **Top businesses with high review counts (> 1000)**
-
-``0: jdbc:drill:zk=local> select name, state, city, `review_count` from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-where review_count > 1000 order by `review_count` desc limit 10;``
-
-    +-------------------------------+-------+-----------+--------------+
-    |             name              | state |   city    | review_count |
-    +-------------------------------+-------+-----------+--------------+
-    | Mon Ami Gabi                  | NV    | Las Vegas | 4084         |
-    | Earl of Sandwich              | NV    | Las Vegas | 3655         |
-    | Wicked Spoon                  | NV    | Las Vegas | 3408         |
-    | The Buffet                    | NV    | Las Vegas | 2791         |
-    | Serendipity 3                 | NV    | Las Vegas | 2682         |
-    | Bouchon                       | NV    | Las Vegas | 2419         |
-    | The Buffet at Bellagio        | NV    | Las Vegas | 2404         |
-    | Bacchanal Buffet              | NV    | Las Vegas | 2369         |
-    | The Cosmopolitan of Las Vegas | NV    | Las Vegas | 2253         |
-    | Aria Hotel & Casino           | NV    | Las Vegas | 2224         |
-    +-------------------------------+-------+-----------+--------------+
-
-#### **Saturday open and close times for a few businesses**
-
-``0: jdbc:drill:zk=local> select b.name, b.hours.Saturday.`open`,
-b.hours.Saturday.`close`  
-from
-dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`
-b limit 10;``
-
-    +-----------------------------+------------+------------+
-    |            name             |   EXPR$1   |   EXPR$2   |
-    +-----------------------------+------------+------------+
-    | Eric Goldberg, MD           | 08:00      | 17:00      |
-    | Pine Cone Restaurant        | null       | null       |
-    | Deforest Family Restaurant  | 06:00      | 22:00      |
-    | Culver's                    | 10:30      | 22:00      |
-    | Chang Jiang Chinese Kitchen | 11:00      | 22:00      |
-    | Charter Communications      | null       | null       |
-    | Air Quality Systems         | null       | null       |
-    | McFarland Public Library    | 09:00      | 20:00      |
-    | Green Lantern Restaurant    | 06:00      | 02:00      |
-    | Spartan Animal Hospital     | 07:30      | 18:00      |
-    +-----------------------------+------------+------------+
-
-**Note:** Drill can traverse and reach into multiple levels of nesting.
-
-### **3\. Get the amenities of each business in the data set**
-
-Note that the attributes column in the Yelp business data set contains a
-different set of elements in every row, reflecting that each business can
-offer a different set of amenities. Drill makes it easy to quickly access
-data sets with changing schemas.
-
-First, change Drill to work in all text mode (so we can take a look at all of
-the data).
-
-    0: jdbc:drill:zk=local> alter system set `store.json.all_text_mode` = true;
-    +------------+-----------------------------------+
-    |     ok     |  summary                          |
-    +------------+-----------------------------------+
-    | true       | store.json.all_text_mode updated. |
-    +------------+-----------------------------------+
-
-Then, query the attributes column.
-
-    0: jdbc:drill:zk=local> select attributes from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` limit 10;
-    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | attributes                                                                                                                                                                       |
-    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | {"By Appointment Only":"true","Good For":{},"Ambience":{},"Parking":{},"Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} |
-    | {"Take-out":"true","Good For":{"dessert":"false","latenight":"false","lunch":"true","dinner":"false","breakfast":"false","brunch":"false"},"Caters":"false","Noise Level":"averag |
-    | {"Take-out":"true","Good For":{"dessert":"false","latenight":"false","lunch":"false","dinner":"false","breakfast":"false","brunch":"true"},"Caters":"false","Noise Level":"quiet" |
-    | {"Take-out":"true","Good For":{},"Takes Reservations":"false","Delivery":"false","Ambience":{},"Parking":{"garage":"false","street":"false","validated":"false","lot":"true","val |
-    | {"Take-out":"true","Good For":{},"Ambience":{},"Parking":{},"Has TV":"false","Outdoor Seating":"false","Attire":"casual","Music":{},"Hair Types Specialized In":{},"Payment Types |
-    | {"Good For":{},"Ambience":{},"Parking":{},"Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} |
-    | {"Good For":{},"Ambience":{},"Parking":{},"Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} |
-    | {"Good For":{},"Ambience":{},"Parking":{},"Wi-Fi":"free","Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} |
-    | {"Take-out":"true","Good For":{"dessert":"false","latenight":"false","lunch":"false","dinner":"true","breakfast":"false","brunch":"false"},"Noise Level":"average","Takes Reserva |
-    | {"Good For":{},"Ambience":{},"Parking":{},"Music":{},"Hair Types Specialized In":{},"Payment Types":{},"Dietary Restrictions":{}} |
-    +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
-Turn off all text mode so that we can continue to perform arithmetic
-operations on the data.
-
-    0: jdbc:drill:zk=local> alter system set `store.json.all_text_mode` = false;
-    +------------+-----------------------------------+
-    |     ok     |              summary              |
-    +------------+-----------------------------------+
-    | true       | store.json.all_text_mode updated. |
-    +------------+-----------------------------------+
-
-### **4\. Explore the restaurant businesses in the data set**
-
-#### **Number of restaurants in the data set**
-
-    0: jdbc:drill:zk=local> select count(*) as TotalRestaurants from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` where true=repeated_contains(categories,'Restaurants');
-    +------------------+
-    | TotalRestaurants |
-    +------------------+
-    | 14303            |
-    +------------------+
-
-#### **Top restaurants in number of reviews**
-
-    0: jdbc:drill:zk=local> select name,state,city,`review_count` from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` where true=repeated_contains(categories,'Restaurants') order by `review_count` desc limit 10
-    . . . . . . . . . . . > ;
-    +------------------------+-------+-----------+--------------+
-    |          name          | state |   city    | review_count |
-    +------------------------+-------+-----------+--------------+
-    | Mon Ami Gabi           | NV    | Las Vegas | 4084         |
-    | Earl of Sandwich       | NV    | Las Vegas | 3655         |
-    | Wicked Spoon           | NV    | Las Vegas | 3408         |
-    | The Buffet             | NV    | Las Vegas | 2791         |
-    | Serendipity 3          | NV    | Las Vegas | 2682         |
-    | Bouchon                | NV    | Las Vegas | 2419         |
-    | The Buffet at Bellagio | NV    | Las Vegas | 2404         |
-    | Bacchanal Buffet       | NV    | Las Vegas | 2369         |
-    | Hash House A Go Go     | NV    | Las Vegas | 2201         |
-    | Mesa Grill             | NV    | Las Vegas | 2004         |
-    +------------------------+-------+-----------+--------------+
-
-#### **Top restaurants in number of listed categories**
-
-    0: jdbc:drill:zk=local> select name,repeated_count(categories) as categorycount, categories from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` where true=repeated_contains(categories,'Restaurants') order by repeated_count(categories) desc limit 10;
-    +---------------------------------+---------------+------------+
-    |              name               | categorycount | categories |
-    +---------------------------------+---------------+------------+
-    | Binion's Hotel & Casino         | 10            | ["Arts & Entertainment","Restaurants","Bars","Casinos","Event Planning & Services","Lounges","Nightlife","Hotels & Travel","American (N |
-    | Stage Deli                      | 10            | ["Arts & Entertainment","Food","Hotels","Desserts","Delis","Casinos","Sandwiches","Hotels & Travel","Restaurants","Event Planning & Services"] |
-    | Jillian's                       | 9             | ["Arts & Entertainment","American (Traditional)","Music Venues","Bars","Dance Clubs","Nightlife","Bowling","Active Life","Restaurants"] |
-    | Hotel Chocolat                  | 9             | ["Coffee & Tea","Food","Cafes","Chocolatiers & Shops","Specialty Food","Event Planning & Services","Hotels & Travel","Hotels","Restaurants"] |
-    | Hotel du Vin & Bistro Edinburgh | 9             | ["Modern European","Bars","French","Wine Bars","Event Planning & Services","Nightlife","Hotels & Travel","Hotels","Restaurants" |
-    | Elixir                          | 9             | ["Arts & Entertainment","American (Traditional)","Music Venues","Bars","Cocktail Bars","Nightlife","American (New)","Local Flavor","Restaurants"] |
-    | Tocasierra Spa and Fitness      | 8             | ["Beauty & Spas","Gyms","Medical Spas","Health & Medical","Fitness & Instruction","Active Life","Day Spas","Restaurants"] |
-    | Costa Del Sol At Sunset Station | 8             | ["Steakhouses","Mexican","Seafood","Event Planning & Services","Hotels & Travel","Italian","Restaurants","Hotels"] |
-    | Scottsdale Silverado Golf Club  | 8             | ["Fashion","Shopping","Sporting Goods","Active Life","Golf","American (New)","Sports Wear","Restaurants"] |
-    | House of Blues                  | 8             | ["Arts & Entertainment","Music Venues","Restaurants","Hotels","Event Planning & Services","Hotels & Travel","American (New)","Nightlife"] |
-    +---------------------------------+---------------+------------+
-
-#### **Top first categories in number of review counts**
-
-    0: jdbc:drill:zk=local> select categories[0], count(categories[0]) as categorycount from dfs.`/users/nrentachintala/Downloads/yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_business.json` group by categories[0] 
-    order by count(categories[0]) desc limit 10;
-    +----------------------+---------------+
-    |        EXPR$0        | categorycount |
-    +----------------------+---------------+
-    | Food                 | 4294          |
-    | Shopping             | 1885          |
-    | Active Life          | 1676          |
-    | Bars                 | 1366          |
-    | Local Services       | 1351          |
-    | Mexican              | 1284          |
-    | Hotels & Travel      | 1283          |
-    | Fast Food            | 963           |
-    | Arts & Entertainment | 906           |
-    | Hair Salons          | 901           |
-    +----------------------+---------------+
-
-### **5\. Explore the Yelp reviews data set and combine it with the business data**
-
-#### **Take a look at the contents of the Yelp reviews data set**
-
-    0: jdbc:drill:zk=local> select * from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_review.json` limit 1;
-    +------------+------------+------------+------------+------------+------------+------------+-------------+
-    |   votes    |  user_id   | review_id  |   stars    |    date    |    text    |    type    | business_id |
-    +------------+------------+------------+------------+------------+------------+------------+-------------+
-    | {"funny":0,"useful":2,"cool":1} | Xqd0DzHaiyRqVH3WRG7hzg | 15SdjuK7DmYqUAj6rjGowg | 5            | 2007-05-17 | dr. goldberg offers everything i look for in a general practitioner.  he's nice and easy to talk to without being patronizing; he's always on time in seeing his patients; he's affiliated with a top-notch hospital (nyu) which my parents have explained to me is very important in case something happens and you need surgery; and you can get referrals to see specialists without having to see him first.  really, what more do you need?  i'm sitting here trying to think of any complaints i have about him, but i'm really drawing a blank. | review | vcNAWiLM4dR7D2nwwJ7nCA |
-    +------------+------------+------------+------------+------------+------------+------------+-------------+
-
-#### **Top businesses with cool rated reviews**
-
-Note that we are joining the Yelp business data set, which has the overall
-review_count for each business, with the Yelp review data set, which holds
-additional details on each of the reviews themselves.
-
-    0: jdbc:drill:zk=local> Select b.name from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` b where b.business_id in (SELECT r.business_id FROM dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_review.json` r
-    GROUP BY r.business_id having sum(r.votes.cool) > 2000 order by sum(r.votes.cool)  desc);
-    +-------------------------------+
-    |             name              |
-    +-------------------------------+
-    | Earl of Sandwich              |
-    | XS Nightclub                  |
-    | The Cosmopolitan of Las Vegas |
-    | Wicked Spoon                  |
-    +-------------------------------+
-
-#### **Create a view with the combined business and reviews data sets**
-
-Note that Drill views are lightweight and can be created in the local file
-system. Drill in standalone mode comes with a dfs.tmp workspace, which we can
-use to create views (or you can define your own workspaces on a local or
-distributed file system). If you want to persist the data physically instead
-of in a logical view, you can use CREATE TABLE AS SELECT syntax.
-
-    0: jdbc:drill:zk=local> create or replace view dfs.tmp.businessreviews as Select b.name,b.stars,b.state,b.city,r.votes.funny,r.votes.useful,r.votes.cool, r.`date` from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` b , dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_review.json` r where r.business_id=b.business_id
-    +------------+------------------------------------------------------------------+
-    |     ok     |                             summary                              |
-    +------------+------------------------------------------------------------------+
-    | true       | View 'businessreviews' created successfully in 'dfs.tmp' schema  |
-    +------------+------------------------------------------------------------------+
-
-Let’s get the total number of records from the view.
-
-    0: jdbc:drill:zk=local> select count(*) as Total from dfs.tmp.businessreviews;
-    +------------+
-    |   Total    |
-    +------------+
-    | 1125458    |
-    +------------+
-
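-If you want to persist the combined result physically rather than as a logical
-view, a CREATE TABLE AS SELECT (CTAS) version of the same statement might look
-like the following sketch (the target table name is illustrative, and the
-dfs.tmp workspace must be writable):
-
-    0: jdbc:drill:zk=local> create table dfs.tmp.businessreviews_tbl as
-    select b.name, b.stars, b.state, b.city, r.votes.funny, r.votes.useful, r.votes.cool, r.`date`
-    from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json` b,
-    dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_review.json` r
-    where r.business_id = b.business_id;
-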
-In addition to these queries, you can gain many deeper insights using Drill’s
-[SQL functionality](https://cwiki.apache.org/confluence/display/DRILL/SQL+Reference).
-If you are not comfortable writing queries manually, you can use BI/analytics
-tools such as Tableau or MicroStrategy to query raw files, Hive/HBase data, or
-Drill-created views directly, using the Drill ODBC/JDBC drivers.
-
-The goal of Apache Drill is to provide the freedom and flexibility to explore
-data in ways not previously possible with SQL technologies. The community is
-working on more exciting features around nested data and support for data
-with changing schemas in upcoming releases.
-
-As an example, a new FLATTEN function is in development (an upcoming feature
-in 0.7). This function dynamically flattens semi-structured data so that you
-can apply even deeper SQL functionality. Here is a sample query:
-
-#### **Get a flattened list of categories for each business**
-
-    0: jdbc:drill:zk=local> select name, flatten(categories) as category from dfs.`/users/nrentachintala/Downloads/yelp/yelp_academic_dataset_business.json`  limit 20;
-    +-----------------------------+---------------------------------+
-    |            name             |            category             |
-    +-----------------------------+---------------------------------+
-    | Eric Goldberg, MD           | Doctors                         |
-    | Eric Goldberg, MD           | Health & Medical                |
-    | Pine Cone Restaurant        | Restaurants                     |
-    | Deforest Family Restaurant  | American (Traditional)          |
-    | Deforest Family Restaurant  | Restaurants                     |
-    | Culver's                    | Food                            |
-    | Culver's                    | Ice Cream & Frozen Yogurt       |
-    | Culver's                    | Fast Food                       |
-    | Culver's                    | Restaurants                     |
-    | Chang Jiang Chinese Kitchen | Chinese                         |
-    | Chang Jiang Chinese Kitchen | Restaurants                     |
-    | Charter Communications      | Television Stations             |
-    | Charter Communications      | Mass Media                      |
-    | Air Quality Systems         | Home Services                   |
-    | Air Quality Systems         | Heating & Air Conditioning/HVAC |
-    | McFarland Public Library    | Libraries                       |
-    | McFarland Public Library    | Public Services & Government    |
-    | Green Lantern Restaurant    | American (Traditional)          |
-    | Green Lantern Restaurant    | Restaurants                     |
-    | Spartan Animal Hospital     | Veterinarians                   |
-    +-----------------------------+---------------------------------+
-
-#### **Top categories used in business reviews**
-
-    0: jdbc:drill:zk=local> select celltbl.catl, count(celltbl.catl) categorycnt from (select flatten(categories) catl from dfs.`/users/nrentachintala/Downloads/yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_business.json` )  celltbl group by celltbl.catl order by count(celltbl.catl) desc limit 10 ;
-    +------------------+-------------+
-    |       catl       | categorycnt |
-    +------------------+-------------+
-    | Restaurants      | 14303       |
-    | Shopping         | 6428        |
-    | Food             | 5209        |
-    | Beauty & Spas    | 3421        |
-    | Nightlife        | 2870        |
-    | Bars             | 2378        |
-    | Health & Medical | 2351        |
-    | Automotive       | 2241        |
-    | Home Services    | 1957        |
-    | Fashion          | 1897        |
-    +------------------+-------------+
-
-Stay tuned for more features and upcoming activities in the Drill community.
-
-To learn more about Drill, please refer to the following resources:
-
-  * Download Drill here: <http://incubator.apache.org/drill/download/>
-  * 10 reasons we think Drill is cool: <http://incubator.apache.org/drill/why-drill/>
-  * A simple 10-minute tutorial: <https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes>
-  * A more comprehensive tutorial: <https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+Tutorial>
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/004-install.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/004-install.md b/_docs/drill-docs/004-install.md
deleted file mode 100644
index fe7578c..0000000
--- a/_docs/drill-docs/004-install.md
+++ /dev/null
@@ -1,20 +0,0 @@
----
-title: "Install Drill"
-parent: "Apache Drill Documentation"
----
-You can install Drill in embedded mode or in distributed mode. Installing
-Drill in embedded mode does not require any configuration, which means that
-you can quickly get started with Drill. If you want to use Drill in a
-clustered Hadoop environment, you can install Drill in distributed mode.
-Installing in distributed mode requires some configuration; however, once
-installed, you can connect Drill to your Hive, HBase, or distributed file
-system data sources and run queries on them.
-
-Click on any of the following links for more information about how to install
-Drill in embedded or distributed mode:
-
-  * [Apache Drill in 10 Minutes](/confluence/display/DRILL/Apache+Drill+in+10+Minutes)
-  * [Deploying Apache Drill in a Clustered Environment](/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment)
-  * [Installing Drill in Embedded Mode](/confluence/display/DRILL/Installing+Drill+in+Embedded+Mode)
-  * [Installing Drill in Distributed Mode](/confluence/display/DRILL/Installing+Drill+in+Distributed+Mode)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/005-connect.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/005-connect.md b/_docs/drill-docs/005-connect.md
deleted file mode 100644
index 039fc78..0000000
--- a/_docs/drill-docs/005-connect.md
+++ /dev/null
@@ -1,49 +0,0 @@
----
-title: "Connect to Data Sources"
-parent: "Apache Drill Documentation"
----
-Apache Drill serves as a query layer that connects to data sources through
-storage plugins. Drill uses the storage plugins to interact with data sources.
-You can think of a storage plugin as a connection between Drill and a data
-source.
-
-The following image represents the storage plugin layer between Drill and a
-data source:
-
-![](../img/storageplugin.png)
-
-Storage plugins provide the following information to Drill:
-
-  * Metadata available in the underlying data source
-  * Location of data
-  * Interfaces that Drill can use to read from and write to data sources
-  * A set of storage plugin optimization rules that assist with efficient and faster execution of Drill queries, such as pushdowns, statistics, and partition awareness
-
-Storage plugins perform scanner and writer functions, and inform the metadata
-repository of any known metadata, such as:
-
-  * Schema
-  * File size
-  * Data ordering
-  * Secondary indices
-  * Number of blocks
-
-Storage plugins inform the execution engine of any native capabilities, such
-as predicate pushdown, joins, and SQL.
-
-Drill provides storage plugins for files and HBase/M7. Drill also integrates
-with Hive through a storage plugin. Hive provides a metadata abstraction layer
-on top of files and HBase/M7.
-
-When you run Drill to query files in HBase/M7, Drill can perform direct
-queries on the data or go through Hive, if you have metadata defined there.
-Drill integrates with the Hive metastore for metadata and also uses a Hive
-SerDe for the deserialization of records. Drill does not invoke the Hive
-execution engine for any requests.
-
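-As a concrete illustration, a file-system storage plugin is registered as a
-small JSON document. The following is a hedged sketch of what a dfs plugin
-configuration might look like (the workspace name and location are examples
-only, not required values):
-
-    {
-      "type": "file",
-      "connection": "file:///",
-      "workspaces": {
-        "tmp": {
-          "location": "/tmp",
-          "writable": true
-        }
-      },
-      "formats": {
-        "json": {
-          "type": "json"
-        }
-      }
-    }
-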
-For information about how to connect Drill to your data sources, refer to
-storage plugin registration:
-
-  * [Storage Plugin Registration](/confluence/display/DRILL/Storage+Plugin+Registration)
-  * [MongoDB Plugin for Apache Drill](/confluence/display/DRILL/MongoDB+Plugin+for+Apache+Drill)
-  * [MapR-DB Plugin for Apache Drill](/confluence/display/DRILL/MapR-DB+Plugin+for+Apache+Drill)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/006-query.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/006-query.md b/_docs/drill-docs/006-query.md
deleted file mode 100644
index 4b4fda0..0000000
--- a/_docs/drill-docs/006-query.md
+++ /dev/null
@@ -1,57 +0,0 @@
----
-title: "Query Data"
-parent: "Apache Drill Documentation"
----
-You can query local and distributed file systems, Hive, and HBase data sources
-registered with Drill. If you connected directly to a particular schema when
-you invoked SQLLine, you can issue SQL queries against that schema. If you did
-not indicate a schema when you invoked SQLLine, you can issue the `USE
-<schema>` statement to run your queries against a particular schema. After you
-issue the `USE` statement, you can use absolute notation, such as
-`schema.table.column`.
-
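-For example, the following sketch sets a default schema and then issues the
-same query in shorthand and in absolute notation (the workspace and file
-names are illustrative):
-
-    0: jdbc:drill:zk=local> use dfs.default;
-    0: jdbc:drill:zk=local> select * from `sample_data/my_sample.json`;
-    0: jdbc:drill:zk=local> select * from dfs.default.`sample_data/my_sample.json`;
-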
-Click on any of the following links for information about various data source
-queries and examples:
-
-  * [Querying a File System](/confluence/display/DRILL/Querying+a+File+System)
-  * [Querying HBase](/confluence/display/DRILL/Querying+HBase)
-  * [Querying Hive](/confluence/display/DRILL/Querying+Hive)
-  * [Querying Complex Data](/confluence/display/DRILL/Querying+Complex+Data)
-  * [Querying the INFORMATION_SCHEMA](/confluence/display/DRILL/Querying+the+INFORMATION_SCHEMA)
-  * [Querying System Tables](/confluence/display/DRILL/Querying+System+Tables)
-  * [Drill Interfaces](/confluence/display/DRILL/Drill+Interfaces)
-
-You may need to use casting functions in some queries. For example, you may
-have to cast a string `"100"` to an integer in order to apply a math function
-or an aggregate function.
-
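-For example, a delimited text file exposes its fields as strings in the
-`columns` array, so a sketch of such a cast might look like the following
-(the file name is illustrative):
-
-    0: jdbc:drill:zk=local> select cast(columns[0] as int) + 10 from dfs.default.`sample_data/my_sample.csv`;
-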
-You can use the EXPLAIN command to analyze errors and troubleshoot queries
-that do not run. For example, if you run into a casting error, the query plan
-text may help you isolate the problem.
-
-    0: jdbc:drill:zk=local> !set maxwidth 10000
-    0: jdbc:drill:zk=local> explain plan for select ... ;
-
-The set command increases the default text display (number of characters). By
-default, most of the plan output is hidden.
-
-You may see errors if you try to use non-standard or unsupported SQL syntax in
-a query.
-
-Remember the following tips when querying data with Drill:
-
-  * Include a semicolon at the end of SQL statements, except when you issue a command with an exclamation point (`!`).  
-Example: `!set maxwidth 10000`
-
-  * Use backticks around file and directory names that contain special characters, and also around reserved words, when you query a file system.  
-The following special characters require backticks:
-
-    * . (period)
-    * / (forward slash)
-    * _ (underscore)
-
-Example: ``SELECT * FROM dfs.default.`sample_data/my_sample.json`; ``
-
-  * `CAST` data to `VARCHAR` if an expression in a query returns `VARBINARY` as the result type in order to view the `VARBINARY` types as readable data. If you do not use the `CAST` function, Drill returns the results as byte data.  
-Example: `CAST (VARBINARY_expr as VARCHAR(50))`
-
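-Applied in a full query, that pattern might look like the following sketch
-(the table and column names are hypothetical):
-
-    SELECT CAST(t.row_key AS VARCHAR(50)) AS readable_key
-    FROM hbase.`students` t;
-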

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/006-sql-ref.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/006-sql-ref.md b/_docs/drill-docs/006-sql-ref.md
deleted file mode 100644
index 8818ca3..0000000
--- a/_docs/drill-docs/006-sql-ref.md
+++ /dev/null
@@ -1,25 +0,0 @@
----
-title: "SQL Reference"
-parent: "Apache Drill Documentation"
----
-Drill supports the ANSI standard for SQL. You can use SQL to query your Hive,
-HBase, and distributed file system data sources. Drill can discover the form
-of the data when you submit a query. You can query text files and nested data
-formats, such as JSON and Parquet. Drill provides special operators and
-functions that you can use to _drill down_ into nested data formats.
-
-Drill queries do not require information about the data that you are trying to
-access, regardless of its source system or its schema and data types. The
-sweet spot for Apache Drill is a SQL query workload against "complex data":
-data made up of various types of records and fields, rather than data in a
-recognizable relational form (discrete rows and columns).
-
-Refer to the following SQL reference pages for more information:
-
-  * [Data Types](/confluence/display/DRILL/Data+Types)
-  * [Operators](/confluence/display/DRILL/Operators)
-  * [SQL Functions](/confluence/display/DRILL/SQL+Functions)
-  * [Nested Data Functions](/confluence/display/DRILL/Nested+Data+Functions)
-  * [SQL Commands Summary](/confluence/display/DRILL/SQL+Commands+Summary)
-  * [Reserved Keywords](/confluence/display/DRILL/Reserved+Keywords)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/007-dev-custom-func.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/007-dev-custom-func.md b/_docs/drill-docs/007-dev-custom-func.md
deleted file mode 100644
index 9bc8e65..0000000
--- a/_docs/drill-docs/007-dev-custom-func.md
+++ /dev/null
@@ -1,47 +0,0 @@
----
-title: "Develop Custom Functions"
-parent: "Apache Drill Documentation"
----
-
-Drill provides a high performance Java API with interfaces that you can
-implement to develop simple and aggregate custom functions. Custom functions
-are reusable SQL functions that you develop in Java to encapsulate code that
-processes column values during a query. Custom functions can perform
-calculations and transformations that built-in SQL operators and functions do
-not provide. Custom functions are called from within a SQL statement, like a
-regular function, and return a single value.
-
-### Simple Function
-
-A simple function operates on a single row and produces a single row as the
-output. When you include a simple function in a query, the function is called
-once for each row in the result set. Mathematical and string functions are
-examples of simple functions.
-
-### Aggregate Function
-
-Aggregate functions differ from simple functions in the number of rows that
-they accept as input. An aggregate function operates on multiple input rows
-and produces a single row as output. The COUNT(), MAX(), SUM(), and AVG()
-functions are examples of aggregate functions. You can use an aggregate
-function in a query with a GROUP BY clause to produce a result set with a
-separate aggregate value for each combination of values from the GROUP BY
-clause.
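-
-For example, a query such as the following (against a hypothetical `orders` table with `order_status` and `order_total` columns) returns one aggregate row per `GROUP BY` value:
-
-    SELECT order_status, COUNT(*) AS order_count, SUM(order_total) AS total
-    FROM orders
-    GROUP BY order_status;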
-
-### Process
-
-To develop custom functions that you can use in your Drill queries, you must
-complete the following tasks:
-
-  1. Create a Java program that implements Drill’s simple or aggregate interface, and compile it into a sources JAR file and a classes JAR file.
-  2. Add the sources and classes JAR files to Drill’s classpath.
-  3. Add the name of the package that contains the classes to Drill’s main configuration file, drill-override.conf. 
-
-Click on one of the following links to learn how to create custom functions
-for Drill:
-
-  * [Developing a Simple Function](/confluence/display/DRILL/Developing+a+Simple+Function)
-  * [Developing an Aggregate Function](/confluence/display/DRILL/Developing+an+Aggregate+Function)
-  * [Adding Custom Functions to Drill](/confluence/display/DRILL/Adding+Custom+Functions+to+Drill)
-  * [Using Custom Functions in Queries](/confluence/display/DRILL/Using+Custom+Functions+in+Queries)
-  * [Custom Function Interfaces](/confluence/display/DRILL/Custom+Function+Interfaces)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/008-manage.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/008-manage.md b/_docs/drill-docs/008-manage.md
deleted file mode 100644
index e629b20..0000000
--- a/_docs/drill-docs/008-manage.md
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title: "Manage Drill"
-parent: "Apache Drill Documentation"
----
-When using Drill, you may need to stop and restart a Drillbit on a node, or
-modify various options. For example, the default storage format for CTAS
-statements is Parquet. You can modify the default setting so that output data
-is stored in CSV or JSON format.
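-
-As a sketch of that example, the default CTAS output format can be changed for a session with the `store.format` option; the output table name below is hypothetical, and `cp.`​`employee.json` is the sample file bundled with Drill:
-
-    0: jdbc:drill:zk=local> ALTER SESSION SET `store.format` = 'json';
-    0: jdbc:drill:zk=local> CREATE TABLE dfs.tmp.`my_output` AS SELECT * FROM cp.`employee.json`;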
-
-You can use certain SQL commands to manage Drill from within the Drill shell
-(SQLLine). You can also modify Drill configuration options, such as memory
-allocation, in Drill's configuration files.
-
-Refer to the following documentation for information about managing Drill in
-your cluster:
-
-  * [Configuration Options](/confluence/display/DRILL/Configuration+Options)
-  * [Starting/Stopping Drill](/confluence/pages/viewpage.action?pageId=44994063)
-  * [Ports Used by Drill](/confluence/display/DRILL/Ports+Used+by+Drill)
-  * [Partition Pruning](/confluence/display/DRILL/Partition+Pruning)
-  * [Monitoring and Canceling Queries in the Drill Web UI](/confluence/display/DRILL/Monitoring+and+Canceling+Queries+in+the+Drill+Web+UI)
-  
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/009-develop.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/009-develop.md b/_docs/drill-docs/009-develop.md
deleted file mode 100644
index d95f986..0000000
--- a/_docs/drill-docs/009-develop.md
+++ /dev/null
@@ -1,16 +0,0 @@
----
-title: "Develop Drill"
-parent: "Apache Drill Documentation"
----
-To develop Drill, you compile Drill from source code and then set up a project
-in Eclipse for use as your development environment. To review or contribute to
-Drill code, you must complete the steps required to install and use the Drill
-patch review tool.
-
-For information about contributing to the Apache Drill project, you can refer
-to the following pages:
-
-  * [Compiling Drill from Source](/confluence/display/DRILL/Compiling+Drill+from+Source)
-  * [Setting Up Your Development Environment](/confluence/display/DRILL/Setting+Up+Your+Development+Environment)
-  * [Drill Patch Review Tool](/confluence/display/DRILL/Drill+Patch+Review+Tool)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/010-rn.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/010-rn.md b/_docs/drill-docs/010-rn.md
deleted file mode 100644
index f196714..0000000
--- a/_docs/drill-docs/010-rn.md
+++ /dev/null
@@ -1,192 +0,0 @@
----
-title: "Release Notes"
-parent: "Apache Drill Documentation"
----
-## Apache Drill 0.7.0 Release Notes
-
-Apache Drill 0.7.0, the third beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-This release is available as
-[binary](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-
-drill-0.7.0.tar.gz) and
-[source](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-
-drill-0.7.0-src.tar.gz) tarballs that are compiled against Apache Hadoop.
-Drill has been tested against MapR, Cloudera, and Hortonworks Hadoop
-distributions. There are associated build profiles and JIRAs that can help you
-run Drill against your preferred distribution.
-
-Apache Drill 0.7.0 Key Features
-
-  * No more dependency on UDP/Multicast - Making it possible for Drill to work well in the following scenarios:
-
-    * UDP multicast not enabled (as in EC2)
-
-    * Cluster spans multiple subnets
-
-    * Cluster has multihome configuration
-
-  * New functions to natively work with nested data - KVGen and Flatten 
-
-  * Support for Hive 0.13 (Hive 0.12 with Drill is not supported any more) 
-
-  * Improved performance when querying Hive tables and File system through partition pruning
-
-  * Improved performance for HBase with LIKE operator pushdown
-
-  * Improved memory management
-
-  * Drill web UI monitoring and query profile improvements
-
-  * Ability to parse files without explicit extensions using default storage format specification
-
-  * Fixes for dealing with complex/nested data objects in Parquet/JSON
-
-  * Fast schema return - Improved experience working with BI/query tools by returning metadata quickly
-
-  * Several hang related fixes
-
-  * Parquet writer fixes for handling large datasets
-
-  * Stability improvements in ODBC and JDBC drivers
-
-Apache Drill 0.7.0 Key Notes and Limitations
-
-  * The current release supports in-memory and beyond-memory execution. However, you must disable memory-intensive hash aggregate and hash join operations to leverage this functionality.
-  * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
-
-## Apache Drill 0.6.0 Release Notes
-
-Apache Drill 0.6.0, the second beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-This release is available as [binary](http://www.apache.org/dyn/closer.cgi/inc
-ubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating.tar.gz) and 
-[source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incu
-bating/apache-drill-0.5.0-incubating-src.tar.gz) tarballs that are compiled
-against Apache Hadoop. Drill has been tested against MapR, Cloudera, and
-Hortonworks Hadoop distributions. There are associated build profiles and
-JIRAs that can help you run Drill against your preferred distribution.
-
-Apache Drill 0.6.0 Key Features
-
-This release is primarily a bug fix release, with [more than 30 JIRAs closed](
-https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&vers
-ion=12327472), but there are some notable features:
-
-  * Direct ANSI SQL access to MongoDB, using the latest [MongoDB Plugin for Apache Drill](/confluence/display/DRILL/MongoDB+Plugin+for+Apache+Drill)
-  * Filesystem query performance improvements with partition pruning
-  * Ability to use the file system as a persistent store for query profiles and diagnostic information
-  * Window function support (alpha)
-
-Apache Drill 0.6.0 Key Notes and Limitations
-
-  * The current release supports in-memory and beyond-memory execution. However, you must disable memory-intensive hash aggregate and hash join operations to leverage this functionality.
-  * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
-
-## Apache Drill 0.5.0 Release Notes
-
-Apache Drill 0.5.0, the first beta release for Drill, is designed to help
-enthusiasts start working and experimenting with Drill. It also continues the
-Drill monthly release cycle as we drive towards general availability.
-
-The 0.5.0 release is primarily a bug fix release, with [more than 100 JIRAs](h
-ttps://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&versi
-on=12324880) closed, but there are some notable features. For information
-about the features, see the [Apache Drill Blog for the 0.5.0
-release](https://blogs.apache.org/drill/entry/apache_drill_beta_release_see).
-
-This release is available as [binary](http://www.apache.org/dyn/closer.cgi/inc
-ubator/drill/drill-0.5.0-incubating/apache-drill-0.5.0-incubating.tar.gz) and 
-[source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.5.0-incu
-bating/apache-drill-0.5.0-incubating-src.tar.gz) tarballs that are compiled
-against Apache Hadoop. Drill has been tested against MapR, Cloudera, and
-Hortonworks Hadoop distributions. There are associated build profiles and
-JIRAs that can help you run Drill against your preferred distribution.
-
-Apache Drill 0.5.0 Key Notes and Limitations
-
-  * The current release supports in-memory and beyond-memory execution. However, you must disable memory-intensive hash aggregate and hash join operations to leverage this functionality.
-  * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior, such as Sort. Other operations, such as streaming aggregate, may have partial support that leads to unexpected results.
-  * There are known issues with joining text files without using an intervening view. See [DRILL-1401](https://issues.apache.org/jira/browse/DRILL-1401) for more information.
-
-## Apache Drill 0.4.0 Release Notes
-
-The 0.4.0 release is a developer preview release, designed to help enthusiasts
-start to work with and experiment with Drill. It is the first Drill release
-that provides distributed query execution.
-
-This release is built upon [more than 800
-JIRAs](https://issues.apache.org/jira/browse/DRILL/fixforversion/12324963/).
-It is a pre-beta release on the way towards Drill. As a developer snapshot,
-the release contains a large number of outstanding bugs that will make some
-use cases challenging. Feel free to consult outstanding issues [targeted for
-the 0.5.0
-release](https://issues.apache.org/jira/browse/DRILL/fixforversion/12324880/)
-to see whether your use case is affected.
-
-To read more about this release and new features introduced, please view the
-[0.4.0 announcement blog
-entry](https://blogs.apache.org/drill/entry/announcing_apache_drill_0_4).
-
-The release is available as both [binary](http://www.apache.org/dyn/closer.cgi
-/incubator/drill/drill-0.4.0-incubating/apache-drill-0.4.0-incubating.tar.gz)
-and [source](http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-0.4.0-
-incubating/apache-drill-0.4.0-incubating-src.tar.gz) tarballs. In both cases,
-these are compiled against Apache Hadoop. Drill has also been tested against
-MapR, Cloudera and Hortonworks Hadoop distributions and there are associated
-build profiles or JIRAs that can help you run against your preferred
-distribution.
-
-Some Key Notes & Limitations
-
-  * The current release supports in-memory and beyond-memory execution. However, users must disable memory-intensive hash aggregate and hash join operations to leverage this functionality.
-  * In many cases, merge join operations return incorrect results.
-  * Use of a local filter in a join “on” clause when using left, right or full outer joins may result in incorrect results.
-  * Because of known memory leaks and memory overrun issues, you may need more memory and, in some cases, may need to restart the system.
-  * Some types of complex expressions, especially those involving empty arrays, may fail or return incorrect results.
-  * While the Drill execution engine supports dynamic schema changes during the course of a query, some operators have yet to implement support for this behavior (such as Sort). Other operations (such as streaming aggregate) may have partial support that leads to unexpected results.
-  * Protobuf, UDF, query plan, and all other interfaces are subject to change in incompatible ways.
-  * Multiplication of some types of DECIMAL(28+,*) will return incorrect results.
-
-## Apache Drill M1 -- Release Notes (Apache Drill Alpha)
-
-### Milestone 1 Goals
-
-The first release of Apache Drill is designed as a technology preview for
-people to better understand the architecture and vision. It is a functional
-release trying to piece together the key components of a next generation MPP
-query engine. It is designed to allow milestone 2 (M2) to focus on
-architectural analysis and performance optimization.
-
-  * Provide a new optimistic DAG execution engine for data analysis
-  * Build a new columnar shredded in-memory format and execution model that minimizes data serialization/deserialization costs and operator complexity
-  * Provide a model for runtime generated functions and relational operators that minimizes complexity and maximizes performance
-  * Support queries against columnar on disk format (Parquet) and JSON
-  * Support the most common set of standard SQL read-only phrases using ANSI standards. Includes: SELECT, FROM, WHERE, HAVING, ORDER, GROUP BY, IN, DISTINCT, LEFT JOIN, RIGHT JOIN, INNER JOIN
-  * Support schema-on-read querying and execution
-  * Build a set of columnar operation primitives including Merge Join, Sort, Streaming Aggregate, Filter, Selection Vector removal.
-  * Support unlimited level of subqueries and correlated subqueries
-  * Provide an extensible, query-language-agnostic, JSON-based logical data flow syntax
-  * Support complex data type manipulation via logical plan operations
-
-### Known Issues
-
-SQL Parsing  
-Because Apache Drill is built to support late-bound changing schemas while SQL
-is statically typed, there are a couple of special requirements for
-writing SQL queries. These are limited to the current release and
-will be corrected in a future milestone release.
-
-  * All tables are exposed as a single map field that contains
-  * Drill Alpha doesn't support implicit or explicit casts outside those required above.
-  * Drill Alpha does not include, there are currently a couple of differences for how to write a query in In order to query against
-
-UDFs
-
-  * Drill currently supports simple and aggregate functions using scalar, repeated and
-  * Nested data support is incomplete. Drill Alpha supports nested data structures as well as repeated fields. However,
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/011-contribute.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/011-contribute.md b/_docs/drill-docs/011-contribute.md
deleted file mode 100644
index 282ab8a..0000000
--- a/_docs/drill-docs/011-contribute.md
+++ /dev/null
@@ -1,11 +0,0 @@
----
-title: "Contribute to Drill"
-parent: "Apache Drill Documentation"
----
-The Apache Drill community welcomes your support. Please read [Apache Drill
-Contribution Guidelines](https://cwiki.apache.org/confluence/display/DRILL/Apa
-che+Drill+Contribution+Guidelines) for information about how to contribute to
-the project. If you would like to contribute to the project and need some
-ideas for what to do, please read [Apache Drill Contribution
-Ideas](/confluence/display/DRILL/Apache+Drill+Contribution+Ideas).
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/012-sample-ds.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/012-sample-ds.md b/_docs/drill-docs/012-sample-ds.md
deleted file mode 100644
index fe63f6b..0000000
--- a/_docs/drill-docs/012-sample-ds.md
+++ /dev/null
@@ -1,11 +0,0 @@
----
-title: "Sample Datasets"
-parent: "Apache Drill Documentation"
----
-Use any of the following sample datasets to test Drill:
-
-  * [AOL Search](/confluence/display/DRILL/AOL+Search)
-  * [Enron Emails](/confluence/display/DRILL/Enron+Emails)
-  * [Wikipedia Edit History](/confluence/display/DRILL/Wikipedia+Edit+History)
-
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/013-design.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/013-design.md b/_docs/drill-docs/013-design.md
deleted file mode 100644
index 57d73c1..0000000
--- a/_docs/drill-docs/013-design.md
+++ /dev/null
@@ -1,14 +0,0 @@
----
-title: "Design Docs"
-parent: "Apache Drill Documentation"
----
-Review the Apache Drill design docs for early descriptions of Apache Drill
-functionality, terms, and goals, and reference the research articles to learn
-about Apache Drill's history:
-
-  * [Drill Plan Syntax](/confluence/display/DRILL/Drill+Plan+Syntax)
-  * [RPC Overview](/confluence/display/DRILL/RPC+Overview)
-  * [Query Stages](/confluence/display/DRILL/Query+Stages)
-  * [Useful Research](/confluence/display/DRILL/Useful+Research)
-  * [Value Vectors](/confluence/display/DRILL/Value+Vectors)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/014-progress.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/014-progress.md b/_docs/drill-docs/014-progress.md
deleted file mode 100644
index 2a1538c..0000000
--- a/_docs/drill-docs/014-progress.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-title: "Progress Reports"
-parent: "Apache Drill Documentation"
----
-Review the following Apache Drill progress reports for a summary of issues,
-progression of the project, summary of mailing list discussions, and events:
-
-  * [2014 Q1 Drill Report](/confluence/display/DRILL/2014+Q1+Drill+Report)
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/015-archived-pages.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/015-archived-pages.md b/_docs/drill-docs/015-archived-pages.md
deleted file mode 100644
index b2a29c3..0000000
--- a/_docs/drill-docs/015-archived-pages.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-title: "Archived Pages"
-parent: "Apache Drill Documentation"
----
-The following pages have been archived:
-
-* How to Run Drill with Sample Data
-* Meet Apache Drill
-

