drill-commits mailing list archives

From tshi...@apache.org
Subject [4/6] drill git commit: Added Drill docs
Date Thu, 15 Jan 2015 05:05:28 GMT
http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/design/004-research.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/design/004-research.md b/_docs/drill-docs/design/004-research.md
new file mode 100644
index 0000000..77be828
--- /dev/null
+++ b/_docs/drill-docs/design/004-research.md
@@ -0,0 +1,48 @@
+---
+title: "Useful Research"
+parent: "Design Docs"
+---
+## Drill itself
+
+  * Apache Proposal: <http://wiki.apache.org/incubator/DrillProposal>
+  * Mailing List Archive: <http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/>
+  * DrQL ANTLR grammar: <https://gist.github.com/3483314>
+  * Apache Drill, Architecture outlines: <http://www.slideshare.net/jasonfrantz/drill-architecture-20120913>
+
+## Background info
+
+  * Dremel Paper: <http://research.google.com/pubs/pub36632.html>
+  * Dremel Presentation: <http://www.slideshare.net/robertlz/dremel-interactive-analysis-of-webscale-datasets>
+  * Query Language: <http://developers.google.com/bigquery/docs/query-reference>
+  * Protobuf: <http://developers.google.com/protocol-buffers/docs/proto>
+  * Dryad: <http://research.microsoft.com/en-us/projects/dryad/>
+  * SQLServer Query Plan: <http://msdn.microsoft.com/en-us/library/ms191158.aspx>
+  * CStore: <http://db.csail.mit.edu/projects/cstore/>
+  * Vertica (commercial evolution of C-Store): <http://vldb.org/pvldb/vol5/p1790_andrewlamb_vldb2012.pdf>
+  * <http://pdf.aminer.org/000/094/728/database_cracking.pdf>
+  * <http://homepages.cwi.nl/~idreos/NoDBsigmod2012.pdf>
+  * <http://db.csail.mit.edu/projects/cstore/abadiicde2007.pdf>
+  * Hive Architecture: <https://cwiki.apache.org/confluence/display/Hive/Design#Design-HiveArchitecture>
+  * Fast Response in an unreliable world: <http://research.google.com/people/jeff/latency.html>
+  * Column-Oriented Database Systems: <http://www.vldb.org/pvldb/2/vldb09-tutorial6.pdf> (SLIDES: <http://phdopen.mimuw.edu.pl/lato10/boncz_mimuw.pdf>)
+
+## OpenDremel
+
+  * OpenDremel site: <http://code.google.com/p/dremel/>
+  * Design Proposal for Drill: <http://www.slideshare.net/CamuelGilyadov/apache-drill-14071739>
+
+## Dazo (second generation OpenDremel)
+
+  * Dazo repos: <https://github.com/Dazo-org>
+  * ZeroVM (multi-tenant executor): <http://zerovm.org/>
+  * ZeroVM elaboration: <http://news.ycombinator.com/item?id=3746222>
+
+## Rob Grzywinski Dremel adventures
+
+  * <https://github.com/rgrzywinski/field-stripe/>
+
+## Code generation / Physical plan generation
+
+  * <http://www.vldb.org/pvldb/vol4/p539-neumann.pdf> (SLIDES: <http://www.vldb.org/2011/files/slides/research9/rSession9-3.pdf>)
+  * <http://www.vldb.org/pvldb/2/vldb09-327.pdf> (SLIDES: <http://www.slideserve.com/cher/simd-scan-ultra-fast-in-memory-table-scan-using-on-chip-vector-processing-units>)
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/design/005-value.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/design/005-value.md b/_docs/drill-docs/design/005-value.md
new file mode 100644
index 0000000..0d19a96
--- /dev/null
+++ b/_docs/drill-docs/design/005-value.md
@@ -0,0 +1,191 @@
+---
+title: "Value Vectors"
+parent: "Design Docs"
+---
+This document defines the data structures required for passing sequences of
+columnar data between [Operators](https://docs.google.com/a/maprtech.com/document/d/1zaxkcrK9mYyfpGwX1kAV80z0PCi8abefL45zOzb97dI/edit#bookmark=id.iip15ful18mm).
+
+# Goals
+
+#### Support Operators Written in Multiple Languages
+
+ValueVectors should support operators written in C/C++/Assembly. To support
+this, the underlying ByteBuffer will not require modification when passed
+through the JNI interface. The ValueVector will be considered immutable once
+constructed. Endianness has not yet been considered.
+
+#### Access
+
+Reading a random element from a ValueVector must be a constant time operation.
+To accommodate this, elements are identified by their offset from the start of
+the buffer. Repeated, nullable and variable width ValueVectors utilize an
+additional fixed width value vector to index each element. Write access is not
+supported once the ValueVector has been constructed by the RecordBatch.
+
+#### Efficient Subsets of Value Vectors
+
+When an operator returns a subset of values from a ValueVector, it should
+reuse the original ValueVector. To accomplish this, a level of indirection is
+introduced to skip over certain values in the vector. This level of
+indirection is a sequence of offsets which reference an offset in the original
+ValueVector and the count of subsequent values which are to be included in the
+subset.
+
+#### Pooled Allocation
+
+ValueVectors utilize one or more buffers under the covers. These buffers will
+be drawn from a pool. Value vectors are themselves created and destroyed as a
+schema changes during the course of record iteration.
+
+#### Homogeneous Value Types
+
+Each value in a Value Vector is of the same type. The [Record Batch](https://docs.google.com/a/maprtech.com/document/d/1zaxkcrK9mYyfpGwX1kAV80z0PCi8abefL45zOzb97dI/edit#bookmark=kix.s2xuoqnr8obe)
+implementation is responsible for creating a new Value Vector any time there
+is a change in schema.
+
+# Definitions
+
+#### Data Types
+
+The canonical source for value type definitions is the [Drill
+Datatypes](http://bit.ly/15JO9bC) document. The individual types are listed
+under the ‘Basic Data Types’ tab, while the value vector types can be found
+under the ‘Value Vectors’ tab.
+
+#### Operators
+
+An operator is responsible for transforming a stream of fields. It operates on
+Record Batches or constant values.
+
+#### Record Batch
+
+A set of field values for some range of records. The batch may be composed of
+Value Vectors, in which case each batch consists of exactly one schema.
+
+#### Value Vector
+
+A value vector comprises one or more contiguous buffers: one that stores a
+sequence of values, and zero or more that store any metadata associated with
+the ValueVector.
+
+# Data Structure
+
+A ValueVector stores values in a ByteBuf, which is a contiguous region of
+memory. Additional levels of indirection are used to support variable value
+widths, nullable values, repeated values and selection vectors. These levels
+of indirection are primarily lookup tables which consist of one or more fixed
+width ValueVectors which may be combined (e.g. for nullable, variable width
+values). A fixed width ValueVector of non-nullable, non-repeatable values does
+not require an indirect lookup; elements can be accessed directly by
+multiplying position by stride.
+
+#### Fixed Width Values
+
+Fixed width ValueVectors simply contain a packed sequence of values. Random
+access is supported by accessing element n at ByteBuf[0] + n * Stride, where
+n is a 0-based index. The following illustrates the underlying buffer of
+INT4 values [1 .. 6]:
+
+![image](../../img/value1.png)
+<!--https://lh5.googleusercontent.com/iobQUgeF4dyrWFeqVfhIBZKbkjrLk5sBJqYhWdzm
+IyMmmcX1pzZaeQiKZ5OzYeafxcY5IZHXDKuG_JkPwJrjxeLJITpXBbn7r5ep1V07a3JBQC0cJg4qKf
+VhzPZ0PDeh-->
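The position arithmetic described above can be sketched as follows. This is a minimal illustration, not Drill's implementation: `FixedWidthSketch`, `pack`, and `get` are hypothetical names, and `java.nio.ByteBuffer` is assumed as a stand-in for the `ByteBuf` the document refers to.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a fixed width vector of INT4 values (not a Drill class).
final class FixedWidthSketch {
    static final int STRIDE = 4; // bytes per INT4 value

    // Packs the values back to back, with no per-element metadata.
    static ByteBuffer pack(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * STRIDE);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf;
    }

    // Constant-time read: byte position = index * stride (index is 0-based).
    static int get(ByteBuffer buf, int index) {
        return buf.getInt(index * STRIDE);
    }

    public static void main(String[] args) {
        ByteBuffer buf = pack(1, 2, 3, 4, 5, 6);
        System.out.println(get(buf, 0)); // prints 1
        System.out.println(get(buf, 5)); // prints 6
    }
}
```

Because the read uses an absolute position, no lookup table is involved, which is what makes the non-nullable, non-repeated case the cheapest one to access.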
+
+#### Nullable Values
+
+Nullable values are represented by a vector of bit values. Each bit in the
+vector corresponds to an element in the ValueVector. If the bit is not set,
+the value is NULL. Otherwise the value is retrieved from the underlying
+buffer. The following illustrates a NullableValueVector of INT4 values 2, 3
+and 6:
+
+![](../../img/value2.png)
+
+<!--![](https://lh5.googleusercontent.com/3M19t18av5cuXflB3WYHS0OJBaO-zFHD8TcNaKF0
+ua6g9h_LPnBijkGavCCwDDsbQzSoT5Glj1dgIwfhzK_xFPjPzc3w5O2NaVrbvEQgFhuOpK3yEr-
+nSyMocEjRuhGB)-->
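A rough sketch of the bit-per-element idea follows. It assumes a six-slot vector in which only the slots holding 2, 3 and 6 are set; the slot positions, the class name `NullableSketch`, and the use of `java.util.BitSet` and an `int[]` in place of Drill's buffers are all assumptions made for illustration.

```java
import java.util.BitSet;

// Hypothetical nullable INT4 vector: one bit per element plus a value array.
final class NullableSketch {
    private final BitSet isSet = new BitSet();
    private final int[] values;

    // A null argument leaves the corresponding bit clear.
    NullableSketch(Integer... input) {
        values = new int[input.length];
        for (int i = 0; i < input.length; i++) {
            if (input[i] != null) {
                isSet.set(i);
                values[i] = input[i];
            }
        }
    }

    // The value is NULL when its bit is not set.
    boolean isNull(int index) {
        return !isSet.get(index);
    }

    // Returns null for NULL elements, otherwise the stored value.
    Integer get(int index) {
        if (isNull(index)) {
            return null;
        }
        return values[index];
    }

    public static void main(String[] args) {
        NullableSketch v = new NullableSketch(null, 2, 3, null, null, 6);
        System.out.println(v.get(0)); // prints null
        System.out.println(v.get(1)); // prints 2
    }
}
```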
+
+  
+
+#### Repeated Values
+
+A repeated ValueVector is used for elements which can contain multiple values
+(e.g. a JSON array). A table of offset and count pairs is used to represent
+each repeated element in the ValueVector. A count of zero means the element
+has no values (note the offset field is unused in this case). The following
+illustrates three fields; one with two values, one with no values, and one
+with a single value:
+
+![](../../img/value3.png)
+<!--![](https://lh6.googleusercontent.com/nFIJjIOPAl9zXttVURgp-xkW8v6z6F7ikN7sMREm
+58pdtfTlwdfjEUH4CHxknHexGdIeEhPHbMMzAgqMwnL99IZlR_YzAWvJaiStOO4QMtML8zLuwLvFDr
+hJKLMNc0zg)-->
+
+ValueVector Representation of the equivalent JSON:
+
+x:[1, 2]
+
+x:[ ]
+
+x:[3]
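The offset and count table for these three elements could look roughly like the sketch below. `RepeatedSketch` and its fields are hypothetical, and plain Java arrays are assumed in place of Drill's buffers.

```java
import java.util.Arrays;

// Hypothetical repeated INT4 vector for x:[1, 2], x:[ ], x:[3].
final class RepeatedSketch {
    // All values stored contiguously.
    static final int[] VALUES = {1, 2, 3};

    // One {offset, count} pair per element; the offset is unused when count is 0.
    static final int[][] OFFSET_COUNT = {{0, 2}, {0, 0}, {2, 1}};

    // Returns the values that belong to the element at the given index.
    static int[] element(int index) {
        int offset = OFFSET_COUNT[index][0];
        int count = OFFSET_COUNT[index][1];
        return Arrays.copyOfRange(VALUES, offset, offset + count);
    }

    public static void main(String[] args) {
        for (int i = 0; i < OFFSET_COUNT.length; i++) {
            System.out.println("x:" + Arrays.toString(element(i)));
        }
        // prints x:[1, 2], then x:[], then x:[3]
    }
}
```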
+
+#### Variable Width Values
+
+Variable width values are stored contiguously in a ByteBuf. Each element is
+represented by an entry in a fixed width ValueVector of offsets. The length of
+an element is deduced by subtracting its offset from the offset of the
+following element. Because of this, the offset table will always contain one
+more entry than total elements, with the last entry pointing to the end of the
+buffer.
+
+  
+![](../../img/value4.png)
+<!--![](https://lh5.googleusercontent.com/ZxAfkmCVRJsKgLYO0pLbRM-
+aEjR2yyNZWfYkFSmlsod8GnM3huKHQuc6Do-Bp4U1wK-
+hF3e6vGHTiGPqhEc25YEHEuVTNqb1sBj0LdVrOlvGBzL8nywQbn8O1RlN-vrw)-->
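The offset table can be sketched as follows, assuming UTF-8 strings as the variable width values. `VarWidthSketch` is a hypothetical name, and a `byte[]` plus an `int[]` stand in for the data buffer and the fixed width vector of offsets.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical variable width vector: contiguous bytes plus n + 1 offsets.
final class VarWidthSketch {
    final byte[] data;
    final int[] offsets;

    VarWidthSketch(String... elements) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        offsets = new int[elements.length + 1];
        for (int i = 0; i < elements.length; i++) {
            offsets[i] = out.size(); // start of element i
            byte[] bytes = elements[i].getBytes(StandardCharsets.UTF_8);
            out.write(bytes, 0, bytes.length);
        }
        offsets[elements.length] = out.size(); // last entry points to the end of the buffer
        data = out.toByteArray();
    }

    // Length of element i is the next offset minus its own offset.
    String get(int index) {
        int start = offsets[index];
        int length = offsets[index + 1] - start;
        return new String(data, start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        VarWidthSketch v = new VarWidthSketch("a", "bcd", "");
        System.out.println(v.get(1));          // prints bcd
        System.out.println(v.offsets.length);  // prints 4
    }
}
```

The extra trailing offset means the last element needs no special case: its length is computed the same way as every other element's.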
+
+#### Repeated Map Vectors
+
+A repeated map vector contains one or more maps (akin to an array of objects
+in JSON). The values of each field in the map are stored contiguously within a
+ByteBuf. To access a specific record, a lookup table of count and offset pairs
+is used. This lookup table points to the first repeated field in each column,
+while the count indicates the maximum number of elements for the column. The
+following example illustrates a RepeatedMap with two records; one with two
+objects, and one with a single object:
+
+![](../../img/value5.png)
+<!--![](https://lh3.googleusercontent.com
+/l8yo_z_MbBz9C3OoGQEy1bNOrmnNbo2e0XtCUDRbdRR4mbCYK8h-
+Lz7_VlhDtbTkPQziwwyNpw3ylfEKjMKtj-D0pUah4arohs1hcnHrzoFfE-QZRwUdQmEReMdpSgIT)-->
+
+ValueVector representation of the equivalent JSON:
+
+x: [ {name:'Sam', age:1}, {name:'Max', age:2} ]
+
+x: [ {name:'Joe', age:3} ]
+
+#### Selection Vectors
+
+A Selection Vector represents a subset of a ValueVector. It is implemented
+with a list of offsets which identify each element in the ValueVector to be
+included in the SelectionVector. In the case of a fixed width ValueVector, the
+offsets reference the underlying ByteBuf. In the case of a nullable, repeated
+or variable width ValueVector, the offset references the corresponding lookup
+table. The following illustrates a SelectionVector of INT4 (fixed width)
+values 2, 3 and 5 from the original vector of [1 .. 6]:
+
+![](../../img/value6.png)
+<!--![](https://lh5.googleusercontent.com/-hLlAaq9n-Q0_fZ_MKk3yFpXWZO7JOJLm-
+NDh_a_x2Ir5BhZDrZX0t-6e_w3K7R4gfgQIsv-sPxryTUzrJRszNpA3pEEn5V5uRCAlMtHejTpcu-
+_QFPfSTzzpdsf88OS)-->
+
+The following illustrates the same ValueVector with nullable fields:
+
+![](../../img/value7.png)
+<!--![](https://lh3.googleusercontent.com
+/cJxo5H_nsWWlKFUFxjOHHC6YI4sPyG5Fjj1gbdAT2AEo-c6cdkZelso6rYeZV4leMWMfbei_-
+rncjasvR9u4MUXgkpFpM22CUSnnkVX6ynpkcLW1Q-s5F2NgqCez1Fa_)-->
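For the fixed width case, a selection vector can be sketched as a list of byte offsets into the original buffer. This is an illustration under assumptions: `SelectionSketch` and `read` are hypothetical names, and `java.nio.ByteBuffer` stands in for the `ByteBuf`.

```java
import java.nio.ByteBuffer;

// Hypothetical selection over a fixed width INT4 vector holding [1 .. 6].
final class SelectionSketch {
    // Reads the selected values without copying or modifying the original buffer.
    static int[] read(ByteBuffer original, int[] byteOffsets) {
        int[] selected = new int[byteOffsets.length];
        for (int i = 0; i < byteOffsets.length; i++) {
            selected[i] = original.getInt(byteOffsets[i]);
        }
        return selected;
    }

    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.allocate(6 * 4);
        for (int v = 1; v <= 6; v++) {
            original.putInt(v);
        }
        // Byte offsets of the values 2, 3 and 5 (4 bytes per value).
        int[] selection = {4, 8, 16};
        for (int v : read(original, selection)) {
            System.out.println(v); // prints 2, then 3, then 5
        }
    }
}
```

For a nullable, repeated or variable width vector, the offsets would point into the corresponding lookup table instead of the value buffer, as the text describes.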
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/dev-custom-fcn/001-dev-simple.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/dev-custom-fcn/001-dev-simple.md b/_docs/drill-docs/dev-custom-fcn/001-dev-simple.md
new file mode 100644
index 0000000..47be7d9
--- /dev/null
+++ b/_docs/drill-docs/dev-custom-fcn/001-dev-simple.md
@@ -0,0 +1,51 @@
+---
+title: "Develop a Simple Function"
+parent: "Develop Custom Functions"
+---
+Create a class within a Java package that implements Drill’s simple function
+interface, and include the required information for the function type. Your
+function must use data types that Drill supports, such as INT or BIGINT. For a
+list of supported data types, refer to the Apache Drill SQL Reference.
+
+Complete the following steps to develop a simple function using Drill’s simple
+function interface:
+
+  1. Create a Maven project and add the following dependency:
+  
+		<dependency>
+		<groupId>org.apache.drill.exec</groupId>
+		<artifactId>drill-java-exec</artifactId>
+		<version>1.0.0-m2-incubating-SNAPSHOT</version>
+		</dependency>
+
+  2. Create a class that implements the `DrillSimpleFunc` interface and identify the scope as `FunctionScope.SIMPLE`.
+
+	**Example**
+	
+		@FunctionTemplate(name = "myaddints", scope = FunctionScope.SIMPLE, nulls = NullHandling.NULL_IF_NULL)
+		  public static class IntIntAdd implements DrillSimpleFunc {
+
+  3. Provide the variables used in the code in the `Param` and `Output` holders.
+
+	**Example**
+	
+		@Param IntHolder in1;
+		@Param IntHolder in2;
+		@Output IntHolder out;
+
+  4. Add the code that performs operations for the function in the `eval()` method.
+
+	**Example**
+	
+		public void setup(RecordBatch b) {
+		}
+		public void eval() {
+		 out.value = (int) (in1.value + in2.value);
+		}
+
+  5. Use the maven-source-plugin to build both the sources JAR file and the classes JAR file. Verify that an empty `drill-module.conf` file is included in the resources folder of the JARs.   
+Drill searches for this file during classpath scanning. If the file is not
+included in the resources folder, you can add it to the JAR file or add it to
+`etc/drill/conf`.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/dev-custom-fcn/002-dev-aggregate.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/dev-custom-fcn/002-dev-aggregate.md b/_docs/drill-docs/dev-custom-fcn/002-dev-aggregate.md
new file mode 100644
index 0000000..fe6f406
--- /dev/null
+++ b/_docs/drill-docs/dev-custom-fcn/002-dev-aggregate.md
@@ -0,0 +1,59 @@
+---
+title: "Developing an Aggregate Function"
+parent: "Develop Custom Functions"
+---
+Create a class within a Java package that implements Drill’s aggregate
+function interface, and include the required information for the function.
+Your function must use data types that Drill supports, such as INT or BIGINT.
+For a list of supported data types, refer to the Drill SQL Reference.
+
+Complete the following steps to create an aggregate function:
+
+  1. Create a Maven project and add the following dependency:
+  
+		<dependency>
+		<groupId>org.apache.drill.exec</groupId>
+		<artifactId>drill-java-exec</artifactId>
+		<version>1.0.0-m2-incubating-SNAPSHOT</version>
+		</dependency>
+
+  2. Create a class that implements the `DrillAggFunc` interface and identify the scope as `FunctionTemplate.FunctionScope.POINT_AGGREGATE`.
+
+	**Example**
+	
+		@FunctionTemplate(name = "count", scope = FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+		public static class BitCount implements DrillAggFunc{
+
+  3. Provide the variables used in the code in the `Param`, `Workspace`, and `Output` bit holders.
+
+	**Example**
+	
+		@Param BitHolder in;
+		@Workspace BitHolder value;
+		@Output BitHolder out;
+
+  4. Include the `setup()`, `add()`, `output()`, and `reset()` methods.
+	
+	**Example**
+	
+		public void setup(RecordBatch b) {
+		  value = new BitHolder();
+		  value.value = 0;
+		}
+
+		@Override
+		public void add() {
+		  value.value++;
+		}
+
+		@Override
+		public void output() {
+		  out.value = value.value;
+		}
+
+		@Override
+		public void reset() {
+		  value.value = 0;
+		}
+
+  5. Use the maven-source-plugin to build both the sources JAR file and the classes JAR file. Verify that an empty `drill-module.conf` file is included in the resources folder of the JARs.   
+Drill searches for this file during classpath scanning. If the file is not
+included in the resources folder, you can add it to the JAR file or add it to
+`etc/drill/conf`.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/dev-custom-fcn/003-add-custom.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/dev-custom-fcn/003-add-custom.md b/_docs/drill-docs/dev-custom-fcn/003-add-custom.md
new file mode 100644
index 0000000..7efcdce
--- /dev/null
+++ b/_docs/drill-docs/dev-custom-fcn/003-add-custom.md
@@ -0,0 +1,28 @@
+---
+title: "Adding Custom Functions to Drill"
+parent: "Develop Custom Functions"
+---
+After you develop your custom function and generate the sources and classes
+JAR files, add both JAR files to the Drill classpath, and add the name of
+the package that contains the classes to the main Drill configuration file.
+Restart the Drillbit on each node to refresh the configuration.
+
+To add a custom function to Drill, complete the following steps:
+
+  1. Add the sources JAR file and the classes JAR file for the custom function to the Drill classpath on all nodes running a Drillbit. To add the JAR files, copy them to `<drill installation directory>/jars/3rdparty`.
+  2. On all nodes running a Drillbit, add the name of the package that contains the classes to the main Drill configuration file in the following location:
+  
+        <drill installation directory>/conf/drill-override.conf
+
+	To add the package, add the package name to
+	`drill.logical.function.package+=`. Separate package names with a comma.
+	
+    **Example**
+		
+		drill.logical.function.package+= ["org.apache.drill.exec.expr.fn.impl","org.apache.drill.udfs"]
+
+  3. On each Drill node in the cluster, navigate to the Drill installation directory, and issue the following command to restart the Drillbit:
+  
+        <drill installation directory>/bin/drillbit.sh restart
+
+     Now you can issue queries with your custom functions to Drill.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/dev-custom-fcn/004-use-custom.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/dev-custom-fcn/004-use-custom.md b/_docs/drill-docs/dev-custom-fcn/004-use-custom.md
new file mode 100644
index 0000000..6a0245a
--- /dev/null
+++ b/_docs/drill-docs/dev-custom-fcn/004-use-custom.md
@@ -0,0 +1,55 @@
+---
+title: "Using Custom Functions in Queries"
+parent: "Develop Custom Functions"
+---
+When you issue a query with a custom function to Drill, Drill searches the
+classpath for the function that matches the request in the query. Once Drill
+locates the function for the request, Drill processes the query and applies
+the function during processing.
+
+Your Drill installation includes sample files in the Drill classpath. One
+sample file, `employee.json`, contains some fictitious employee data that you
+can query with a custom function.
+
+## Simple Function Example
+
+This example uses the `myaddints` simple function in a query on the
+`employee.json` file.
+
+If you issue the following query to Drill, you can see all of the employee
+data within the `employee.json` file:
+
+    0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`;
+
+The query returns the following results:
+
+	| employee_id | full_name    | first_name | last_name  | position_id | position_title          |  store_id  | department_id | birth_da |
+	+-------------+--------------+------------+------------+-------------+-------------------------+------------+---------------+----------+
+	| 1101        | Steve Eurich | Steve      | Eurich     | 16          | Store Temporary Checker | 12         | 16            |
+	| 1102        | Mary Pierson | Mary       | Pierson    | 16          | Store Temporary Checker | 12         | 16            |
+	| 1103        | Leo Jones    | Leo        | Jones      | 16          | Store Temporary Checker | 12         | 16            |
+	…
+
+Since the `position_id` and `store_id` columns contain integers, you can issue
+a query with the `myaddints` custom function on these columns to add the
+integers in the columns.
+
+The following query tells Drill to apply the `myaddints` function to the
+`position_id` and `store_id` columns in the `employee.json` file:
+
+    0: jdbc:drill:zk=local> SELECT myaddints(CAST(position_id AS int),CAST(store_id AS int)) FROM cp.`employee.json`;
+
+Since JSON files do not store information about data types, you must apply the
+`CAST` function in the query to tell Drill that the columns contain integer
+values.
+
+The query returns the following results:
+
+	+------------+
+	|   EXPR$0   |
+	+------------+
+	| 28         |
+	| 28         |
+	| 36         |
+	+------------+
+	…
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/dev-custom-fcn/005-cust-interface.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/dev-custom-fcn/005-cust-interface.md b/_docs/drill-docs/dev-custom-fcn/005-cust-interface.md
new file mode 100644
index 0000000..b84cad0
--- /dev/null
+++ b/_docs/drill-docs/dev-custom-fcn/005-cust-interface.md
@@ -0,0 +1,14 @@
+---
+title: "Custom Function Interfaces"
+parent: "Develop Custom Functions"
+---
+Implement the Drill interface appropriate for the type of function that you
+want to develop. Each interface provides a set of required holders where you
+input data types that your function uses and required methods that Drill calls
+to perform your function’s operations.
+
+Click on either of the links for more information about custom function
+interfaces for Drill:
+
+  * [Simple Function Interface](/confluence/display/DRILL/Simple+Function+Interface)
+  * [Aggregate Function Interface](/confluence/display/DRILL/Aggregate+Function+Interface)
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/develop/001-compile.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/develop/001-compile.md b/_docs/drill-docs/develop/001-compile.md
new file mode 100644
index 0000000..85db854
--- /dev/null
+++ b/_docs/drill-docs/develop/001-compile.md
@@ -0,0 +1,37 @@
+---
+title: "Compiling Drill from Source"
+parent: "Develop Drill"
+---
+## Prerequisites
+
+  * Maven 3.0.4 or later
+  * Oracle JDK 7 or later
+
+Run the following commands to verify that you have the correct versions of
+Maven and JDK installed:
+
+    java -version
+    mvn -version
+
+## 1\. Clone the Repository
+
+    git clone https://git-wip-us.apache.org/repos/asf/incubator-drill.git
+
+## 2\. Compile the Code
+
+    cd incubator-drill
+    mvn clean install -DskipTests
+
+## 3\. Explode the Tarball in the Installation Directory
+
+    mkdir ~/compiled-drill
+    tar xvzf distribution/target/*.tar.gz --strip=1 -C ~/compiled-drill
+
+Now that you have Drill installed, you can connect to Drill and query sample
+data or you can connect Drill to your data sources.
+
+  * To connect Drill to your data sources, refer to [Connecting to Data Sources](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources) for instructions.
+  * To connect to Drill and query sample data, refer to the following topics:
+    * [Start Drill](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=44994063) (for Drill installed in embedded mode)
+    * [Query Data](https://cwiki.apache.org/confluence/display/DRILL/Query+Data)
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/develop/002-setup.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/develop/002-setup.md b/_docs/drill-docs/develop/002-setup.md
new file mode 100644
index 0000000..19fb554
--- /dev/null
+++ b/_docs/drill-docs/develop/002-setup.md
@@ -0,0 +1,5 @@
+---
+title: "Setting Up Your Development Environment"
+parent: "Develop Drill"
+---
+TBD
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/develop/003-patch-tool.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/develop/003-patch-tool.md b/_docs/drill-docs/develop/003-patch-tool.md
new file mode 100644
index 0000000..5b94577
--- /dev/null
+++ b/_docs/drill-docs/develop/003-patch-tool.md
@@ -0,0 +1,160 @@
+---
+title: "Drill Patch Review Tool"
+parent: "Develop Drill"
+---
+  * Drill JIRA and Reviewboard script
+    * 1\. Setup
+    * 2\. Usage
+    * 3\. Upload patch
+    * 4\. Update patch
+  * JIRA command line tool
+    * 1\. Download the JIRA command line package
+    * 2\. Configure JIRA username and password
+  * Reviewboard
+    * 1\. Install the post-review tool
+    * 2\. Configure Stuff
+  * FAQ
+    * When I run the script, it throws the following error and exits
+    * When I run the script, it throws the following error and exits
+
+### Drill JIRA and Reviewboard script
+
+#### 1\. Setup
+
+  1. Follow instructions [here](https://cwiki.apache.org/confluence/display/DRILL/Drill+Patch+Review+Tool#Drillpatchreviewtool-JIRAcommandlinetool) to set up the jira-python package
+  2. Follow instructions [here](https://cwiki.apache.org/confluence/display/DRILL/Drill+Patch+Review+Tool#Drillpatchreviewtool-Reviewboard) to set up the reviewboard python tools
+  3. Install the argparse module 
+  
+        On Linux -> sudo yum install python-argparse
+        On Mac -> sudo easy_install argparse
+
+#### 2\. Usage
+
+	nnarkhed-mn: nnarkhed$ python drill-patch-review.py --help
+	usage: drill-patch-review.py [-h] -b BRANCH -j JIRA [-s SUMMARY]
+	                             [-d DESCRIPTION] [-r REVIEWBOARD] [-t TESTING]
+	                             [-v VERSION] [-db] -rbu REVIEWBOARDUSER -rbp REVIEWBOARDPASSWORD
+	 
+	Drill patch review tool
+	 
+	optional arguments:
+	  -h, --help            show this help message and exit
+	  -b BRANCH, --branch BRANCH
+	                        Tracking branch to create diff against
+	  -j JIRA, --jira JIRA  JIRA corresponding to the reviewboard
+	  -s SUMMARY, --summary SUMMARY
+	                        Summary for the reviewboard
+	  -d DESCRIPTION, --description DESCRIPTION
+	                        Description for reviewboard
+	  -r REVIEWBOARD, --rb REVIEWBOARD
+	                        Review board that needs to be updated
+	  -t TESTING, --testing-done TESTING
+	                        Text for the Testing Done section of the reviewboard
+	  -v VERSION, --version VERSION
+	                        Version of the patch
+	  -db, --debug          Enable debug mode
+	  -rbu, --reviewboard-user Reviewboard user name
+	  -rbp, --reviewboard-password Reviewboard password
+
+#### 3\. Upload patch
+
+  1. Specify the branch against which the patch should be created (-b)
+  2. Specify the corresponding JIRA (-j)
+  3. Specify an **optional** summary (-s) and description (-d) for the reviewboard
+
+Example:
+
+    python drill-patch-review.py -b origin/master -j DRILL-241 -rbu tnachen -rbp password
+
+#### 4\. Update patch
+
+  1. Specify the branch against which the patch should be created (-b)
+  2. Specify the corresponding JIRA (--jira)
+  3. Specify the rb to be updated (-r)
+  4. Specify an **optional** summary (-s) and description (-d) for the reviewboard, if you want to update it
+  5. Specify an **optional** version of the patch. This will be appended to the jira to create a file named JIRA-<version>.patch. The purpose is to be able to upload multiple patches to the JIRA. This has no bearing on the reviewboard update.
+
+Example:
+
+    python drill-patch-review.py -b origin/master -j DRILL-241 -r 14081 -rbu tnachen -rbp password
+
+### JIRA command line tool
+
+#### 1\. Download the JIRA command line package
+
+Install the jira-python package.
+
+    sudo easy_install jira-python
+
+#### 2\. Configure JIRA username and password
+
+Include a jira.ini file in your $HOME directory that contains your Apache JIRA
+username and password.
+
+	nnarkhed-mn:~ nnarkhed$ cat ~/jira.ini
+	user=nehanarkhede
+	password=***********
+
+### Reviewboard
+
+This is a quick tutorial on using [Review Board](https://reviews.apache.org)
+with Drill.
+
+#### 1\. Install the post-review tool
+
+If you are on RHEL, Fedora or CentOS, follow these steps:
+
+	sudo yum install python-setuptools
+	sudo easy_install -U RBTools
+
+If you are on Mac, follow these steps:
+
+	sudo easy_install -U setuptools
+	sudo easy_install -U RBTools
+
+For other platforms, follow the [instructions](http://www.reviewboard.org/docs/manual/dev/users/tools/post-review/) to
+set up the post-review tool.
+
+#### 2\. Configure Stuff
+
+Then you need to configure a few things to make it work.
+
+First, set the Review Board URL to use. You can do this from within git:
+
+    git config reviewboard.url https://reviews.apache.org
+
+If you checked out the repository using the git-wip HTTP URL, that URL,
+confusingly, will not work with Review Board, so you need to configure an
+override that uses the non-HTTP URL. You can do this by adding a config file
+like this:
+
+	jkreps$ cat ~/.reviewboardrc
+	REPOSITORY = 'git://git.apache.org/incubator-drill.git'
+	TARGET_GROUPS = 'drill-git'
+	GUESS_FIELDS = True
+
+### FAQ
+
+#### When I run the script, it throws the following error and exits
+
+    nnarkhed$ python drill-patch-review.py -b trunk -j DRILL-241
+    There don't seem to be any diffs
+
+There are two reasons for this:
+
+  * The code is not checked into your local branch
+  * The -b branch is not pointing to the remote branch. In the example above, "trunk" is specified as the branch, which is the local branch. The correct value for the -b (--branch) option is the remote branch. "git branch -r" gives the list of the remote branch names.
+
+#### When I run the script, it throws the following error and exits
+
+    Error uploading diff
+
+    Your review request still exists, but the diff is not attached.
+
+One of the most common root causes of this error is that the git remote
+branches are not up to date. Since the script already updates them, the
+error is probably due to some other problem. You can run the script with the
+--debug option, which makes post-review run in debug mode and list the root
+cause of the issue.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/001-drill-in-10.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/001-drill-in-10.md b/_docs/drill-docs/install/001-drill-in-10.md
new file mode 100644
index 0000000..bd60141
--- /dev/null
+++ b/_docs/drill-docs/install/001-drill-in-10.md
@@ -0,0 +1,395 @@
+---
+title: "Apache Drill in 10 Minutes"
+parent: "Install Drill"
+---
+* Objective
+* A Few Bits About Apache Drill
+* Process Overview
+* Install Drill
+  * Installing Drill on Linux
+  * Installing Drill on Mac OS X
+  * Installing Drill on Windows 
+* Start Drill 
+* Query Sample Data 
+* Summary 
+* Next Steps
+* More Information
+
+  
+
+# Objective
+
+Use Apache Drill to query sample data in 10 minutes. For simplicity, you’ll
+run Drill in _embedded_ mode rather than _distributed_ mode to try out Drill
+without having to perform any setup tasks.
+
+# A Few Bits About Apache Drill
+
+Drill is a clustered, powerful MPP (Massively Parallel Processing) query
+engine for Hadoop that can process petabytes of data, fast. Drill is useful
+for short, interactive ad-hoc queries on large-scale data sets. Drill is
+capable of querying nested data in formats like JSON and Parquet and
+performing dynamic schema discovery. Drill does not require a centralized
+metadata repository.
+
+### **_Dynamic schema discovery_**
+
+Drill does not require schema or type specification for data in order to start
+the query execution process. Drill starts data processing in record-batches
+and discovers the schema during processing. Self-describing data formats such
+as Parquet, JSON, AVRO, and NoSQL databases have schema specified as part of
+the data itself, which Drill leverages dynamically at query time. Because
+schema can change over the course of a Drill query, all Drill operators are
+designed to reconfigure themselves when schemas change.
+
+### **_Flexible data model_**
+
+Drill allows access to nested data attributes, just like SQL columns, and
+provides intuitive extensions to easily operate on them. From an architectural
+point of view, Drill provides a flexible hierarchical columnar data model that
+can represent complex, highly dynamic and evolving data models. Drill allows
+for efficient processing of these models without the need to flatten or
+materialize them at design time or at execution time. Relational data in Drill
+is treated as a special or simplified case of complex/multi-structured data.
+
+### **_De-centralized metadata_**
+
+Drill does not have a centralized metadata requirement. You do not need to
+create and manage tables and views in a metadata repository, or rely on a
+database administrator group for such a function. Drill metadata is derived
+from the storage plugins that correspond to data sources. Storage plugins
+provide a spectrum of metadata ranging from full metadata (Hive) to partial
+metadata (HBase) to no central metadata (files). De-centralized metadata
+means that Drill is NOT tied to a single Hive repository. You can query
+multiple Hive repositories at once and then combine the data with information
+from HBase tables or with a file in a distributed file system. You can also
+use SQL DDL syntax to create metadata within Drill, which gets organized just
+like a traditional database. Drill metadata is accessible through the ANSI
+standard INFORMATION_SCHEMA database.
+
+### **_Extensibility_**
+
+Drill provides an extensible architecture at all layers, including the storage
+plugin, query, query optimization/execution, and client API layers. You can
+customize any layer for the specific needs of an organization or you can
+extend the layer to a broader array of use cases. Drill provides built-in
+classpath scanning and a plugin concept for adding storage plugins,
+functions, and operators with minimal configuration.
+
+# Process Overview
+
+Download the Apache Drill archive and extract the contents to a directory on
+your machine. The Apache Drill archive contains sample JSON and Parquet files
+that you can query immediately.
+
+Query the sample JSON and Parquet files using SQLLine. SQLLine is a pure-Java
+console-based utility for connecting to relational databases and executing SQL
+commands. SQLLine is used as the shell for Drill. Drill follows the ANSI
+SQL:2011 standard with a few extensions for nested data formats.
+
+### Prerequisite
+
+You must have the following software installed on your machine to run Drill:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody><tr><td class="confluenceTd"><p><strong>Software</strong></p></td><td class="confluenceTd"><p><strong>Description</strong></p></td></tr><tr><td class="confluenceTd"><p><a href="http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html" class="external-link" rel="nofollow">Oracle JDK version 7</a></p></td><td class="confluenceTd"><p>A set of programming tools for developing Java applications.</p></td></tr></tbody></table></div>
+
+  
+### Prerequisite Validation
+
+Run the following command to verify that the system meets the software
+prerequisite:
+<table class="confluenceTable"><tbody><tr><td class="confluenceTd"><p><strong>Command</strong></p></td><td class="confluenceTd"><p><strong>Example Output</strong></p></td></tr><tr><td class="confluenceTd"><p><code>java -version</code></p></td><td class="confluenceTd"><p><code>java version &quot;1.7.0_65&quot;</code><br /><code>Java(TM) SE Runtime Environment (build 1.7.0_65-b19)</code><br /><code>Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)</code></p></td></tr></tbody></table>
+  
+# Install Drill
+
+You can install Drill on a machine running Linux, Mac OS X, or Windows.  
+
+## Installing Drill on Linux
+
+Complete the following steps to install Drill:
+
+  1. Issue the following command to download the latest, stable version of Apache Drill to a directory on your machine:
+        
+        wget http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+
+  2. Issue the following command to create a new directory to which you can extract the contents of the Drill `tar.gz` file:
+  
+        sudo mkdir -p /opt/drill
+
+  3. Navigate to the directory where you downloaded the Drill `tar.gz` file.  
+  
+
+  4. Issue the following command to extract the contents of the Drill `tar.gz` file:
+  
+        sudo tar -xvzf apache-drill-<version>.tar.gz -C /opt/drill
+
+  5. Issue the following command to navigate to the Drill installation directory:
+  
+        cd /opt/drill/apache-drill-<version>
+
+At this point, you can
+[start Drill](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes#ApacheDrillin10Minutes-StartDrill).
+
+## Installing Drill on Mac OS X
+
+Complete the following steps to install Drill:
+
+  1. Open a Terminal window, and create a `drill` directory inside your home directory (or in some other location if you prefer).
+
+     **Example**
+
+        $ pwd
+        /Users/max
+        $ mkdir drill
+        $ cd drill
+        $ pwd
+        /Users/max/drill
+
+  2. Click the following link to download the latest, stable version of Apache Drill:
+  
+      [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+
+  3. Open the downloaded `TAR` file with the Mac Archive utility or a similar tool for unzipping files.
+
+  4. Move the resulting `apache-drill-<version>` folder into the `drill` directory that you created.
+
+  5. Issue the following command to navigate to the `apache-drill-<version>` directory:
+  
+        cd /Users/max/drill/apache-drill-<version>
+
+At this point, you can
+[start Drill](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes#ApacheDrillin10Minutes-StartDrill).
+
+## Installing Drill on Windows
+
+You can install Drill on Windows 7 or 8. To install Drill on Windows, you must
+have JDK 7, and you must set the `JAVA_HOME` path in the Windows Environment
+Variables. You must also have a decompression utility, such as
+[7-zip](http://www.7-zip.org/), installed on your machine to extract the Drill
+archive file that you download. These instructions assume that you use 7-zip.
+
+#### Setting JAVA_HOME
+
+Complete the following steps to set `JAVA_HOME`:
+
+  1. Navigate to `Control Panel\All Control Panel Items\System`, and select **Advanced System Settings**. The System Properties window appears.
+  2. On the Advanced tab, click **Environment Variables**. The Environment Variables window appears.
+  3. Add/Edit `JAVA_HOME` to point to the location where the JDK software is located.
+
+       **Example**
+       
+        C:\Program Files\Java\jdk1.7.0_65
+
+  4. Click **OK** to exit the windows.
+
+#### Installing Drill
+
+Complete the following steps to install Drill:
+
+  1. Create a `drill` directory on your `C:\` drive (or in some other location if you prefer).
+
+       **Example**
+       
+         C:\drill
+
+     Do not include spaces in your directory path. If you include spaces in the
+directory path, Drill fails to run.
+
+  2. Click the following link to download the latest, stable version of Apache Drill:
+  
+      [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+
+  3. Move the `apache-drill-<version>.tar.gz` file to the `drill` directory that you created on your `C:\` drive.
+
+  4. Unzip the `TAR.GZ` file and the resulting `TAR` file.  
+
+    1. Right-click `apache-drill-<version>.tar.gz`, and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>.tar` file.
+    2. Right-click `apache-drill-<version>.tar`, and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>` folder.
+  5. Open the `apache-drill-<version>` folder.
+
+  6. Open the `bin` folder, and double-click on the `sqlline.bat` file. The Windows command prompt opens.
+  7. At the `sqlline>` prompt, type `!connect jdbc:drill:zk=local` and then press `Enter`.
+  8. Enter the username and password.
+    a. When prompted, enter the user name `admin` and then press Enter. 
+    b. When prompted, enter the password `admin` and then press Enter. The cursor blinks for a few seconds and then `0: jdbc:drill:zk=local>` displays in the prompt.
+
+At this point, you can submit queries to Drill. Refer to the
+[Query Sample Data](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes#ApacheDrillin10Minutes-QuerySampleData)
+section of this document.
+
+# Start Drill
+
+Launch SQLLine, the Drill shell, to start and run Drill in embedded mode.
+Launching SQLLine automatically starts a new Drillbit within the shell. In a
+production environment, Drillbits are the daemon processes that run on each
+node in a Drill cluster.
+
+Complete the following steps to launch SQLLine and start Drill:
+
+  1. Verify that you are in the Drill installation directory.  
+Example: `~/apache-drill-<version>`
+
+  2. Issue the following command to launch SQLLine:
+
+        bin/sqlline -u jdbc:drill:zk=local
+
+     The `-u` option specifies the JDBC connection string that directs SQLLine to
+connect to Drill. This connection string also starts a local Drillbit. If you
+are connecting to an Apache Drill cluster, the value of `zk=` would be a list
+of ZooKeeper quorum nodes. For more information about how to run Drill in
+clustered mode, go to
+[Deploying Apache Drill in a Clustered Environment](/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment).
+
+When SQLLine starts, the system displays the following prompt:  
+`0: jdbc:drill:zk=local>`
+
+Issue the following command when you want to exit SQLLine:
+
+    !quit
+
+
+# Query Sample Data
+
+Your Drill installation includes a `sample-data` directory with JSON and
+Parquet files that you can query. The local file system on your machine is
+configured as the `dfs` storage plugin instance by default when you install
+Drill in embedded mode. For more information about storage plugin
+configuration, refer to
+[Storage Plugin Registration](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources#ConnectingtoDataSources-StoragePluginRegistration).
+
+Use SQL syntax to query the sample `JSON` and `Parquet` files in the
+`sample-data` directory on your local file system.
+
+### Querying a JSON File
+
+A sample JSON file, `employee.json`, contains fictitious employee data.
+
+To view the data in the `employee.json` file, submit the following SQL query
+to Drill:
+    
+    0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`;
+
+The query returns the following results:
+
+**Example of partial output**
+
+    +-------------+------------+------------+------------+-------------+-----------+
+    | employee_id | full_name  | first_name | last_name  | position_id | position_ |
+    +-------------+------------+------------+------------+-------------+-----------+
+    | 1101        | Steve Eurich | Steve      | Eurich         | 16          | Store T |
+    | 1102        | Mary Pierson | Mary       | Pierson    | 16          | Store T |
+    | 1103        | Leo Jones  | Leo        | Jones      | 16          | Store Tem |
+    | 1104        | Nancy Beatty | Nancy      | Beatty     | 16          | Store T |
+    | 1105        | Clara McNight | Clara      | McNight    | 16          | Store  |
+    | 1106        | Marcella Isaacs | Marcella   | Isaacs     | 17          | Stor |
+    | 1107        | Charlotte Yonce | Charlotte  | Yonce      | 17          | Stor |
+    | 1108        | Benjamin Foster | Benjamin   | Foster     | 17          | Stor |
+    | 1109        | John Reed  | John       | Reed       | 17          | Store Per |
+    | 1110        | Lynn Kwiatkowski | Lynn       | Kwiatkowski | 17          | St |
+    | 1111        | Donald Vann | Donald     | Vann       | 17          | Store Pe |
+    | 1112        | William Smith | William    | Smith      | 17          | Store  |
+    | 1113        | Amy Hensley | Amy        | Hensley    | 17          | Store Pe |
+    | 1114        | Judy Owens | Judy       | Owens      | 17          | Store Per |
+    | 1115        | Frederick Castillo | Frederick  | Castillo   | 17          | S |
+    | 1116        | Phil Munoz | Phil       | Munoz      | 17          | Store Per |
+    | 1117        | Lori Lightfoot | Lori       | Lightfoot  | 17          | Store |
+    +-------------+------------+------------+------------+-------------+-----------+
+    1,155 rows selected (0.762 seconds)
+    0: jdbc:drill:zk=local>
+
+### Querying a Parquet File
+
+Query the `region.parquet` and `nation.parquet` files in the `sample-data`
+directory on your local file system.
+
+#### Region File
+
+If you followed the Apache Drill in 10 Minutes instructions to install Drill
+in embedded mode, the path to the parquet file varies between operating
+systems.
+
+**Note:** When you enter the query, replace `<version>` in the path with the version of Drill that you are currently running.
+
+To view the data in the `region.parquet` file, issue the query appropriate for
+your operating system:
+
+* Linux  
+  ``SELECT * FROM dfs.`/opt/drill/apache-drill-<version>/sample-data/region.parquet`;``
+* Mac OS X  
+  ``SELECT * FROM dfs.`/Users/max/drill/apache-drill-<version>/sample-data/region.parquet`;``
+* Windows  
+  ``SELECT * FROM dfs.`C:\drill\apache-drill-<version>\sample-data\region.parquet`;``
+
+The query returns the following results:
+
+    +------------+------------+
+    |   EXPR$0   |   EXPR$1   |
+    +------------+------------+
+    | AFRICA     | lar deposits. blithely final packages cajole. regular waters ar |
+    | AMERICA    | hs use ironic, even requests. s |
+    | ASIA       | ges. thinly even pinto beans ca |
+    | EUROPE     | ly final courts cajole furiously final excuse |
+    | MIDDLE EAST | uickly special accounts cajole carefully blithely close reques |
+    +------------+------------+
+    5 rows selected (0.165 seconds)
+    0: jdbc:drill:zk=local>
+
+#### Nation File
+
+If you followed the Apache Drill in 10 Minutes instructions to install Drill
+in embedded mode, the path to the parquet file varies between operating
+systems.
+
+**Note:** When you enter the query, replace `<version>` in the path with the version of Drill that you are currently running.
+
+To view the data in the `nation.parquet` file, issue the query appropriate for
+your operating system:
+
+* Linux  
+  ``SELECT * FROM dfs.`/opt/drill/apache-drill-<version>/sample-data/nation.parquet`;``
+* Mac OS X  
+  ``SELECT * FROM dfs.`/Users/max/drill/apache-drill-<version>/sample-data/nation.parquet`;``
+* Windows  
+  ``SELECT * FROM dfs.`C:\drill\apache-drill-<version>\sample-data\nation.parquet`;``
+
+The query returns the following results:
+
+# Summary
+
+Now you know a bit about Apache Drill. To summarize, you have completed the
+following tasks:
+
+  * Learned that Apache Drill supports nested data, schema-less execution, and decentralized metadata.
+  * Downloaded and installed Apache Drill.
+  * Invoked SQLLine with Drill in embedded mode.
+  * Queried the sample JSON file, `employee.json`, to view its data.
+  * Queried the sample `region.parquet` file to view its data.
+  * Queried the sample `nation.parquet` file to view its data.
+
+# Next Steps
+
+Now that you have an idea about what Drill can do, you might want to:
+
+  * [Deploy Drill in a clustered environment.](https://cwiki.apache.org/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment)
+  * [Configure storage plugins to connect Drill to your data sources](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources).
+  * Query [Hive](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources#ConnectingtoDataSources-QueryingHiveTables) and [HBase](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources#ConnectingtoDataSources-QueryingHiveTables) data.
+  * [Query Complex Data](https://cwiki.apache.org/confluence/display/DRILL/Querying+Complex+Data)
+
+  * [Query Plain Text Files](https://cwiki.apache.org/confluence/display/DRILL/Querying+Plain+Text+Files)
+
+# More Information
+
+For more information about Apache Drill, go to [Apache Drill
+Wiki](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+Wiki).
\ No newline at end of file
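
The "Query Sample Data" section of `001-drill-in-10.md` above builds each Parquet query from the install directory plus `sample-data/<file>`. A minimal shell sketch of that path construction follows; `sample_query` is a hypothetical helper name, not something shipped with Drill, and the install directory assumes the Linux layout and the 0.7.0 version used in the document.

```shell
#!/bin/sh
# Hypothetical helper (not part of Drill): prints the sample-data query for a
# given install directory and Parquet file name.
sample_query() {
  install_dir=$1   # e.g. /opt/drill/apache-drill-0.7.0
  file_name=$2     # e.g. region.parquet
  # The dfs storage plugin name is followed by the file path in backticks.
  printf 'SELECT * FROM dfs.`%s/sample-data/%s`;\n' "$install_dir" "$file_name"
}

sample_query /opt/drill/apache-drill-0.7.0 region.parquet
sample_query /opt/drill/apache-drill-0.7.0 nation.parquet
```

The printed statements can be pasted at the `0: jdbc:drill:zk=local>` prompt. On Windows the path separators differ, so this sketch covers only the Linux and Mac OS X forms.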

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/002-deploy.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/002-deploy.md b/_docs/drill-docs/install/002-deploy.md
new file mode 100644
index 0000000..49ba68b
--- /dev/null
+++ b/_docs/drill-docs/install/002-deploy.md
@@ -0,0 +1,102 @@
+---
+title: "Deploying Apache Drill in a Clustered Environment"
+parent: "Install Drill"
+---
+# Overview
+
+To run Drill in a clustered environment, complete the following steps:
+
+  1. Install Drill on each designated node in the cluster.
+  2. Configure a cluster ID and add Zookeeper information.
+  3. Connect Drill to your data sources. 
+  4. Start Drill.
+
+### Prerequisites
+
+Before you install Apache Drill on nodes in your cluster, you must have the
+following software and services installed:
+
+  * [Oracle JDK version 7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html)
+  * Configured and running ZooKeeper quorum
+  * Configured and running Hadoop cluster (Recommended)
+  * DNS (Recommended)
+
+## Installing Drill
+
+Complete the following steps to install Drill on designated nodes:
+
+  1. Download the Drill tarball.
+  
+        curl http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+
+  2. Issue the following commands to create a Drill installation directory and then extract the tarball to the directory:
+  
+        mkdir /opt/drill
+        tar xzf apache-drill-<version>.tar.gz --strip-components=1 -C /opt/drill
+
+  3. If you are using external JAR files, edit `drill-env.sh`, located in `/opt/drill/conf/`, and define `HADOOP_HOME`:
+  
+        export HADOOP_HOME="$HOME/hadoop/hadoop-0.20.2/"
+
+  4. In `drill-override.conf`, create a unique Drill `cluster-id`, and provide ZooKeeper host names and port numbers to configure a connection to your ZooKeeper quorum.
+
+    a. Edit `drill-override.conf`, located in `/opt/drill/conf/`.
+
+    b. Provide a unique `cluster-id` and the ZooKeeper host names and port numbers in `zk.connect`. If you install Drill on multiple nodes, assign the same `cluster-id` to each Drill node so that all Drill nodes share the same ID. The default ZooKeeper port is 2181.
+
+**Example**
+
+    drill.exec: {
+      cluster-id: "<mydrillcluster>",
+      zk.connect: "<zkhostname1>:<port>,<zkhostname2>:<port>,<zkhostname3>:<port>",
+      debug.error_on_leak: false,
+      buffer.size: 6,
+      functions: ["org.apache.drill.expr.fn.impl", "org.apache.drill.udfs"]
+    }
+
+## Connecting Drill to Data Sources
+
+You can connect Drill to various types of data sources. Refer to
+[Connect Apache Drill to Data Sources](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources)
+to get configuration instructions for the particular type of data source that
+you want to connect to Drill.
+
+## Starting Drill
+
+Complete the following steps to start Drill:
+
+  1. Navigate to the Drill installation directory, and issue the following command to start a Drillbit:
+  
+        bin/drillbit.sh restart
+
+  2. Issue the following command to invoke SQLLine and start Drill:
+  
+        bin/sqlline -u jdbc:drill:
+
+     When connected, the Drill prompt appears.  
+     Example:
+     
+      `0: jdbc:drill:zk=<zk1host>:<port>>`
+
+     If you cannot connect to Drill, invoke SQLLine with the ZooKeeper quorum:
+
+     `bin/sqlline -u jdbc:drill:zk=<zk1host>:<port>,<zk2host>:<port>,<zk3host>:<port>` 
+
+  3. Issue the following query to Drill to verify that all Drillbits have joined the cluster:
+  
+        0: jdbc:drill:zk=<zk1host>:<port>> select * from sys.drillbits;
+
+Drill provides a list of Drillbits that have joined.
+
+**Example**
+
+    +----------------+---------------+---------------+---------------+
+    | host           | user_port     | control_port  | data_port     |
+    +----------------+---------------+---------------+---------------+
+    | <host address> | <port number> | <port number> | <port number> |
+    +----------------+---------------+---------------+---------------+
+
+Now you can query data with Drill. The Drill installation includes sample data
+that you can query. Refer to
+[Query Sample Data](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes#ApacheDrillin10Minutes-QuerySampleData).
\ No newline at end of file
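
The "Starting Drill" steps in `002-deploy.md` above pass SQLLine a `zk=` value made of comma-separated `host:port` pairs. A minimal sketch of assembling that connection string follows; `drill_jdbc_url` and `ZK_PORT` are hypothetical names chosen for illustration, the `example.com` host names are placeholders, and 2181 is the default ZooKeeper port stated in the document.

```shell
#!/bin/sh
# Hypothetical helper: joins ZooKeeper host names into a Drill JDBC URL.
# ZK_PORT is an assumed override variable; it defaults to 2181.
drill_jdbc_url() {
  port=${ZK_PORT:-2181}
  quorum=""
  for host in "$@"; do
    # Prefix a comma only when the list already has an entry.
    quorum="${quorum:+$quorum,}$host:$port"
  done
  printf 'jdbc:drill:zk=%s\n' "$quorum"
}

drill_jdbc_url zk1.example.com zk2.example.com zk3.example.com
```

The result can be passed to SQLLine as `bin/sqlline -u "$(drill_jdbc_url zk1.example.com zk2.example.com zk3.example.com)"`.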

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/003-install-embedded.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/003-install-embedded.md b/_docs/drill-docs/install/003-install-embedded.md
new file mode 100644
index 0000000..eb4fa2a
--- /dev/null
+++ b/_docs/drill-docs/install/003-install-embedded.md
@@ -0,0 +1,30 @@
+---
+title: "Installing Drill in Embedded Mode"
+parent: "Install Drill"
+---
+Installing Drill in embedded mode installs Drill locally on your machine.
+Embedded mode is a quick, easy way to install and try Drill without having to
+perform any configuration tasks. When you install Drill in embedded mode, the
+Drillbit service is installed locally and starts automatically when you invoke
+SQLLine, the Drill shell. You can install Drill in embedded mode on a machine
+running Linux, Mac OS X, or Windows.
+
+**Prerequisite:**
+
+You must have the following software installed on your machine to run Drill:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody><tr><td class="confluenceTd"><p><strong>Software</strong></p></td><td class="confluenceTd"><p><strong>Description</strong></p></td></tr><tr><td class="confluenceTd"><p><a class="external-link" href="http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html" rel="nofollow">Oracle JDK version 7</a></p></td><td class="confluenceTd"><p>A set of programming tools for developing Java applications.</p></td></tr></tbody></table></div>
+
+
+You can run the following command to verify that the system meets the software
+prerequisite:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody><tr><td class="confluenceTd"><p><strong>Command</strong></p></td><td class="confluenceTd"><p><strong>Example Output</strong></p></td></tr><tr><td class="confluenceTd"><p><code>java -version</code></p></td><td class="confluenceTd"><p><code>java version &quot;1.7.0_65&quot;</code><br /><code>Java(TM) SE Runtime Environment (build 1.7.0_65-b19)</code><br /><code>Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)</code></p></td></tr></tbody></table></div>
+
+Click on the installation link appropriate for your operating system:
+
+  * [Installing Drill on Linux](/confluence/display/DRILL/Installing+Drill+on+Linux)
+  * [Installing Drill on Mac OS X](/confluence/display/DRILL/Installing+Drill+on+Mac+OS+X)
+  * [Installing Drill on Windows](/confluence/display/DRILL/Installing+Drill+on+Windows)
\ No newline at end of file
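
The prerequisite check in `003-install-embedded.md` above relies on reading the version out of the `java -version` banner. A minimal sketch of that check in script form follows; `java_major_minor` is a hypothetical helper name, and the sample banner line is the one quoted in the document.

```shell
#!/bin/sh
# Hypothetical helper: extracts "major.minor" from the first line printed by
# `java -version`, for example: java version "1.7.0_65"
java_major_minor() {
  printf '%s\n' "$1" | sed -n 's/.*version "\([0-9]*\.[0-9]*\).*/\1/p'
}

# Uses the sample output from the document, so no JVM is needed to try it.
java_major_minor 'java version "1.7.0_65"'

# Against an installed JVM (java -version writes to stderr, hence 2>&1):
# java_major_minor "$(java -version 2>&1 | head -n 1)"
```

A result of `1.7` corresponds to the JDK 7 prerequisite.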

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/004-install-distributed.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/004-install-distributed.md b/_docs/drill-docs/install/004-install-distributed.md
new file mode 100644
index 0000000..3d993e7
--- /dev/null
+++ b/_docs/drill-docs/install/004-install-distributed.md
@@ -0,0 +1,61 @@
+---
+title: "Installing Drill in Distributed Mode"
+parent: "Install Drill"
+---
+You can install Apache Drill in distributed mode on one or multiple nodes to
+run it in a clustered environment.
+
+To install Apache Drill in distributed mode, complete the following steps:
+
+  1. Install Drill on each designated node in the cluster.
+  2. Configure a cluster ID and add Zookeeper information.
+  3. Connect Drill to your data sources. 
+  4. Start Drill.
+
+**Prerequisites**
+
+Before you install Apache Drill on nodes in your cluster, you must have the
+following software and services installed:
+
+  * [Oracle JDK version 7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html)
+  * Configured and running ZooKeeper quorum
+  * Configured and running Hadoop cluster (Recommended)
+  * DNS (Recommended)
+
+## Installing Drill
+
+Complete the following steps to install Drill on designated nodes:
+
+  1. Download the Drill tarball.
+  
+        curl http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+
+  2. Issue the following commands to create a Drill installation directory and then extract the tarball to the directory:
+  
+        mkdir /opt/drill
+        tar xzf apache-drill-<version>.tar.gz --strip-components=1 -C /opt/drill
+
+  3. If you are using external JAR files, edit `drill-env.sh`, located in `/opt/drill/conf/`, and define `HADOOP_HOME`:
+  
+        export HADOOP_HOME="$HOME/hadoop/hadoop-0.20.2/"
+
+  4. In `drill-override.conf`, create a unique Drill `cluster-id`, and provide ZooKeeper host names and port numbers to configure a connection to your ZooKeeper quorum.
+
+    a. Edit `drill-override.conf`, located in `/opt/drill/conf/`.
+
+    b. Provide a unique `cluster-id` and the ZooKeeper host names and port numbers in `zk.connect`. If you install Drill on multiple nodes, assign the same `cluster-id` to each Drill node so that all Drill nodes share the same ID. The default ZooKeeper port is 2181.
+
+       **Example**
+       
+         drill.exec: {
+           cluster-id: "<mydrillcluster>",
+           zk.connect: "<zkhostname1>:<port>,<zkhostname2>:<port>,<zkhostname3>:<port>",
+           debug.error_on_leak: false,
+           buffer.size: 6,
+           functions: ["org.apache.drill.expr.fn.impl", "org.apache.drill.udfs"]
+         }
+
+You can connect Drill to various types of data sources. Refer to
+[Connect Apache Drill to Data Sources](https://cwiki.apache.org/confluence/display/DRILL/Connecting+to+Data+Sources)
+to get configuration instructions for the particular type of data source that
+you want to connect to Drill.
\ No newline at end of file
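
Step 2 of "Installing Drill" in `004-install-distributed.md` above extracts the tarball with its top-level folder stripped, which is why `conf/` ends up directly under the installation directory. The sketch below demonstrates that effect on a stand-in archive in a temporary directory; the archive content is a placeholder, not the real Drill distribution, and `--strip-components=1` is the full spelling of the option.

```shell
#!/bin/sh
work=$(mktemp -d)

# Build a stand-in archive with the same top-level layout as the Drill
# tarball (apache-drill-<version>/conf/...). The file content is a placeholder.
mkdir -p "$work/src/apache-drill-0.7.0/conf"
echo "placeholder" > "$work/src/apache-drill-0.7.0/conf/drill-override.conf"
tar -czf "$work/drill.tar.gz" -C "$work/src" apache-drill-0.7.0

# Extract the way step 2 does: --strip-components=1 drops the top-level
# apache-drill-0.7.0 folder, so conf/ lands directly in the target directory.
mkdir -p "$work/opt/drill"
tar -xzf "$work/drill.tar.gz" --strip-components=1 -C "$work/opt/drill"

ls "$work/opt/drill/conf"
```

Without the strip option, the same files would sit one level deeper, under `apache-drill-0.7.0/conf/`, as in the embedded-mode instructions.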

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/install-embedded/001-install-linux.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/install-embedded/001-install-linux.md b/_docs/drill-docs/install/install-embedded/001-install-linux.md
new file mode 100644
index 0000000..4dbc3c0
--- /dev/null
+++ b/_docs/drill-docs/install/install-embedded/001-install-linux.md
@@ -0,0 +1,30 @@
+---
+title: "Installing Drill on Linux"
+parent: "Installing Drill in Embedded Mode"
+---
+Complete the following steps to install Apache Drill on a machine running
+Linux:
+
+  1. Issue the following command to download the latest, stable version of Apache Drill to a directory on your machine:
+    
+        wget http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+         
+
+  2. Issue the following command to create a new directory to which you can extract the contents of the Drill `tar.gz` file:
+  
+        sudo mkdir -p /opt/drill
+
+  3. Navigate to the directory where you downloaded the Drill `tar.gz` file.  
+  
+
+  4. Issue the following command to extract the contents of the Drill `tar.gz` file to the directory you created:
+  
+        sudo tar -xvzf apache-drill-<version>.tar.gz -C /opt/drill
+
+  5. Issue the following command to navigate to the Drill installation directory:
+
+        cd /opt/drill/apache-drill-<version>
+        
+At this point, you can
+[invoke SQLLine](/confluence/pages/viewpage.action?pageId=44994063#Starting/StoppingDrill-invokeSQLLine)
+to run Drill.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/install-embedded/002-install-mac.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/install-embedded/002-install-mac.md b/_docs/drill-docs/install/install-embedded/002-install-mac.md
new file mode 100644
index 0000000..f20a01d
--- /dev/null
+++ b/_docs/drill-docs/install/install-embedded/002-install-mac.md
@@ -0,0 +1,33 @@
+---
+title: "Installing Drill on Mac OS X"
+parent: "Installing Drill in Embedded Mode"
+---
+Complete the following steps to install Apache Drill on a machine running Mac
+OS X:
+
+  1. Open a Terminal window, and create a `drill` directory inside your home directory (or in some other location if you prefer).
+
+     **Example**
+
+        $ pwd
+        /Users/max
+        $ mkdir drill
+        $ cd drill
+        $ pwd
+        /Users/max/drill
+
+  2. Click the following link to download the latest, stable version of Apache Drill:
+  
+     [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+
+  3. Open the downloaded `TAR` file with the Mac Archive utility or a similar tool for unzipping files.
+
+  4. Move the resulting `apache-drill-<version>` folder into the `drill` directory that you created.
+
+  5. Issue the following command to navigate to the `apache-drill-<version>` directory:
+  
+        cd /Users/max/drill/apache-drill-<version>
+
+At this point, you can
+[invoke SQLLine](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=44994063#Starting/StoppingDrill-invokeSQLLine)
+to run Drill.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/install/install-embedded/003-install-win.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/install/install-embedded/003-install-win.md b/_docs/drill-docs/install/install-embedded/003-install-win.md
new file mode 100644
index 0000000..285b584
--- /dev/null
+++ b/_docs/drill-docs/install/install-embedded/003-install-win.md
@@ -0,0 +1,57 @@
+---
+title: "Installing Drill on Windows"
+parent: "Installing Drill in Embedded Mode"
+---
+You can install Drill on Windows 7 or 8. To install Drill on Windows, you must
+have JDK 7, and you must set the `JAVA_HOME` path in the Windows Environment
+Variables. You must also have a decompression utility, such as
+[7-zip](http://www.7-zip.org/), installed on your machine to extract the Drill
+archive file that you download. These instructions assume that you use 7-zip.
+
+#### Setting JAVA_HOME
+
+Complete the following steps to set `JAVA_HOME`:
+
+  1. Navigate to `Control Panel\All Control Panel Items\System`, and select **Advanced System Settings**. The System Properties window appears.
+  2. On the Advanced tab, click **Environment Variables**. The Environment Variables window appears.
+  3. Add/Edit `JAVA_HOME` to point to the location where the JDK software is located.
+
+     **Example**
+     
+        C:\Program Files\Java\jdk1.7.0_65
+
+  4. Click **OK** to exit the windows.
+
+#### Installing Drill
+
+Complete the following steps to install Drill:
+
+  1. Create a `drill` directory on your `C:\` drive (or in some other location if you prefer).
+
+     **Example**
+
+         C:\drill
+
+     Do not include spaces in your directory path. If you include spaces in the
+directory path, Drill fails to run.
+
+  2. Click the following link to download the latest, stable version of Apache Drill:
+  
+     [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+
+  3. Move the `apache-drill-<version>.tar.gz` file to the `drill` directory that you created on your `C:\` drive.
+
+  4. Unzip the `TAR.GZ` file and the resulting `TAR` file.  
+
+    a. Right-click `apache-drill-<version>.tar.gz`, and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>.tar` file.
+    b. Right-click `apache-drill-<version>.tar`, and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>` folder.
+  5. Open the `apache-drill-<version>` folder.
+
+  6. Open the `bin` folder, and double-click on the `sqlline.bat` file. The Windows command prompt opens.
+  7. At the `sqlline>` prompt, type `!connect jdbc:drill:zk=local` and then press `Enter`.
+  8. Enter the username and password.
+    a. When prompted, enter the user name `admin` and then press Enter. 
+    b. When prompted, enter the password `admin` and then press Enter. The cursor blinks for a few seconds and then `0: jdbc:drill:zk=local>` displays in the prompt.
+
+At this point, you can submit queries to Drill. Refer to the
+[Query Sample Data](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes#ApacheDrillin10Minutes-QuerySampleData)
+section of this document.
\ No newline at end of file
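
The two prerequisites in `003-install-win.md` above (a set `JAVA_HOME` and an install path without spaces) can be checked before launching `sqlline.bat`. The following is a minimal Python sketch, not part of Drill; the helper name `preflight_problems` is hypothetical, and it only mirrors the two requirements stated in the instructions.

```python
import os


def preflight_problems(install_dir, env=None):
    """Return a list of problems that would stop Drill from starting.

    Hypothetical helper: it checks only the two requirements stated in the
    install instructions (JAVA_HOME is set; the Drill path has no spaces).
    """
    env = os.environ if env is None else env
    problems = []
    if not env.get("JAVA_HOME"):
        problems.append("JAVA_HOME is not set")
    if " " in install_dir:
        problems.append("install path contains spaces: " + install_dir)
    return problems


if __name__ == "__main__":
    # C:\drill is the example location used in the instructions.
    for problem in preflight_problems(r"C:\drill"):
        print(problem)
```

Note that only the Drill directory is checked for spaces; the `JAVA_HOME` example in the instructions (`C:\Program Files\Java\jdk1.7.0_65`) itself contains a space.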

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/001-conf.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/001-conf.md b/_docs/drill-docs/manage/001-conf.md
new file mode 100644
index 0000000..5b68b40
--- /dev/null
+++ b/_docs/drill-docs/manage/001-conf.md
@@ -0,0 +1,20 @@
+---
+title: "Configuration Options"
+parent: "Manage Drill"
+---
+Drill provides several configuration options that you can enable, disable, or
+modify. Modifying certain configuration options can impact Drill’s
+performance. Many of Drill's configuration options reside in the
+`drill-env.sh` and `drill-override.conf` files. Drill stores these files in the
+`/conf` directory. Drill sources `/etc/drill/conf` if it exists. Otherwise,
+Drill sources the local `<drill_installation_directory>/conf` directory.
+
+Refer to the following documentation for information about configuration
+options that you can modify:
+
+  * [Memory Allocation](/confluence/display/DRILL/Memory+Allocation)
+  * [Start-Up Options](/confluence/display/DRILL/Start-Up+Options)
+  * [Planning and Execution Options](/confluence/display/DRILL/Planning+and+Execution+Options)
+  * [Persistent Configuration Storage](/confluence/display/DRILL/Persistent+Configuration+Storage)
+  
+
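
The lookup order described in `001-conf.md` above (`/etc/drill/conf` if it exists, otherwise the `conf` directory under the installation) can be sketched in a few lines of Python. This is an illustration of the stated precedence only; the function name `resolve_conf_dir` is hypothetical and is not how Drill's own scripts are written.

```python
import os


def resolve_conf_dir(install_dir, site_conf="/etc/drill/conf"):
    """Pick the configuration directory using the documented precedence."""
    # The site-wide directory wins when it exists.
    if os.path.isdir(site_conf):
        return site_conf
    # Otherwise fall back to <drill_installation_directory>/conf.
    return os.path.join(install_dir, "conf")
```

Running `resolve_conf_dir("/opt/drill")` on a node shows which copy of `drill-override.conf` is likely to be in effect; `/opt/drill` is only a placeholder path.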

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/002-start-stop.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/002-start-stop.md b/_docs/drill-docs/manage/002-start-stop.md
new file mode 100644
index 0000000..c533bd3
--- /dev/null
+++ b/_docs/drill-docs/manage/002-start-stop.md
@@ -0,0 +1,45 @@
+---
+title: "Starting/Stopping Drill"
+parent: "Manage Drill"
+---
+How you start Drill depends on the installation method you followed. If you
+installed Drill in embedded mode, invoking SQLLine automatically starts a
+Drillbit locally. If you installed Drill in distributed mode on one or
+multiple nodes in a cluster, you must start the Drillbit service and then
+invoke SQLLine. Once SQLLine starts, you can issue queries to Drill.
+
+### Starting a Drillbit
+
+If you installed Drill in embedded mode, you do not need to start the
+Drillbit.
+
+To start a Drillbit, navigate to the Drill installation directory, and issue
+the following command:
+
+`bin/drillbit.sh restart`
+
+### Invoking SQLLine/Connecting to a Schema
+
+SQLLine is used as the Drill shell. SQLLine connects to relational databases
+and executes SQL commands. You invoke SQLLine for Drill in embedded or
+distributed mode. If you want to connect directly to a particular schema, you
+can indicate the schema name when you invoke SQLLine.
+
+To start SQLLine, issue the appropriate command for your Drill installation
+type:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody>
+<tr><td class="confluenceTd"><p><strong>Drill Install Type</strong></p></td><td class="confluenceTd"><p><strong>Example</strong></p></td><td class="confluenceTd"><p><strong>Command</strong></p></td></tr>
+<tr><td class="confluenceTd"><p>Embedded</p></td>
+<td class="confluenceTd"><p>Drill installed locally (embedded mode);</p><p>Hive with embedded metastore</p></td>
+<td class="confluenceTd"><p>To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:</p><p><code>$ bin/sqlline -u jdbc:drill:zk=local -n admin -p admin</code></p><p>Once you are in the prompt, you can issue <code>USE &lt;schema&gt;</code> or you can use absolute notation: <code>schema.table.column</code>.</p><p>To connect to a schema directly, issue the command with the schema name:</p><p><code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=local -n admin -p admin</code></p></td></tr>
+<tr><td class="confluenceTd"><p>Distributed</p></td>
+<td class="confluenceTd"><p>Drill installed in distributed mode;</p><p>Hive with remote metastore;</p><p>HBase</p></td>
+<td class="confluenceTd"><p>To connect without specifying a schema, navigate to the Drill installation directory and issue the following command:</p><p><code>$ bin/sqlline -u jdbc:drill:zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code></p><p>Once you are in the prompt, you can issue <code>USE &lt;schema&gt;</code> or you can use absolute notation: <code>schema.table.column</code>.</p><p>To connect to a schema directly, issue the command with the schema name:</p><p><code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code></p></td></tr>
+</tbody></table></div>
+  
+When SQLLine starts, the system displays the following prompt:
+
+`0: jdbc:drill:schema=<database>;zk=<zkhost>:<port>>`
+
+At this point, you can use Drill to query your data source or you can discover
+metadata.
+
+### Exiting SQLLine
+
+To exit SQLLine, issue the following command:
+
+`!quit`  
+
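
The connection strings in `002-start-stop.md` above all follow one pattern: an optional `schema=` part, then `zk=` with either `local` or a comma-separated ZooKeeper quorum. As a rough Python sketch (the helper names `drill_jdbc_url` and `sqlline_command` are hypothetical, and the host names in the usage note are placeholders), the URL and the SQLLine argument list could be assembled like this:

```python
def drill_jdbc_url(zk_hosts=None, schema=None):
    """Build a Drill JDBC URL in the form shown in the table.

    zk_hosts: list of "host:port" strings for the ZooKeeper quorum, or None
              for embedded mode (zk=local).
    schema:   optional schema name to connect to directly.
    """
    zk = ",".join(zk_hosts) if zk_hosts else "local"
    parts = []
    if schema:
        parts.append("schema=" + schema)
    parts.append("zk=" + zk)
    return "jdbc:drill:" + ";".join(parts)


def sqlline_command(url, user="admin", password="admin"):
    """Return the SQLLine invocation as an argument list.

    Passing the URL as a single list element means no shell ever
    interprets the ';' between the schema and zk parts.
    """
    return ["bin/sqlline", "-u", url, "-n", user, "-p", password]
```

For example, `drill_jdbc_url(["zk1:2181", "zk2:2181"], schema="hive")` yields `jdbc:drill:schema=hive;zk=zk1:2181,zk2:2181`. When typing the schema form directly into a POSIX shell, quote the URL, because an unquoted `;` ends the command.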

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/003-ports.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/003-ports.md b/_docs/drill-docs/manage/003-ports.md
new file mode 100644
index 0000000..70539de
--- /dev/null
+++ b/_docs/drill-docs/manage/003-ports.md
@@ -0,0 +1,9 @@
+---
+title: "Ports Used by Drill"
+parent: "Manage Drill"
+---
+The following table provides a list of the ports that Drill uses, the port
+type, and a description of how Drill uses the port:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody>
+<tr><th class="confluenceTh">Port</th><th colspan="1" class="confluenceTh">Type</th><th class="confluenceTh">Description</th></tr>
+<tr><td valign="top" class="confluenceTd">8047</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">Needed for the Drill Web UI.</td></tr>
+<tr><td valign="top" class="confluenceTd">31010</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the cluster nodes. Also needed for the Drill Web UI.</td></tr>
+<tr><td valign="top" class="confluenceTd">31011</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr>
+<tr><td valign="top" colspan="1" class="confluenceTd">31012</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" colspan="1" class="confluenceTd">Data port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr>
+<tr><td valign="top" colspan="1" class="confluenceTd">46655</td><td valign="top" colspan="1" class="confluenceTd">UDP</td><td valign="top" colspan="1" class="confluenceTd">Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.</td></tr>
+</tbody></table></div>
+
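
To confirm that the TCP ports listed in `003-ports.md` above are reachable from a client machine, a plain socket probe is usually enough. The following Python sketch is an assumption-laden illustration, not a Drill tool: the function names are hypothetical, and the UDP port 46655 is left out because a TCP connect cannot test it.

```python
import socket

# TCP ports from the table; 46655 is UDP and cannot be probed this way.
DRILL_TCP_PORTS = {8047: "Web UI", 31010: "user", 31011: "control", 31012: "data"}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_drill_ports(host):
    """Map each Drill TCP port to whether it accepted a connection."""
    return {port: tcp_port_open(host, port) for port in DRILL_TCP_PORTS}
```

Calling `check_drill_ports("<drill_node_ip_address>")` with a real node address returns a dictionary keyed by port number; a `False` entry points at a stopped Drillbit or a firewall rule.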

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/004-partition-prune.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/004-partition-prune.md b/_docs/drill-docs/manage/004-partition-prune.md
new file mode 100644
index 0000000..fa81034
--- /dev/null
+++ b/_docs/drill-docs/manage/004-partition-prune.md
@@ -0,0 +1,75 @@
+---
+title: "Partition Pruning"
+parent: "Manage Drill"
+---
+Partition pruning is a performance optimization that limits the number of
+files and partitions that Drill reads when querying file systems and Hive
+tables. Drill only reads a subset of the files that reside in a file system or
+a subset of the partitions in a Hive table when a query matches certain filter
+criteria.
+
+For Drill to apply partition pruning to Hive tables, you must have created the
+tables in Hive using the `PARTITIONED BY` clause:
+
+`CREATE TABLE <table_name> (<column_name> <data_type>) PARTITIONED BY (<partition_column_name> <data_type>);`
+
+When you create Hive tables using the `PARTITIONED BY` clause, each partition of
+data is automatically split out into different directories as data is written
+to disk. For more information about Hive partitioning, refer to the [Apache
+Hive wiki](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL/#LanguageManualDDL-PartitionedTables).
+
+Typically, table data in a file system is organized by directories and
+subdirectories. Queries on table data may contain `WHERE` clause filters on
+specific directories.
+
+Drill’s query planner evaluates the filters as part of a Filter operator. If
+no partition filters are present, the underlying Scan operator reads all files
+in all directories and then sends the data to operators downstream, such as
+Filter.
+
+When partition filters are present, the query planner determines if it can
+push the filters down to the Scan such that the Scan only reads the
+directories that match the partition filters, thus reducing disk I/O.
+
+## Partition Pruning Example
+
+The `/Users/max/data/logs` directory in a file system contains subdirectories
+that span a few years.
+
+The following image shows the hierarchical structure of the `…/logs` directory
+and its subdirectories:
+
+![](../../img/54.png)
+
+The following query requests log file data for 2013 from the `…/logs`
+directory in the file system:
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+If you run the `EXPLAIN PLAN` command for the query, you can see that the
+`…/logs` directory is filtered by the scan operator.
+
+    EXPLAIN PLAN FOR SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+The following image shows a portion of the physical plan when partition
+pruning is applied:
+
+![](../../img/21.png)
+
+## Filter Examples
+
+The following queries include examples of the types of filters eligible for
+partition pruning optimization:
+
+**Example 1: Partition filters ANDed together**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE dir0 = '2014' AND dir1 = '1'
+
+**Example 2: Partition filter ANDed with regular column filter**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 AND dir0 = 2013 limit 2;
+
+**Example 3: Combination of AND, OR involving partition filters**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE (dir0 = '2013' AND dir1 = '1') OR (dir0 = '2014' AND dir1 = '2')
+
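
The effect of pushing `dir0`/`dir1` filters into the Scan, as described in `004-partition-prune.md` above, can be modeled as choosing which files to read before any data is touched. The Python sketch below is a toy model under stated assumptions: it is not Drill's planner, the file names are invented, and it handles only equality filters that are ANDed together.

```python
# Hypothetical layout under the example directory; file names are invented.
ROOT = "/Users/max/data/logs"
FILES = [
    ROOT + "/2013/1/a.csv",
    ROOT + "/2013/2/b.csv",
    ROOT + "/2014/1/c.csv",
]


def prune(files, root, **dir_filters):
    """Keep only files whose directory levels match every dirN filter.

    dir_filters: e.g. dir0="2013", dir1="1"; dirN is the N-th directory
                 level below root, as in the example queries.
    """
    kept = []
    for path in files:
        # Directory components between root and the file name.
        levels = path[len(root):].strip("/").split("/")[:-1]
        matches = True
        for name, wanted in dir_filters.items():
            n = int(name[3:])  # "dir0" -> 0, "dir1" -> 1
            if n >= len(levels) or levels[n] != wanted:
                matches = False
                break
        if matches:
            kept.append(path)
    return kept
```

With the sample layout, `prune(FILES, ROOT, dir0="2013")` keeps the two files under `2013` and never opens the `2014` directory, which is where the disk I/O saving comes from. A filter on a regular column such as `cust_id` cannot be handled this way and still has to be evaluated on the rows that are read.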

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/005-monitor-cancel.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/005-monitor-cancel.md b/_docs/drill-docs/manage/005-monitor-cancel.md
new file mode 100644
index 0000000..6888eea
--- /dev/null
+++ b/_docs/drill-docs/manage/005-monitor-cancel.md
@@ -0,0 +1,30 @@
+---
+title: "Monitoring and Canceling Queries in the Drill Web UI"
+parent: "Manage Drill"
+---
+You can monitor and cancel queries from the Drill Web UI. To access the Drill
+Web UI, the Drillbit process must be running on the Drill node that you use to
+access the Drill Web UI.
+
+To monitor or cancel a query from the Drill Web UI, complete the following
+steps:
+
+  1. Navigate to the Drill Web UI at `<drill_node_ip_address>:8047`.  
+When you access the Drill Web UI, you see some general information about Drill
+running in your cluster, such as the nodes running the Drillbit process, the
+various ports Drill is using, and the amount of direct memory assigned to
+Drill.  
+![](../../img/7.png)
+
+  2. Select **Profiles** in the toolbar. A list of running and completed queries appears. Drill assigns a query ID to each query and lists the Foreman node. The Foreman is the Drillbit node that receives the query from the client or application. The Foreman drives the entire query.  
+![](../../img/51.png)
+
+  3. Click the **Query ID** for the query that you want to monitor or cancel. The Query and Planning window appears.  
+![](../../img/4.png)
+
+  4. Select **Edit Query**.
+  5. Click **Cancel query** to cancel the query. The following message appears:  
+![](../../img/46.png)
+
+  6. Optionally, you can re-run the query to see a query summary in this window.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/conf/001-mem-alloc.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/001-mem-alloc.md b/_docs/drill-docs/manage/conf/001-mem-alloc.md
new file mode 100644
index 0000000..4508935
--- /dev/null
+++ b/_docs/drill-docs/manage/conf/001-mem-alloc.md
@@ -0,0 +1,31 @@
+---
+title: "Memory Allocation"
+parent: "Configuration Options"
+---
+You can configure the amount of direct memory allocated to a Drillbit for
+query processing. The default limit is 8G, but Drill prefers 16G or more
+depending on the workload. The total amount of direct memory that a Drillbit
+allocates to query operations cannot exceed the limit set.
+
+Drill mainly uses Java direct memory and performs well when executing
+operations in memory instead of storing the operations on disk. Drill does not
+write to disk unless absolutely necessary, unlike MapReduce where everything
+is written to disk during each phase of a job.
+
+The JVM’s heap memory does not limit the amount of direct memory available in
+a Drillbit. The on-heap memory for Drill is only about 4-8G, which should
+suffice because Drill avoids having data sit in heap memory.
+
+#### Modifying Drillbit Memory
+
+You can modify memory for each Drillbit node in your cluster. To modify the
+memory for a Drillbit, edit the `-XX:MaxDirectMemorySize` parameter in the
+Drillbit startup script located in
+`<drill_installation_directory>/conf/drill-env.sh`.
+
+**Note:** If this parameter is not set, the limit depends on the amount of available system memory.
+
+After you edit `<drill_installation_directory>/conf/drill-env.sh`,
+[restart the Drillbit](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=44994063)
+on the node.
\ No newline at end of file
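
To check which direct memory limit a Drillbit was given, as discussed in `001-mem-alloc.md` above, you can look for the JVM flag `-XX:MaxDirectMemorySize` in the Java options string. The Python sketch below assumes the options are available as one literal string; the function name is hypothetical, and if `drill-env.sh` builds the flag from shell variables you would need to inspect the expanded value (for example, the running JVM's command line) instead.

```python
import re

# Size suffixes accepted here: none (bytes), k, m, g (case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def max_direct_memory_bytes(java_opts):
    """Return the -XX:MaxDirectMemorySize value in bytes, or None if absent."""
    match = re.search(r"-XX:MaxDirectMemorySize=(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        # Flag not set: the limit then depends on available system memory.
        return None
    return int(match.group(1)) * _UNITS[match.group(2).lower()]
```

For the default limit mentioned in the text, `max_direct_memory_bytes("-XX:MaxDirectMemorySize=8G")` returns `8 * 1024 ** 3`.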

http://git-wip-us.apache.org/repos/asf/drill/blob/84b7b36d/_docs/drill-docs/manage/conf/002-startup-opt.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/manage/conf/002-startup-opt.md b/_docs/drill-docs/manage/conf/002-startup-opt.md
new file mode 100644
index 0000000..923139f
--- /dev/null
+++ b/_docs/drill-docs/manage/conf/002-startup-opt.md
@@ -0,0 +1,50 @@
+---
+title: "Start-Up Options"
+parent: "Configuration Options"
+---
+Drill’s start-up options are stored in the HOCON configuration file format, which is
+a hybrid between a properties file and a JSON file. Drill start-up options
+consist of a group of files with a nested relationship. At the core of the
+file hierarchy is `drill-default.conf`. This file is overridden by one or more
+`drill-module.conf` files, which are overridden by the `drill-override.conf`
+file that you define.
+
+You can see the following group of files throughout the source repository in
+Drill:
+
+	common/src/main/resources/drill-default.conf
+	common/src/main/resources/drill-module.conf
+	contrib/storage-hbase/src/main/resources/drill-module.conf
+	contrib/storage-hive/core/src/main/resources/drill-module.conf
+	contrib/storage-hive/hive-exec-shade/src/main/resources/drill-module.conf
+	exec/java-exec/src/main/resources/drill-module.conf
+	distribution/src/resources/drill-override.conf
+
+These files are listed inside the associated JAR files in the Drill
+distribution tarball.
+
+Each Drill module has a set of options that Drill incorporates. Drill’s
+modular design enables you to create new storage plugins, set new operators,
+or create UDFs. You can also include additional configuration options that you
+can override as necessary.
+
+When you add a JAR file to Drill, you must include a `drill-module.conf` file
+in the root directory of the JAR file that you add. The `drill-module.conf`
+file tells Drill to scan that JAR file or associated object and include it.
+
+#### Viewing Startup Options
+
+You can run the following query to see a list of Drill’s startup options:
+
+    SELECT * FROM sys.options WHERE type='BOOT'
+
+#### Configuring Start-Up Options
+
+You can configure start-up options for each Drillbit in the
+`drill-override.conf` file located in Drill’s `/conf` directory.
+
+You may want to configure the following start-up options that control certain
+behaviors in Drill:
+
+<div class="table-wrap"><table class="confluenceTable"><tbody>
+<tr><th class="confluenceTh">Option</th><th class="confluenceTh">Default Value</th><th class="confluenceTh">Description</th></tr>
+<tr><td valign="top" class="confluenceTd"><p>drill.exec.sys.store.provider</p></td><td valign="top" class="confluenceTd"><p>ZooKeeper</p></td><td valign="top" class="confluenceTd"><p>Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="https://cwiki.apache.org/confluence/display/DRILL/Persistent+Configuration+Storage" rel="nofollow">Persistent Configuration Storage</a>.</p></td></tr>
+<tr><td valign="top" class="confluenceTd"><p>drill.exec.buffer.size</p></td><td valign="top" class="confluenceTd"><p> </p></td><td valign="top" class="confluenceTd"><p>Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</p></td></tr>
+<tr><td valign="top" class="confluenceTd"><p>drill.exec.sort.external.directories</p><p>drill.exec.sort.external.fs</p></td><td valign="top" class="confluenceTd"><p> </p></td><td valign="top" class="confluenceTd"><p>These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files.</p><p>Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system.</p><p>For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</p></td></tr>
+<tr><td valign="top" colspan="1" class="confluenceTd"><p>drill.exec.debug.error_on_leak</p></td><td valign="top" colspan="1" class="confluenceTd"><p>True</p></td><td valign="top" colspan="1" class="confluenceTd"><p>Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</p></td></tr>
+<tr><td valign="top" colspan="1" class="confluenceTd"><p>drill.exec.zk.connect</p></td><td valign="top" colspan="1" class="confluenceTd"><p>localhost:2181</p></td><td valign="top" colspan="1" class="confluenceTd"><p>Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</p></td></tr>
+<tr><td valign="top" colspan="1" class="confluenceTd"><p>drill.exec.cluster-id</p></td><td valign="top" colspan="1" class="confluenceTd"><p>my_drillbit_cluster</p></td><td valign="top" colspan="1" class="confluenceTd"><p>Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</p></td></tr>
+</tbody></table></div>
+
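
The override chain described in `002-startup-opt.md` above (`drill-default.conf`, then each `drill-module.conf`, then `drill-override.conf`) behaves like a layered merge in which later files win. The Python sketch below models that with nested dictionaries. It is a simplified illustration, not a HOCON parser, the function names are hypothetical, and real HOCON has features (such as substitutions) that are ignored here.

```python
def merge(base, override):
    """Recursively merge override into a copy of base; override wins."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


def effective_options(default, modules, override):
    """Apply drill-default, then each drill-module, then drill-override."""
    merged = default
    for module in modules:
        merged = merge(merged, module)
    return merge(merged, override)


if __name__ == "__main__":
    # Defaults taken from the options table; "cluster_a" is a made-up value.
    default = {"drill": {"exec": {"cluster-id": "my_drillbit_cluster",
                                  "zk": {"connect": "localhost:2181"}}}}
    override = {"drill": {"exec": {"cluster-id": "cluster_a"}}}
    print(effective_options(default, [], override))
```

In this model, setting only `cluster-id` in the override layer leaves `zk.connect` at its default, which matches the intent of keeping `drill-override.conf` small.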

