drill-commits mailing list archives

From bridg...@apache.org
Subject [09/11] drill git commit: reorg
Date Wed, 06 May 2015 22:46:33 GMT
http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/110-manage-drill.md
----------------------------------------------------------------------
diff --git a/_docs/110-manage-drill.md b/_docs/110-manage-drill.md
deleted file mode 100644
index 2b0265c..0000000
--- a/_docs/110-manage-drill.md
+++ /dev/null
@@ -1,6 +0,0 @@
----
-title: "Manage Drill"
----
-
-  
-

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/010-configure-drill-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/010-configure-drill-introduction.md b/_docs/configure-drill/010-configure-drill-introduction.md
new file mode 100644
index 0000000..6efe513
--- /dev/null
+++ b/_docs/configure-drill/010-configure-drill-introduction.md
@@ -0,0 +1,7 @@
+---
+title: "Configure Drill Introduction"
+parent: "Configure Drill"
+---
+When using Drill, you need to make sufficient memory available to Drill and other workloads running on the cluster. You might want to modify options for performance or functionality. For example, the default storage format for CTAS
+statements is Parquet. Using a configuration option, you can modify the default setting so that output data
+is stored in CSV or JSON format. This section covers the many options you can configure and how to configure memory resources for Drill running alongside other workloads. This section also lists the ports used by Drill.

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/020-configuring-drill-in-a-dedicated-cluster.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/020-configuring-drill-in-a-dedicated-cluster.md b/_docs/configure-drill/020-configuring-drill-in-a-dedicated-cluster.md
new file mode 100644
index 0000000..ec619a9
--- /dev/null
+++ b/_docs/configure-drill/020-configuring-drill-in-a-dedicated-cluster.md
@@ -0,0 +1,30 @@
+---
+title: "Configuring Drill in a Dedicated Cluster"
+parent: "Configure Drill"
+---
+
+This section describes how to configure the amount of direct memory allocated to a Drillbit for query processing in a dedicated Drill cluster. When you use Drill in a cluster with other workloads, configure memory as described in the section ["Configuring Multitenant Resources"]({{site.baseurl}}/docs/configuring-multitenant-resources).
+
+The default memory for a Drillbit is 8G, but Drill prefers 16G or more
+depending on the workload. The total amount of direct memory that a Drillbit
+allocates to query operations cannot exceed the limit set.
+
+Drill mainly uses Java direct memory and performs well when executing
+operations in memory instead of storing the operations on disk. Drill does not
+write to disk unless absolutely necessary, unlike MapReduce where everything
+is written to disk during each phase of a job.
+
+The JVM’s heap memory does not limit the amount of direct memory available in
+a Drillbit. The on-heap memory for Drill is only about 4-8G, which should
+suffice because Drill avoids having data sit in heap memory.
+
+## Modifying Drillbit Memory
+
+You can modify memory for each Drillbit node in your cluster. To modify the
+memory for a Drillbit, edit the `-XX:MaxDirectMemorySize` parameter in the
+Drillbit startup script located in
+`<drill_installation_directory>/conf/drill-env.sh`.
+
+{% include startnote.html %}If this parameter is not set, the limit depends on the amount of available system memory.{% include endnote.html %}
+
+After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart the Drillbit]({{ site.baseurl }}/docs/starting-drill-in-distributed-mode) on the node.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/030-configuring-a-multitenant-cluster-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/030-configuring-a-multitenant-cluster-introduction.md b/_docs/configure-drill/030-configuring-a-multitenant-cluster-introduction.md
new file mode 100644
index 0000000..978d374
--- /dev/null
+++ b/_docs/configure-drill/030-configuring-a-multitenant-cluster-introduction.md
@@ -0,0 +1,22 @@
+---
+title: "Configuring a Multitenant Cluster Introduction"
+parent: "Configuring a Multitenant Cluster"
+---
+
+Drill supports multiple users sharing a Drillbit. You can also run separate Drillbits on different nodes in the cluster.
+
+Drill typically runs alongside other workloads, including the following:  
+
+* MapReduce  
+* YARN  
+* HBase  
+* Hive and Pig  
+* Spark  
+
+You need to plan and configure these resources for use with Drill and other workloads: 
+
+* [Memory]({{site.baseurl}}/docs/configuring-multitenant-resources)  
+* [CPU]({{site.baseurl}}/docs/configuring-multitenant-resources#how-to-manage-drill-cpu-resources)  
+* Disk  
+
+Configure memory, queues, and parallelization when users [share a Drillbit]({{site.baseurl}}/docs/configuring-resources-for-a-shared-drillbit).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/040-configuring-a-multitenant-cluster.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/040-configuring-a-multitenant-cluster.md b/_docs/configure-drill/040-configuring-a-multitenant-cluster.md
new file mode 100644
index 0000000..964a5d4
--- /dev/null
+++ b/_docs/configure-drill/040-configuring-a-multitenant-cluster.md
@@ -0,0 +1,5 @@
+---
+title: "Configuring a Multitenant Cluster"
+parent: "Configure Drill"
+---
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/050-configuring-multitenant-resources.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/050-configuring-multitenant-resources.md b/_docs/configure-drill/050-configuring-multitenant-resources.md
new file mode 100644
index 0000000..9a944e8
--- /dev/null
+++ b/_docs/configure-drill/050-configuring-multitenant-resources.md
@@ -0,0 +1,80 @@
+---
+title: "Configuring Multitenant Resources"
+parent: "Configuring a Multitenant Cluster"
+---
+Drill operations are memory- and CPU-intensive. You need to statically partition the cluster to designate which partition handles which workload. To configure resources for Drill in a MapR cluster, modify one or more of the following files in `/opt/mapr/conf/conf.d`, which the installation process creates:
+
+* `warden.drill-bits.conf`
+* `warden.nodemanager.conf`
+* `warden.resourcemanager.conf`
+
+Configure Drill memory by modifying `warden.drill-bits.conf` in YARN and non-YARN clusters. Configure other resources by modifying `warden.nodemanager.conf` and `warden.resourcemanager.conf` in a YARN-enabled cluster.
+
+## Configuring Drill Memory in a Mixed Cluster
+
+Add the following lines to the `warden.drill-bits.conf` file to configure memory resources for Drill:
+
+    service.heapsize.min=<some value in MB>
+    service.heapsize.max=<some value in MB>
+    service.heapsize.percent=<a whole number>
+
+The `service.heapsize.percent` value is the percentage of memory for the service, bounded by the minimum and maximum values.
+
+## Configuring Drill in a YARN-enabled MapR Cluster
+
+To add Drill to a YARN-enabled cluster, change memory resources to suit your application. For example, you have 120G of available memory that you allocate to the following workloads in a YARN-enabled cluster:
+
+File system = 20G  
+HBase = 20G  
+YARN = 20G  
+OS = 8G  
+
+If YARN does most of the work, give Drill 20G, for example, and give YARN 60G. If you expect a heavy query load, give Drill 60G and YARN 20G.
+
+{% include startnote.html %}Drill will execute queries within YARN soon.{% include endnote.html %} [DRILL-142](https://issues.apache.org/jira/browse/DRILL-142)
+
+YARN consists of two main services:
+
+* ResourceManager  
+  There is at least one instance in a cluster, more if you configure high availability.  
+* NodeManager  
+  There is one instance per node. 
+
+ResourceManager and NodeManager memory in `warden.resourcemanager.conf` and
+ `warden.nodemanager.conf` are set to the following defaults. 
+
+    service.heapsize.min=64
+    service.heapsize.max=325
+    service.heapsize.percent=2
+
+Change these settings for NodeManager and ResourceManager to reconfigure the total memory required for YARN services to run. If you want to place an upper limit on memory, set the `YARN_NODEMANAGER_HEAPSIZE` or `YARN_RESOURCEMANAGER_HEAPSIZE` environment variable in `/opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-env.sh`. The `-Xmx` option is not set, allowing memory to grow as needed.
+
+### MapReduce v1 Resources
+
+The following default settings in `/opt/mapr/conf/warden.conf` control MapReduce v1 memory:
+
+    mr1.memory.percent=50
+    mr1.cpu.percent=50
+    mr1.disk.percent=50
+
+Modify these settings to reconfigure MapReduce v1 resources to suit your application needs, as described in section ["Resource Allocation for Jobs and Applications"](http://doc.mapr.com/display/MapR/Resource+Allocation+for+Jobs+and+Applications) of the MapR documentation. Remaining memory is given to YARN applications. 
+
+
+### MapReduce v2 and Other Resources
+
+You configure memory for each service by setting three values in `warden.conf`.
+
+    service.command.<servicename>.heapsize.percent
+    service.command.<servicename>.heapsize.max
+    service.command.<servicename>.heapsize.min
+
+Configure memory for other services in the same manner, as described in the [MapR documentation](http://doc.mapr.com/display/MapR/warden.%3Cservicename%3E.conf).
+
+For more information about managing memory in a MapR cluster, see the following sections in the MapR documentation:
+
+* [Memory Allocation for Nodes](http://doc.mapr.com/display/MapR40x/Memory+Allocation+for+Nodes)  
+* [Cluster Resource Allocation](http://doc.mapr.com/display/MapR40x/Cluster+Resource+Allocation)  
+* [Customizing Memory Settings for MapReduce v1](http://doc.mapr.com/display/MapR40x/Customize+Memory+Settings+for+MapReduce+v1)  
+
+## How to Manage Drill CPU Resources
+Currently, you do not manage CPU resources within Drill. [Use Linux `cgroups`](http://en.wikipedia.org/wiki/Cgroups) to manage the CPU resources.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/060-configuring-a-shared-drillbit.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/060-configuring-a-shared-drillbit.md b/_docs/configure-drill/060-configuring-a-shared-drillbit.md
new file mode 100644
index 0000000..3f83736
--- /dev/null
+++ b/_docs/configure-drill/060-configuring-a-shared-drillbit.md
@@ -0,0 +1,65 @@
+---
+title: "Configuring Resources for a Shared Drillbit"
+parent: "Configuring a Multitenant Cluster"
+---
+To manage a cluster in which multiple users share a Drillbit, you configure Drill queuing and parallelization in addition to memory, as described in the previous section.
+
+## Configuring Drill Query Queuing
+
+Set [options in sys.options]({{site.baseurl}}/docs/configuration-options-introduction/) to enable and manage query queuing, which is turned off by default. There are two types of queues: large and small. You configure a maximum number of queries that each queue allows by configuring the following options in the `sys.options` table:
+
+* `exec.queue.large`  
+* `exec.queue.small`  
+
+### Example Configuration
+
+For example, you configure the queue reserved for large queries to hold a 5-query maximum. You configure the queue reserved for small queries to hold 20 queries. Users start to run queries, and Drill receives the following query requests in this order:
+
+* Query A (blue): 1 billion records, Drill estimates 10 million rows will be processed  
+* Query B (red): 2 billion records, Drill estimates 20 million rows will be processed  
+* Query C: 1 billion records  
+* Query D: 100 records
+
+The `exec.queue.threshold` default is 30 million, which is the estimated number of rows to be processed by a query. Queries A and B are queued in the large queue. The estimated number of rows to be processed reaches the 30 million threshold, filling the queue to capacity. The query C request arrives and goes on the wait list, and then query D arrives. Query D is queued immediately in the small queue because of its small size, as shown in the following diagram: 
+
+![drill queuing]({{ site.baseurl }}/docs/img/queuing.png)
+
+The Drill queuing configuration in this example tends to give many users running small queries a rapid response. Users running a large query might experience some delay until an earlier-received large query returns, freeing space in the large queue to process queries that are waiting.
+
+## Controlling Parallelization
+
+By default, Drill parallelizes operations when the number of records manipulated within a fragment reaches 100,000. When parallelization of operations is high, the cluster operates as fast as possible, which is fine for a single user. In a contentious multitenant situation, however, you need to reduce parallelization to levels based on user needs.
+
+### Parallelization Configuration Procedure
+
+To configure parallelization, configure the following options in the `sys.options` table:
+
+* `planner.width.max.per.node`  
+  The maximum degree of distribution of a query across cores and cluster nodes.
+* `planner.width.max.per.query`  
+  Same as max per node but applies to the query as executed by the entire cluster.
+
+Configure `planner.width.max.per.node` to achieve fine-grained, absolute control over parallelization. 
+
+<!-- ??For example, setting the `planner.width.max.per.query` to 60 will not accelerate Drill operations because overlapping does not occur when executing 60 queries at the same time.??
+
+### Example of Configuring Parallelization
+
+For example, the default settings parallelize 70 percent of operations up to 1,000 cores. If you have 30 cores per node in a 10-node cluster, or 300 cores, parallelization occurs on approximately 210 cores. Consequently, a single user can get 70 percent usage from a cluster and no more due to the constraints configured by the `planner.width.max.per.query`.
+
+A parallelizer in the Foreman transforms the physical plan into multiple phases. A complicated query can have multiple, major fragments. A default parallelization of 70 percent of operations allows some overlap of query phases. In the example, 210 ??for each core or major fragment to a maximum of 410??.
+
+??Drill uses pipelines, blocking/nonblocking, memory is not fungible. CPU resources are fungible. There is contention for CPUs.?? -->
+
+## Data Isolation
+
+Tenants can share data on a cluster using Drill views and impersonation. ??Link to impersonation doc.??
+
+
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/070-configuring-user-impersonation.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/070-configuring-user-impersonation.md b/_docs/configure-drill/070-configuring-user-impersonation.md
new file mode 100644
index 0000000..0aa43d8
--- /dev/null
+++ b/_docs/configure-drill/070-configuring-user-impersonation.md
@@ -0,0 +1,150 @@
+---
+title: "Configuring User Impersonation"
+parent: "Configure Drill"
+---
+Impersonation allows a service to act on behalf of a client while performing the action requested by the client. By default, user impersonation is disabled in Drill. You can configure user impersonation in the `drill-override.conf` file.
+ 
+When you enable impersonation, Drill executes client requests as the user logged in to the client. Drill passes the user credentials to the file system, and the file system checks to see if the user has permission to access the data. When you enable authentication, Drill uses the pluggable authentication module (PAM) to authenticate a user’s identity before the user can access the Drillbit process. See User Authentication.
+ 
+If impersonation is not configured, Drill executes all of the client requests against the file system as the user that started the Drillbit service on the node. This is typically a privileged user. The file system verifies that the system user has permission to access the data.
+
+
+## Example
+When impersonation is disabled and user Bob issues a query through the SQLLine client, SQLLine passes the query to the connecting Drillbit. The Drillbit executes the query as the system user that started the Drill process on the node. For the purpose of this example, we will assume that the system user has full access to the file system. Drill executes the query and returns the results back to the client.
+![](http://i.imgur.com/4XxQK2I.png)
+
+When impersonation is enabled and user Bob issues a query through the SQLLine client, the Drillbit executes the query against the file system as Bob. The file system checks to see if Bob has permission to access the data. If so, Drill returns the query results to the client. If Bob does not have permission, Drill returns an error.
+![](http://i.imgur.com/oigWqVg.png)
+
+## Impersonation Support
+The following table lists the clients, storage plugins, and types of queries that you can use with impersonation in Drill:
+
+<table>
+  <tr>
+    <th>Type</th>
+    <th>Supported</th>
+    <th>Not Supported</th>
+  </tr>
+  <tr>
+    <td>Clients</td>
+    <td>SQLLine ODBC JDBC</td>
+    <td>Drill Web UI REST API</td>
+  </tr>
+  <tr>
+    <td>Storage Plugins</td>
+    <td>File System</td>
+    <td>Hive HBase</td>
+  </tr>
+  <tr>
+    <td>Queries</td>
+    <td>When you enable impersonation, the setting applies to queries on data and metadata. For example, if you issue the SHOW SCHEMAS command, Drill impersonates the user logged in to the client to access the requested metadata. If you issue a SELECT query on a workspace, Drill impersonates the user logged in to the client to access the requested data. Drill applies impersonation to queries issued using the following commands: <br>SHOW SCHEMAS <br>SHOW DATABASES<br> SHOW TABLES<br> CTAS<br> SELECT<br> CREATE VIEW<br> DROP VIEW<br> SHOW FILES<br> To successfully run the CTAS and CREATE VIEW commands, a user must have write permissions on the directory where the table or view will exist. Running these commands creates artifacts on the file system.</td>
+    <td></td>
+  </tr>
+</table>
+
+## Impersonation and Views
+You can use views with impersonation to provide granular access to data and protect sensitive information. When you create a view, Drill stores the view definition in a file and suffixes the file with .drill.view. For example, if you create a view named myview, Drill creates a view file named myview.drill.view and saves it in the current workspace or the workspace specified, such as dfs.views.myview. See [CREATE VIEW]({{site.baseurl}}/docs/create-view-command/) Command.
+
+You can create a view and grant read permissions on the view to give other users access to the data that the view references. When a user queries the view, Drill impersonates the view owner to access the underlying data. A user with read access to a view can create new views from the originating view to further restrict access on data.
+
+### View Permissions
+A user must have write permission on a directory or workspace to create a view, as well as read access on the table(s) and/or view(s) that the view references. When a user creates a view, permission on the view is set to owner by default. Users can query an existing view or create new views from the view if they have read permissions on the view file and the directory or workspace where the view file is stored. 
+
+When users query a view, Drill accesses the underlying data as the user that created the view. If a user does not have permission to access a view, the query fails and Drill returns an error. Only the view owner or a superuser can modify view permissions to change them from owner to group or world. 
+ 
+The view owner or a superuser can modify permissions on the view file directly or they can set view permissions at the system or session level prior to creating any views. Any user that alters view permissions must have write access on the directory or workspace in which they are working. See Modifying Permissions on a View File and Modifying SYSTEM|SESSION Level View Permissions. 
+
+#### Modifying Permissions on a View File
+Only a view owner or a superuser can modify permissions on a view file to change them from owner to group or world readable. Before you grant permission to users to access a view, verify that they have access to the directory or workspace in which the view file is stored.
+
+Use the `chmod` and `chown` commands with the appropriate octal code to change permissions on a view file:
+
+
+    hadoop fs -chmod <octal code> <file_name>
+    hadoop fs -chown <user>:<group> <file_name>
+Example: `hadoop fs -chmod 750 employees.drill.view`
+
+#### Modifying SYSTEM|SESSION Level View Permissions
+Use the `ALTER SESSION|SYSTEM` command with the `new_view_default_permissions` parameter and the appropriate octal code to set view permissions at the system or session level prior to creating a view.
+ 
+    ALTER SESSION SET `new_view_default_permissions` = '<octal_code>';
+    ALTER SYSTEM SET `new_view_default_permissions` = '<octal_code>';
+ 
+Example: ``ALTER SESSION SET `new_view_default_permissions` = '777';``
+ 
+After you set this parameter, Drill applies the same permissions on each view created during the session or across all sessions if set at the system level.
+
+## Chained Impersonation
+You can configure Drill to allow chained impersonation on views when you enable impersonation in the `drill-override.conf` file. Chained impersonation controls the number of identity transitions that Drill can make when a user queries a view. Each identity transition is equal to one hop.
+ 
+You can set the maximum number of hops on views to limit the number of times that Drill can impersonate a different user when a user queries a view. The default maximum number of hops is set at 3. When the maximum number of hops is set to 0, Drill does not allow impersonation chaining, and a user can only read data for which they have direct permission to access. You may set chain length to 0 to protect highly sensitive data. 
+ 
+The following example depicts a scenario where the maximum hop number is set to 3, and Drill must impersonate three users to access data when Chad queries a view that Jane created:
+
+![](http://i.imgur.com/wwpStcs.png)
+
+In the previous example, Joe created a view V3 from views that user Frank created. In the following example, Joe created view V3 by joining a view that Frank created with a view that Bob created, thus increasing the number of identity transitions that Drill makes from 3 to 4, which exceeds the maximum hop setting of 3.
+ 
+In this scenario, when Chad queries Jane’s view, Drill returns an error stating that the query cannot complete because the number of hops required to access the data exceeds the configured maximum hop setting of 3.
+
+![](http://i.imgur.com/xO2yIDN.png)
+
+If users encounter this error, you can increase the maximum hop setting to accommodate users running queries on views. When configuring the maximum number of hops that Drill can make, consider that joined views increase the number of identity transitions required for Drill to access the underlying data.
+
+#### Configuring Impersonation and Chaining
+Chaining is a system-wide setting that applies to all views. Currently, Drill does not provide an option to allow different chain lengths for different views.
+
+Complete the following steps on each Drillbit node to enable user impersonation, and set the maximum number of chained user hops that Drill allows:
+
+1. Navigate to `<drill_installation_directory>/conf/` and edit `drill-override.conf`.
+2. Under `drill.exec`, add the following:
+
+          drill.exec.impersonation: {
+                enabled: true,
+                max_chained_user_hops: 3
+          }
+
+3. Verify that `enabled` is set to `true`.
+4. Set the maximum number of chained user hops that you want Drill to allow.
+5. (MapR cluster only) Add one of the following lines to the `drill-env.sh` file:
+   * If the underlying file system is not secure, add the following line:
+   `export MAPR_IMPERSONATION_ENABLED=true`
+   * If the underlying file system has MapR security enabled, add the following line:
+    `export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket`
+6. Restart the Drillbit process on each Drill node.
+   * In a MapR cluster, run the following command:
+    `maprcli node services -name drill-bits -action restart -nodes <hostname> -f`
+   * In a non-MapR environment, run the following command:  
+     `<DRILLINSTALL_HOME>/bin/drillbit.sh restart`
+
+
+## Impersonation and Chaining Example
+Frank is a senior HR manager at a company. Frank has access to all of the employee data because he is a member of the hr group. Frank created a table named “employees” in his home directory to store the employee data he uses. Only Frank has access to this table.
+ 
+drwx------      frank:hr     /user/frank/employees
+ 
+Each record in the employees table consists of the following information:
+emp_id, emp_name, emp_ssn, emp_salary, emp_addr, emp_phone, emp_mgr
+ 
+Frank needs to share a subset of this information with Joe who is an HR manager reporting to Frank. To share the employee data, Frank creates a view called emp_mgr_view that accesses a subset of the data. The emp_mgr_view filters out sensitive employee information, such as the employee social security numbers, and only shows data for the employees that report directly to Joe or the manager running the query on the view. Frank and Joe both belong to the mgr group. Managers have read permission on Frank’s directory.
+ 
+rwxr-----     frank:mgr   /user/frank/emp_mgr_view.drill.view
+ 
+The emp_mgr_view.drill.view file contains the following view definition:
+(view definition: SELECT emp_id, emp_name, emp_salary, emp_addr, emp_phone FROM \`/user/frank/employees\` WHERE emp_mgr = user())
+ 
+When Joe issues SELECT * FROM emp_mgr_view, Drill impersonates Frank when accessing the employee data, and the query returns the data that Joe has permission to see based on the view definition. The query results do not include any sensitive data because the view protects that information. If Joe tries to query the employees table directly, Drill returns an error or null values.
+ 
+Because Joe has read permissions on the emp_mgr_view, he can create new views from it to give other users access to the employee data even though he does not own the employees table and cannot access the employees table directly.
+ 
+Joe needs to share employee contact data with his direct reports, so he creates a special view called emp_team_view to share the employee contact information with his team. Joe creates the view and writes it to his home directory. Joe and his reports belong to a group named joeteam. The joeteam group has read permissions on Joe’s home directory so they can query the view and create new views from it.
+ 
+rwxr-----     joe:joeteam   /user/joe/emp_team_view.drill.view
+ 
+The emp_team_view.drill.view file contains the following view definition:
+ 
+(view definition: SELECT emp_id, emp_name, emp_phone FROM `/user/frank/emp_mgr_view.drill`);
+ 
+When anyone on Joe’s team issues SELECT * FROM emp_team_view, Drill impersonates Joe to access the emp_team_view and then impersonates Frank to access the emp_mgr_view and the employee data. Drill returns the data that Joe’s team can see based on the view definition. If anyone on Joe’s team tries to query the emp_mgr_view or employees table directly, Drill returns an error or null values.
+ 
+Because Joe’s team has read permissions on the emp_team_view, they can create new views from it and write the views to any directory for which they have write access. Creating views can continue until Drill reaches the maximum number of impersonation hops.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/080-configuration-options.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/080-configuration-options.md b/_docs/configure-drill/080-configuration-options.md
new file mode 100644
index 0000000..f780ca8
--- /dev/null
+++ b/_docs/configure-drill/080-configuration-options.md
@@ -0,0 +1,9 @@
+---
+title: "Configuration Options"
+parent: "Configure Drill"
+---
+
+
+
+  
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/100-ports-used-by-drill.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/100-ports-used-by-drill.md b/_docs/configure-drill/100-ports-used-by-drill.md
new file mode 100644
index 0000000..340b6cb
--- /dev/null
+++ b/_docs/configure-drill/100-ports-used-by-drill.md
@@ -0,0 +1,15 @@
+---
+title: "Ports Used by Drill"
+parent: "Configure Drill"
+---
+The following table provides a list of the ports that Drill uses, the port
+type, and a description of how Drill uses the port:
+
+| Port  | Type | Description                                                                                                                                                                   |
+|-------|------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 8047  | TCP  | Needed for the Drill Web UI.                                                                                                                                                  |
+| 31010 | TCP  | User port address. Used between nodes in a Drill cluster. Needed for an external client, such as Tableau, to connect to the cluster nodes. Also needed for the Drill Web UI.  |
+| 31011 | TCP  | Control port address. Used between nodes in a Drill cluster. Needed for multi-node installation of Apache Drill.                                                              |
+| 31012 | TCP  | Data port address. Used between nodes in a Drill cluster. Needed for multi-node installation of Apache Drill.                                                                 |
+| 46655 | UDP  | Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.                                                                                          |
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/110-partition-pruning.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/110-partition-pruning.md b/_docs/configure-drill/110-partition-pruning.md
new file mode 100644
index 0000000..09dc626
--- /dev/null
+++ b/_docs/configure-drill/110-partition-pruning.md
@@ -0,0 +1,75 @@
+---
+title: "Partition Pruning"
+parent: "Configure Drill"
+---
+Partition pruning is a performance optimization that limits the number of
+files and partitions that Drill reads when querying file systems and Hive
+tables. Drill only reads a subset of the files that reside in a file system or
+a subset of the partitions in a Hive table when a query matches certain filter
+criteria.
+
+For Drill to apply partition pruning to Hive tables, you must have created the
+tables in Hive using the `PARTITIONED BY` clause:
+
+`CREATE TABLE <table_name> (<column_name> <data_type>) PARTITIONED BY (<partition_column_name> <data_type>);`
+
+When you create Hive tables using the `PARTITIONED BY` clause, each partition of
+data is automatically split out into different directories as data is written
+to disk. For more information about Hive partitioning, refer to the [Apache
+Hive wiki](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL/#LanguageManualDDL-PartitionedTables).
+
+Typically, table data in a file system is organized by directories and
+subdirectories. Queries on table data may contain `WHERE` clause filters on
+specific directories.
+
+Drill’s query planner evaluates the filters as part of a Filter operator. If
+no partition filters are present, the underlying Scan operator reads all files
+in all directories and then sends the data to operators downstream, such as
+Filter.
+
+When partition filters are present, the query planner determines if it can
+push the filters down to the Scan such that the Scan only reads the
+directories that match the partition filters, thus reducing disk I/O.
+
+## Partition Pruning Example
+
+The `/Users/max/data/logs` directory in a file system contains subdirectories
+that span a few years.
+
+The following image shows the hierarchical structure of the `…/logs` directory
+and its subdirectories:
+
+![drill query flow]({{ site.baseurl }}/docs/img/54.png)
+
+The following query requests log file data for 2013 from the `…/logs`
+directory in the file system:
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+If you run the `EXPLAIN PLAN` command for the query, you can see that the
+`…/logs` directory is filtered by the Scan operator.
+
+    EXPLAIN PLAN FOR SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+The following image shows a portion of the physical plan when partition
+pruning is applied:
+
+![drill query flow]({{ site.baseurl }}/docs/img/21.png)
+
+## Filter Examples
+
+The following queries include examples of the types of filters eligible for
+partition pruning optimization:
+
+**Example 1: Partition filters ANDed together**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE dir0 = '2014' AND dir1 = '1'
+
+**Example 2: Partition filter ANDed with regular column filter**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 AND dir0 = 2013 limit 2;
+
+**Example 3: Combination of AND, OR involving partition filters**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE (dir0 = '2013' AND dir1 = '1') OR (dir0 = '2014' AND dir1 = '2')
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
new file mode 100644
index 0000000..0298006
--- /dev/null
+++ b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
@@ -0,0 +1,407 @@
+---
+title: "Configuration Options Introduction"
+parent: "Configuration Options"
+---
+Drill provides many configuration options that you can enable, disable, or
+modify. Modifying certain configuration options can impact Drill’s
+performance. Many of Drill's configuration options reside in the
+`drill-env.sh` and `drill-override.conf` files. Drill stores these files in the
+`/conf` directory. Drill sources `/etc/drill/conf` if it exists. Otherwise,
+Drill sources the local `<drill_installation_directory>/conf` directory.
+
+The sys.options table in Drill contains information about boot (start-up) and system options listed in the tables on this page. 
+
+## Boot Options
+The section, ["Start-up Options"]({{site.baseurl}}/docs/start-up-options), covers how to configure and view these options. 
+
+<table>
+  <tr>
+    <th>Name</th>
+    <th>Default</th>
+    <th>Comments</th>
+  </tr>
+  <tr>
+    <td>drill.exec.buffer.impl</td>
+    <td>"org.apache.drill.exec.work.batch.UnlimitedRawBatchBuffer"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.buffer.size</td>
+    <td>6</td>
+    <td>Available memory in terms of record batches to hold data downstream of an operation. Increase this value to increase query speed.</td>
+  </tr>
+  <tr>
+    <td>drill.exec.compile.debug</td>
+    <td>TRUE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.http.enabled</td>
+    <td>TRUE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.operator.packages</td>
+    <td>"org.apache.drill.exec.physical.config"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.sort.external.batch.size</td>
+    <td>4000</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.sort.external.spill.directories</td>
+    <td>"/tmp/drill/spill"</td>
+    <td>Determines which directory to use for spooling</td>
+  </tr>
+  <tr>
+    <td>drill.exec.sort.external.spill.group.size</td>
+    <td>100</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.storage.file.text.batch.size</td>
+    <td>4000</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.storage.packages</td>
+    <td>"org.apache.drill.exec.store" "org.apache.drill.exec.store.mock"</td>
+    <td>Ignore or include this module, including supplementary configuration information, when scanning the class path. This file is in <a href="https://github.com/typesafehub/config/blob/master/HOCON.md">HOCON format</a>.</td>
+  </tr>
+  <tr>
+    <td>drill.exec.sys.store.provider.class</td>
+    <td>ZooKeeper: "org.apache.drill.exec.store.sys.zk.ZkPStoreProvider"</td>
+    <td>The Pstore (Persistent Configuration Storage) provider to use. The Pstore holds configuration and profile data.</td>
+  </tr>
+  <tr>
+    <td>drill.exec.zk.connect</td>
+    <td>"localhost:2181"</td>
+    <td>The ZooKeeper quorum that Drill uses to connect to data sources. Configure on each Drillbit node.</td>
+  </tr>
+  <tr>
+    <td>drill.exec.zk.refresh</td>
+    <td>500</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>file.separator</td>
+    <td>"/"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>java.specification.version</td>
+    <td>1.7</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>java.vm.name</td>
+    <td>"Java HotSpot(TM) 64-Bit Server VM"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>java.vm.specification.version</td>
+    <td>1.7</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>log.path</td>
+    <td>"/log/sqlline.log"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>sun.boot.library.path</td>
+    <td>/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home/jre/lib</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>sun.java.command</td>
+    <td>"sqlline.SqlLine -d org.apache.drill.jdbc.Driver --maxWidth=10000 -u jdbc:drill:zk=local"</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>sun.os.patch.level</td>
+    <td>unknown</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>user</td>
+    <td>""</td>
+    <td></td>
+  </tr>
+</table>
+
+## System Options
+The sys.options table lists the following options that you can set at the session or system level as described in the section, ["Planning and Execution Options"]({{site.baseurl}}/docs/planning-and-execution-options) 
+
+<table>
+  <tr>
+    <th>Name</th>
+    <th>Default</th>
+    <th>Comments</th>
+  </tr>
+  <tr>
+    <td>drill.exec.functions.cast_empty_string_to_null</td>
+    <td>FALSE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>drill.exec.storage.file.partition.column.label</td>
+    <td>dir</td>
+    <td>Accepts a string input.</td>
+  </tr>
+  <tr>
+    <td>exec.errors.verbose</td>
+    <td>FALSE</td>
+    <td>Toggles verbose output of executable error messages</td>
+  </tr>
+  <tr>
+    <td>exec.java_compiler</td>
+    <td>DEFAULT</td>
+    <td>Switches between DEFAULT, JDK, and JANINO mode for the current session. Uses Janino by default for generated source code of less than exec.java_compiler_janino_maxsize; otherwise, switches to the JDK compiler.</td>
+  </tr>
+  <tr>
+    <td>exec.java_compiler_debug</td>
+    <td>TRUE</td>
+    <td>Toggles the output of debug-level compiler error messages in runtime generated code.</td>
+  </tr>
+  <tr>
+    <td>exec.java_compiler_janino_maxsize</td>
+    <td>262144</td>
+    <td>See the exec.java_compiler option comment. Accepts inputs of type LONG.</td>
+  </tr>
+  <tr>
+    <td>exec.max_hash_table_size</td>
+    <td>1073741824</td>
+    <td>Ending size for hash tables. Range: 0 - 1073741824</td>
+  </tr>
+  <tr>
+    <td>exec.min_hash_table_size</td>
+    <td>65536</td>
+    <td>Starting size for hash tables. Increase according to available memory to improve performance. Range: 0 - 1073741824</td>
+  </tr>
+  <tr>
+    <td>exec.queue.enable</td>
+    <td>FALSE</td>
+    <td>Changes the state of query queues to control the number of queries that run simultaneously.</td>
+  </tr>
+  <tr>
+    <td>exec.queue.large</td>
+    <td>10</td>
+    <td>Range: 0-1000</td>
+  </tr>
+  <tr>
+    <td>exec.queue.small</td>
+    <td>100</td>
+    <td>Range: 0-1001</td>
+  </tr>
+  <tr>
+    <td>exec.queue.threshold</td>
+    <td>30000000</td>
+    <td>Range: 0-9223372036854775807</td>
+  </tr>
+  <tr>
+    <td>exec.queue.timeout_millis</td>
+    <td>300000</td>
+    <td>Range: 0-9223372036854775807</td>
+  </tr>
+  <tr>
+    <td>planner.add_producer_consumer</td>
+    <td>FALSE</td>
+    <td>Increase prefetching of data from disk. Disable for in-memory reads.</td>
+  </tr>
+  <tr>
+    <td>planner.affinity_factor</td>
+    <td>1.2</td>
+    <td>Accepts inputs of type DOUBLE.</td>
+  </tr>
+  <tr>
+    <td>planner.broadcast_factor</td>
+    <td>1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.broadcast_threshold</td>
+    <td>10000000</td>
+    <td>Threshold in number of rows that triggers a broadcast join for a query if the right side of the join contains fewer rows than the threshold. Avoids broadcasting too many rows to join. Range: 0-2147483647</td>
+  </tr>
+  <tr>
+    <td>planner.disable_exchanges</td>
+    <td>FALSE</td>
+    <td>Toggles the state of hashing to a random exchange.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_broadcast_join</td>
+    <td>TRUE</td>
+    <td>Changes the state of aggregation and join operators. Do not disable.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_demux_exchange</td>
+    <td>FALSE</td>
+    <td>Toggles the state of hashing to a demultiplexed exchange.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_hash_single_key</td>
+    <td>TRUE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.enable_hashagg</td>
+    <td>TRUE</td>
+    <td>Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Does not write to disk. Enable is recommended.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_hashjoin</td>
+    <td>TRUE</td>
+    <td>Enable the memory-hungry hash join. Does not write to disk.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_hashjoin_swap</td>
+    <td></td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.enable_mergejoin</td>
+    <td>TRUE</td>
+    <td>Sort-based operation. Writes to disk.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_multiphase_agg</td>
+    <td>TRUE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.enable_mux_exchange</td>
+    <td>TRUE</td>
+    <td>Toggles the state of hashing to a multiplexed exchange.</td>
+  </tr>
+  <tr>
+    <td>planner.enable_streamagg</td>
+    <td>TRUE</td>
+    <td>Sort-based operation. Writes to disk.</td>
+  </tr>
+  <tr>
+    <td>planner.identifier_max_length</td>
+    <td>1024</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.join.hash_join_swap_margin_factor</td>
+    <td>10</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.join.row_count_estimate_factor</td>
+    <td>1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.average_field_width</td>
+    <td>8</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.enable_memory_estimation</td>
+    <td>FALSE</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.hash_agg_table_factor</td>
+    <td>1.1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.hash_join_table_factor</td>
+    <td>1.1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.max_query_memory_per_node</td>
+    <td>2147483648</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.memory.non_blocking_operators_memory</td>
+    <td>64</td>
+    <td>Range: 0-2048</td>
+  </tr>
+  <tr>
+    <td>planner.partitioner_sender_max_threads</td>
+    <td>8</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.partitioner_sender_set_threads</td>
+    <td>-1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.partitioner_sender_threads_factor</td>
+    <td>1</td>
+    <td></td>
+  </tr>
+  <tr>
+    <td>planner.producer_consumer_queue_size</td>
+    <td>10</td>
+    <td>How much data to prefetch from disk (in record batches) out of band of query execution</td>
+  </tr>
+  <tr>
+    <td>planner.slice_target</td>
+    <td>100000</td>
+    <td>The number of records manipulated within a fragment before Drill parallelizes operations.</td>
+  </tr>
+  <tr>
+    <td>planner.width.max_per_node</td>
+    <td>3</td>
+    <td>The maximum degree of distribution of a query across cores and cluster nodes.</td>
+  </tr>
+  <tr>
+    <td>planner.width.max_per_query</td>
+    <td>1000</td>
+    <td>Same as max per node but applies to the query as executed by the entire cluster.</td>
+  </tr>
+  <tr>
+    <td>store.format</td>
+    <td>parquet</td>
+    <td>Output format for data written to tables with the CREATE TABLE AS (CTAS) command. Allowed values are parquet, json, or text.</td>
+  </tr>
+  <tr>
+    <td>store.json.all_text_mode</td>
+    <td>FALSE</td>
+    <td>Drill reads all data from the JSON files as VARCHAR. Prevents schema change errors.</td>
+  </tr>
+  <tr>
+    <td>store.mongo.all_text_mode</td>
+    <td>FALSE</td>
+    <td>Similar to store.json.all_text_mode for MongoDB.</td>
+  </tr>
+  <tr>
+    <td>store.parquet.block-size</td>
+    <td>536870912</td>
+    <td>Sets the size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system.</td>
+  </tr>
+  <tr>
+    <td>store.parquet.compression</td>
+    <td>snappy</td>
+    <td>Compression type for storing Parquet output. Allowed values: snappy, gzip, none</td>
+  </tr>
+  <tr>
+    <td>store.parquet.enable_dictionary_encoding*</td>
+    <td>FALSE</td>
+    <td>Do not change.</td>
+  </tr>
+  <tr>
+    <td>store.parquet.use_new_reader</td>
+    <td>FALSE</td>
+    <td>Not supported</td>
+  </tr>
+  <tr>
+    <td>window.enable*</td>
+    <td>FALSE</td>
+    <td>Coming soon.</td>
+  </tr>
+</table>
+
+\* Not supported in this release.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/configuration-options/020-start-up-options.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/020-start-up-options.md b/_docs/configure-drill/configuration-options/020-start-up-options.md
new file mode 100644
index 0000000..8a06232
--- /dev/null
+++ b/_docs/configure-drill/configuration-options/020-start-up-options.md
@@ -0,0 +1,63 @@
+---
+title: "Start-Up Options"
+parent: "Configuration Options"
+---
+Drill’s start-up options reside in configuration files in HOCON format, which is
+a hybrid between a properties file and a JSON file. Drill start-up options
+consist of a group of files with a nested relationship. At the core of the
+file hierarchy is `drill-default.conf`. This file is overridden by one or more
+`drill-module.conf` files, which are overridden by the `drill-override.conf`
+file that you define.
+
+You can see the following group of files throughout the source repository in
+Drill:
+
+	common/src/main/resources/drill-default.conf
+	common/src/main/resources/drill-module.conf
+	contrib/storage-hbase/src/main/resources/drill-module.conf
+	contrib/storage-hive/core/src/main/resources/drill-module.conf
+	contrib/storage-hive/hive-exec-shade/src/main/resources/drill-module.conf
+	exec/java-exec/src/main/resources/drill-module.conf
+	distribution/src/resources/drill-override.conf
+
+These files are listed inside the associated JAR files in the Drill
+distribution tarball.
+
+Each Drill module has a set of options that Drill incorporates. Drill’s
+modular design enables you to create new storage plugins, set new operators,
+or create UDFs. You can also include additional configuration options that you
+can override as necessary.
+
+When you add a JAR file to Drill, you must include a `drill-module.conf` file
+in the root directory of the JAR file that you add. The `drill-module.conf`
+file tells Drill to scan that JAR file or associated object and include it.
+
+## Viewing Startup Options
+
+You can run the following query to see a list of Drill’s startup options:
+
+    SELECT * FROM sys.options WHERE type='BOOT'
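+
+To check a single option rather than the whole list, a minimal sketch is to filter on the `name` column instead. The option name here is only an illustration taken from the boot options table:
+
+    SELECT * FROM sys.options WHERE name = 'drill.exec.zk.connect';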
+
+## Configuring Start-Up Options
+
+You can configure start-up options for each Drillbit in the
+`drill-override.conf` file located in Drill’s `/conf` directory.
+
+The summary of start-up options, also known as boot options, lists default values. The following descriptions provide more detail on key options that are frequently reconfigured:
+
+* drill.exec.sys.store.provider.class  
+  
+  Defines the persistent storage (PStore) provider. The [PStore]({{ site.baseurl }}/docs/persistent-configuration-storage) holds configuration and profile data. 
+
+* drill.exec.buffer.size
+
+  Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.
+
+* drill.exec.sort.external.spill.directories
+
+  Tells Drill which directory to use when spooling. Drill uses a spool and sort operation for operations that exceed available memory. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. For MapR clusters, use MapReduce volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.
+
+
+* drill.exec.zk.connect  
+  Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/configuration-options/030-planning-and-exececution-options.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/030-planning-and-exececution-options.md b/_docs/configure-drill/configuration-options/030-planning-and-exececution-options.md
new file mode 100644
index 0000000..f7d3442
--- /dev/null
+++ b/_docs/configure-drill/configuration-options/030-planning-and-exececution-options.md
@@ -0,0 +1,60 @@
+---
+title: "Planning and Execution Options"
+parent: "Configuration Options"
+---
+You can set Drill query planning and execution options per cluster, at the
+system or session level. Options set at the session level only apply to
+queries that you run during the current Drill connection. Options set at the
+system level affect the entire system and persist between restarts. Session
+level settings override system level settings.
+
+You can run the following query to see a list of the system and session
+planning and execution options:
+
+    SELECT name FROM sys.options WHERE type IN ('SYSTEM', 'SESSION');
+
+## Configuring Planning and Execution Options
+
+Use the ALTER SYSTEM or ALTER SESSION commands to set options. Typically,
+you set the options at the session level unless you want the setting to
+persist across all sessions.
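+
+As a minimal sketch of both commands, the following statements change the `store.format` option from its default of `parquet` to `json`, first for the current session only and then for the whole system. The value `json` is chosen purely for illustration:
+
+    ALTER SESSION SET `store.format` = 'json';
+    ALTER SYSTEM SET `store.format` = 'json';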
+
+The summary of system options lists default values. The following descriptions provide more detail on some of these options:
+
+### exec.min_hash_table_size
+
+The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. If you have large data sets, you can increase this hash table size to increase performance.
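+
+For example, assuming a session that runs large aggregations on nodes with memory to spare, you might double the starting size from the default of 65536. The value 131072 is an illustrative assumption, not a tuning recommendation:
+
+    ALTER SESSION SET `exec.min_hash_table_size` = 131072;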
+
+### planner.add_producer_consumer
+
+This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.
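+
+A sketch of enabling the prefetch thread for disk-based reads in the current session. The queue size of 20 is an arbitrary illustrative value; a suitable setting depends on the memory you can spare:
+
+    ALTER SESSION SET `planner.add_producer_consumer` = true;
+    ALTER SESSION SET `planner.producer_consumer_queue_size` = 20;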
+
+### planner.broadcast_threshold
+
+Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the `planner.enable_broadcast_join` option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)
+
+### planner.enable_broadcast_join, planner.enable_hashagg, planner.enable_hashjoin, planner.enable_mergejoin, planner.enable_multiphase_agg, planner.enable_streamagg
+
+These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.
+
+Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.
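+
+For example, if a large hash join does not fit in memory, a sketch of disabling it for the current session only, so that the planner has to choose an alternative plan for your queries:
+
+    ALTER SESSION SET `planner.enable_hashjoin` = false;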
+
+### planner.producer_consumer_queue_size
+
+Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.
+
+### planner.width.max_per_node
+
+In this context *width* refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.
+
+The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The *default* maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back so that only 70% of the actual available capacity is taken into account, and the result rounded to a whole number: number of active drillbits (typically one per node) * number of cores per node * 0.7
+
+For example, on a single-node test system with 2 cores and hyper-threading enabled: 1 * 4 * 0.7 = 2.8, which gives the default of 3
+
+When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.
+
+### planner.width.max_per_query
+
+The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the *minimum of two values*: min((number of nodes * width.max_per_node), width.max_per_query)
+
+For example, on a 4-node cluster where `width.max_per_node` is set to 6 and `width.max_per_query` is set to 30: min((4 * 6), 30) = 24
+
+In this case, the effective maximum width per query is 24, not 30.
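+
+The settings in this example could be applied as follows. This is a sketch that assumes you want the values to apply to the whole cluster and persist between restarts, which is why it uses ALTER SYSTEM rather than ALTER SESSION:
+
+    ALTER SYSTEM SET `planner.width.max_per_node` = 6;
+    ALTER SYSTEM SET `planner.width.max_per_query` = 30;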
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md b/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
new file mode 100644
index 0000000..59180b5
--- /dev/null
+++ b/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
@@ -0,0 +1,92 @@
+---
+title: "Persistent Configuration Storage"
+parent: "Configuration Options"
+---
+Drill stores persistent configuration data in a persistent configuration store
+(PStore). This data is encoded in JSON or Protobuf format. Drill can use the
+local file system, ZooKeeper, HBase, or MapR-DB to store this data. The data
+stored in a PStore includes state information for storage plugins, query
+profiles, and ALTER SYSTEM settings. The default type of PStore configured
+depends on the Drill installation mode.
+
+The following table provides the persistent storage mode for each of the Drill
+modes:
+
+| Mode        | Description                                                                                                                                                             |
+|-------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Embedded    | Drill stores persistent data in the local file system. You cannot modify the PStore location for Drill in embedded mode.                                                |
+| Distributed | Drill stores persistent data in ZooKeeper, by default. You can modify where ZooKeeper offloads data, or you can change the persistent storage mode to HBase or MapR-DB. |
+  
+{% include startnote.html %}Switching between storage modes does not migrate configuration data.{% include endnote.html %}
+
+## ZooKeeper for Persistent Configuration Storage
+
+To make Drill installation and configuration simple, Drill uses ZooKeeper to
+store persistent configuration data. The ZooKeeper PStore provider stores all
+of the persistent configuration data in ZooKeeper except for query profile
+data.
+
+The ZooKeeper PStore provider offloads query profile data to the
+`${DRILL_LOG_DIR:-/var/log/drill}` directory on Drill nodes. If you want the
+query profile data stored in a specific location, you can configure where
+ZooKeeper offloads the data.
+
+To modify where the ZooKeeper PStore provider offloads query profile data,
+configure the `sys.store.provider.zk.blobroot` property in the `drill.exec`
+block in `<drill_installation_directory>/conf/drill-override.conf` on each
+Drill node and then restart the Drillbit service.
+
+**Example**
+
+	drill.exec: {
+	 cluster-id: "my_cluster_com-drillbits",
+	 zk.connect: "<zkhostname>:<port>",
+	 sys.store.provider.zk.blobroot: "maprfs://<directory to store pstore data>/"
+	}
+
+Issue the following command to restart the Drillbit on all Drill nodes:
+
+    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
+
+## HBase for Persistent Configuration Storage
+
+To change the persistent storage mode for Drill, add or modify the
+`sys.store.provider` block in
+`<drill_installation_directory>/conf/drill-override.conf`.
+
+**Example**
+
+	sys.store.provider: {
+	    class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
+	    hbase: {
+	      table : "drill_store",
+	      config: {
+	      "hbase.zookeeper.quorum": "<ip_address>,<ip_address>,<ip_address>,<ip_address>",
+	      "hbase.zookeeper.property.clientPort": "2181"
+	      }
+	    }
+	  },
+
+## MapR-DB for Persistent Configuration Storage
+
+If you have MapR-DB in your cluster, you can use MapR-DB for persistent
+configuration storage. Using MapR-DB to store persistent configuration data
+can prevent memory strain on ZooKeeper in clusters running heavy workloads.
+
+To change the persistent storage mode to MapR-DB, add or modify the
+`sys.store.provider` block in
+`<drill_installation_directory>/conf/drill-override.conf` on each Drill node and then restart the Drillbit service.
+
+**Example**
+
+	sys.store.provider: {
+	class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
+	hbase: {
+	  table : "/tables/drill_store",
+	    }
+	},
+
+Issue the following command to restart the Drillbit on all Drill nodes:
+
+    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/install/050-starting-drill-in-distributed mode.md
----------------------------------------------------------------------
diff --git a/_docs/install/050-starting-drill-in-distributed mode.md b/_docs/install/050-starting-drill-in-distributed mode.md
index b9928be..97c50df 100644
--- a/_docs/install/050-starting-drill-in-distributed mode.md	
+++ b/_docs/install/050-starting-drill-in-distributed mode.md	
@@ -71,4 +71,16 @@ Drill provides a list of Drillbits that have joined.
     +------------+------------+--------------+--------------------+
 
 Now you can run queries. The Drill installation includes sample data
-that you can query. Refer to [Querying Parquet Files]({{ site.baseurl }}/docs/querying-parquet-files/).
\ No newline at end of file
+that you can query. Refer to [Querying Parquet Files]({{ site.baseurl }}/docs/querying-parquet-files/).
+
+## Exiting SQLLine
+
+To exit SQLLine, issue the following command:
+
+    !quit
+
+## Stopping Drill
+
+In some cases, such as stopping while a query is in progress, the `!quit` command does not stop Drill running in embedded mode. In distributed mode, you stop the Drillbit service. Navigate to the Drill installation directory, and issue the following command to stop a Drillbit:
+  
+        bin/drillbit.sh stop

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/install/installing-drill-in-embedded-mode/030-starting-drill-on-linux-and-mac-os-x.md
----------------------------------------------------------------------
diff --git a/_docs/install/installing-drill-in-embedded-mode/030-starting-drill-on-linux-and-mac-os-x.md b/_docs/install/installing-drill-in-embedded-mode/030-starting-drill-on-linux-and-mac-os-x.md
index 2e9d60a..e19f224 100644
--- a/_docs/install/installing-drill-in-embedded-mode/030-starting-drill-on-linux-and-mac-os-x.md
+++ b/_docs/install/installing-drill-in-embedded-mode/030-starting-drill-on-linux-and-mac-os-x.md
@@ -16,6 +16,35 @@ Launch SQLLine using the sqlline command to start to Drill in embedded mode. The
 
    At this point, you can [submit queries]({{site.baseurl}}/docs/drill-in-10-minutes#query-sample-data) to Drill.
 
+## Example of Starting Drill
+
+The simplest way to start SQLLine is to identify the protocol, JDBC, and the ZooKeeper node or nodes in the **sqlline** command. This example starts SQLLine on a node in an embedded, single-node cluster:
+
+    sqlline -u jdbc:drill:zk=local
+
+This example starts SQLLine and also specifies the `dfs` storage plugin. Specifying the storage plugin when you start up eliminates the need to specify the storage plugin in the query:
+
+
+    bin/sqlline -u jdbc:drill:schema=dfs;zk=centos26
+    
 You can use the schema option in the **sqlline** command to specify a storage plugin. Specifying the storage plugin when you start up eliminates the need to specify the storage plugin in the query: For example, this command specifies the `dfs` storage plugin.
 
     bin/sqlline –u jdbc:drill:schema=dfs;zk=local
+
+## Exiting SQLLine
+
+To exit SQLLine, issue the following command:
+
+    !quit
+
+## Stopping Drill
+
+In some cases, such as stopping while a query is in progress, the `!quit` command does not stop Drill running in embedded mode. To stop the Drill process on Mac OS X and Linux, use the `kill` command, as shown in the following steps:
+
+  1. Press CTRL-Z to suspend the query, then start Drill again. If the startup message indicates success, skip the rest of the steps. If not, proceed to step 2.
+  2. Search for the Drill process IDs.
+  
+        $ ps auwx | grep drill
+  3. Kill each process using the process numbers in the grep output. For example:
+
+        $ sudo kill -9 2674 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/install/installing-drill-in-embedded-mode/050-starting-drill-on-windows.md
----------------------------------------------------------------------
diff --git a/_docs/install/installing-drill-in-embedded-mode/050-starting-drill-on-windows.md b/_docs/install/installing-drill-in-embedded-mode/050-starting-drill-on-windows.md
index ef4c97c..75c25b6 100644
--- a/_docs/install/installing-drill-in-embedded-mode/050-starting-drill-on-windows.md
+++ b/_docs/install/installing-drill-in-embedded-mode/050-starting-drill-on-windows.md
@@ -16,4 +16,15 @@ At this point, you can [submit queries]({{ site.baseurl }}/docs/drill-in-10-minu
 
 You can use the schema option in the **sqlline** command to specify a storage plugin. Specifying the storage plugin when you start up eliminates the need to specify the storage plugin in the query: For example, this command specifies the `dfs` storage plugin.
 
-    bin/sqlline –u jdbc:drill:schema=dfs;zk=local
\ No newline at end of file
+    bin/sqlline -u jdbc:drill:schema=dfs;zk=local
+
+## Exiting SQLLine
+
+To exit SQLLine, issue the following command:
+
+    !quit
+
+## Stopping Drill
+
+In some cases, such as stopping while a query is in progress, the `!quit` command does not stop Drill running in embedded mode. To stop the Drill process, use the [**TaskKill**](https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true) command.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/010-manage-drill-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/010-manage-drill-introduction.md b/_docs/manage-drill/010-manage-drill-introduction.md
deleted file mode 100644
index bc9179a..0000000
--- a/_docs/manage-drill/010-manage-drill-introduction.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-title: "Manage Drill Introduction"
-parent: "Manage Drill"
----
-When using Drill, you need to make sufficient memory available Drill and other workloads running on the cluster. You might want to modify options for performance or functionality. For example, the default storage format for CTAS
-statements is Parquet. Using a configuration option, you can modify the default setting so that output data
-is stored in CSV or JSON format. The section covers the many options you can configure and how to configure memory resources for Drill running along side other workloads. This section also includes stopping and restarting a Drillbit on a node, ports used by Drill, and partition pruning.

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/020-configuring-drill-in-a-dedicated-cluster.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/020-configuring-drill-in-a-dedicated-cluster.md b/_docs/manage-drill/020-configuring-drill-in-a-dedicated-cluster.md
deleted file mode 100644
index 446f75c..0000000
--- a/_docs/manage-drill/020-configuring-drill-in-a-dedicated-cluster.md
+++ /dev/null
@@ -1,30 +0,0 @@
----
-title: "Configuring Drill in a Dedicated Cluster"
-parent: "Manage Drill"
----
-
-This section describes how to configure the amount of direct memory allocated to a Drillbit for query processing in a dedicated Drill cluster. When you use Drill in a cluster with other workloads, configure memory as described in section, ["Configuring Resources in a Mixed Cluster"]({{site.baseurl}}/docs/configuring-resources-in-a-mixed-cluster). 
-
-The default memory for a Drillbit is 8G, but Drill prefers 16G or more
-depending on the workload. The total amount of direct memory that a Drillbit
-allocates to query operations cannot exceed the limit set.
-
-Drill mainly uses Java direct memory and performs well when executing
-operations in memory instead of storing the operations on disk. Drill does not
-write to disk unless absolutely necessary, unlike MapReduce where everything
-is written to disk during each phase of a job.
-
-The JVM’s heap memory does not limit the amount of direct memory available in
-a Drillbit. The on-heap memory for Drill is only about 4-8G, which should
-suffice because Drill avoids having data sit in heap memory.
-
-## Modifying Drillbit Memory
-
-You can modify memory for each Drillbit node in your cluster. To modify the
-memory for a Drillbit, edit the `XX:MaxDirectMemorySize` parameter in the
-Drillbit startup script located in `<drill_installation_directory>/conf/drill-
-env.sh`.
-
-{% include startnote.html %}If this parameter is not set, the limit depends on the amount of available system memory.{% include endnote.html %}
-
-After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart the Drillbit]({{ site.baseurl }}/docs/starting-drill-in-distributed-mode) on the node.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/030-configuring-a-multitenant-cluster-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/030-configuring-a-multitenant-cluster-introduction.md b/_docs/manage-drill/030-configuring-a-multitenant-cluster-introduction.md
deleted file mode 100644
index 978d374..0000000
--- a/_docs/manage-drill/030-configuring-a-multitenant-cluster-introduction.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-title: "Configuring a Multitenant Cluster Introduction"
-parent: "Configuring a Multitenant Cluster"
----
-
-Drill supports multiple users sharing a Drillbit. You can also run separate Drillbits running on different nodes in the cluster.
-
-Drill typically runs along side other workloads, including the following:  
-
-* Mapreduce  
-* Yarn  
-* HBase  
-* Hive and Pig  
-* Spark  
-
-You need to plan and configure these resources for use with Drill and other workloads: 
-
-* [Memory]({{site.baseurl}}/docs/configuring-multitenant-resources)  
-* [CPU]({{site.baseurl}}/docs/configuring-multitenant-resources#how-to-manage-drill-cpu-resources)  
-* Disk  
-
-Configure, memory, queues, and parallelization when users [share a Drillbit]({{site.baseurl}}/docs/configuring-resources-for-a-shared-drillbit).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/040-configuring-a-multitenant-cluster.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/040-configuring-a-multitenant-cluster.md b/_docs/manage-drill/040-configuring-a-multitenant-cluster.md
deleted file mode 100644
index fe72675..0000000
--- a/_docs/manage-drill/040-configuring-a-multitenant-cluster.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-title: "Configuring a Multitenant Cluster"
-parent: "Manage Drill"
----
-

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/050-configuring-multitenant-resources.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/050-configuring-multitenant-resources.md b/_docs/manage-drill/050-configuring-multitenant-resources.md
deleted file mode 100644
index 9a944e8..0000000
--- a/_docs/manage-drill/050-configuring-multitenant-resources.md
+++ /dev/null
@@ -1,80 +0,0 @@
----
-title: "Configuring Multitenant Resources"
-parent: "Configuring a Multitenant Cluster"
----
-Drill operations are memory and CPU-intensive. You need to statically partition the cluster to designate which partition handles which workload. To configure resources for Drill in a MapR cluster, modify one or more of the following files in `/opt/mapr/conf/conf.d` that the installation process creates. 
-
-* `warden.drill-bits.conf`
-* `warden.nodemanager.conf`
-* `warden.resourcemanager.conf`
-
-Configure Drill memory by modifying `warden.drill-bits.conf` in YARN and non-YARN clusters. Configure other resources by modifying `warden.nodemanager.conf `and `warden.resourcemanager.conf `in a YARN-enabled cluster.
-
-## Configuring Drill Memory in a Mixed Cluster
-
-Add the following lines to the `warden.drill-bits.conf` file to configure memory resources for Drill:
-
-    service.heapsize.min=<some value in MB>
-    service.heapsize.max=<some value in MB>
-    service.heapsize.percent=<a whole number>
-
-The service.heapsize.percent is the percentage of memory for the service bounded by minimum and maximum values.
-
-## Configuring Drill in a YARN-enabled MapR Cluster
-
-To add Drill to a YARN-enabled cluster, change memory resources to suit your application. For example, you have 120G of available memory that you allocate to following workloads in a Yarn-enabled cluster:
-
-File system = 20G  
-HBase = 20G  
-Yarn = 20G  
-OS = 8G  
-
-If Yarn does most of the work, give Drill 20G, for example, and give Yarn 60G. If you expect a heavy query load, give Drill 60G and Drill 20G.
-
-{% include startnote.html %}Drill will execute queries within Yarn soon.{% include endnote.html %} [DRILL-142](https://issues.apache.org/jira/browse/DRILL-142)
-
-YARN consists of two main services:
-
-* ResourceManager  
-  There is at least one instance in a cluster, more if you configure high availability.  
-* NodeManager  
-  There is one instance per node. 
-
-ResourceManager and NodeManager memory in `warden.resourcemanager.conf` and
- `warden.nodemanager.conf` are set to the following defaults. 
-
-    service.heapsize.min=64
-    service.heapsize.max=325
-    service.heapsize.percent=2
-
-Change these settings for NodeManager and ResourceManager to reconfigure the total memory required for YARN services to run. If you want to place an upper limit on memory set YARN_NODEMANAGER_HEAPSIZE or YARN_RESOURCEMANAGER_HEAPSIZE environment variable in /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/yarn-env.sh. The -Xmx option is not set, allowing memory on to grow as needed.
-
-### MapReduce v1 Resources
-
-The following default settings in /opt/mapr/conf/warden.conf control MapReduce v1 memory:
-
-    mr1.memory.percent=50
-    mr1.cpu.percent=50
-    mr1.disk.percent=50
-
-Modify these settings to reconfigure MapReduce v1 resources to suit your application needs, as described in section ["Resource Allocation for Jobs and Applications"](http://doc.mapr.com/display/MapR/Resource+Allocation+for+Jobs+and+Applications) of the MapR documentation. Remaining memory is given to YARN applications. 
-
-
-### MapReduce v2 and other Resources
-
-You configure memory for each service by setting three values in `warden.conf`.
-
-    service.command.<servicename>.heapsize.percent
-    service.command.<servicename>.heapsize.max
-    service.command.<servicename>.heapsize.min
-
-Configure memory for other services in the same manner, as described in [MapR documentation](http://doc.mapr.com/display/MapR/warden.%3Cservicename%3E.conf)
-
-For more information about managing memory in a MapR cluster, see the following sections in the MapR documentation:
-
-* [Memory Allocation for Nodes](http://doc.mapr.com/display/MapR40x/Memory+Allocation+for+Nodes)  
-* [Cluster Resource Allocation](http://doc.mapr.com/display/MapR40x/Cluster+Resource+Allocation)  
-* [Customizing Memory Settings for MapReduce v1](http://doc.mapr.com/display/MapR40x/Customize+Memory+Settings+for+MapReduce+v1)  
-
-## How to Manage Drill CPU Resources
-Currently, you do not manage CPU resources within Drill. [Use Linux `cgroups`](http://en.wikipedia.org/wiki/Cgroups) to manage the CPU resources.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/27067e8c/_docs/manage-drill/060-configuring-a-shared-drillbit.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/060-configuring-a-shared-drillbit.md b/_docs/manage-drill/060-configuring-a-shared-drillbit.md
deleted file mode 100644
index 3f83736..0000000
--- a/_docs/manage-drill/060-configuring-a-shared-drillbit.md
+++ /dev/null
@@ -1,65 +0,0 @@
----
-title: "Configuring Resources for a Shared Drillbit"
-parent: "Configuring a Multitenant Cluster"
----
-To manage a cluster in which multiple users share a Drillbit, you configure Drill queuing and parallelization in addition to memory, as described in the previous section.
-
-##Configuring Drill Query Queuing
-
-Set [options in sys.options]({{site.baseurl}}/docs/configuration-options-introduction/) to enable and manage query queuing, which is turned off by default. There are two types of queues: large and small. You configure a maximum number of queries that each queue allows by configuring the following options in the `sys.options` table:
-
-* exec.queue.large  
-* exec.queue.small  
-
-### Example Configuration
-
-For example, you configure the queue reserved for large queries to hold a 5-query maximum. You configure the queue reserved for small queries to hold 20 queries. Users start to run queries, and Drill receives the following query requests in this order:
-
-* Query A (blue): 1 billion records, Drill estimates 10 million rows will be processed  
-* Query B (red): 2 billion records, Drill estimates 20 million rows will be processed  
-* Query C: 1 billion records  
-* Query D: 100 records
-
-The exec.queue.threshold default is 30 million, which is the estimated rows to be processed by the query. Queries A and B are queued in the large queue. The estimated rows to be processed reaches the 30 million threshold, filling the queue to capacity. The query C request arrives and goes on the wait list, and then query D arrives. Query D is queued immediately in the small queue because of its small size, as shown in the following diagram: 
-
-![drill queuing]({{ site.baseurl }}/docs/img/queuing.png)
-
-The Drill queuing configuration in this example tends to give many users running small queries a rapid response. Users running a large query might experience some delay until an earlier-received large query returns, freeing space in the large queue to process queries that are waiting.
-
-## Controlling Parallelization
-
-By default, Drill parallelizes operations when number of records manipulated within a fragment reaches 100,000. When parallelization of operations is high, the cluster operates as fast as possible, which is fine for a single user. In a contentious multi-tenant situation, however, you need to reduce parallelization to levels based on user needs.
-
-### Parallelization Configuration Procedure
-
-To configure parallelization, configure the following options in the `sys.options` table:
-
-* `planner.width.max.per.node`  
-  The maximum degree of distribution of a query across cores and cluster nodes.
-* `planner.width.max.per.query`  
-  Same as max per node but applies to the query as executed by the entire cluster.
-
-Configure the `planner.width.max.per.node` to achieve fine grained, absolute control over parallelization. 
-
-<!-- ??For example, setting the `planner.width.max.per.query` to 60 will not accelerate Drill operations because overlapping does not occur when executing 60 queries at the same time.??
-
-### Example of Configuring Parallelization
-
-For example, the default settings parallelize 70 percent of operations up to 1,000 cores. If you have 30 cores per node in a 10-node cluster, or 300 cores, parallelization occurs on approximately 210 cores. Consequently, a single user can get 70 percent usage from a cluster and no more due to the constraints configured by the `planner.width.max.per.query`.
-
-A parallelizer in the Foreman transforms the physical plan into multiple phases. A complicated query can have multiple, major fragments. A default parallelization of 70 percent of operations allows some overlap of query phases. In the example, 210 ??for each core or major fragment to a maximum of 410??.
-
-??Drill uses pipelines, blocking/nonblocking, memory is not fungible. CPU resources are fungible. There is contention for CPUs.?? -->
-
-## Data Isolation
-
-Tenants can share data on a cluster using Drill views and impersonation. ??Link to impersonation doc.??
-
-
-
-
-
-
-
-
-

