drill-commits mailing list archives

From tshi...@apache.org
Subject [1/9] drill git commit: BB's perf tuning
Date Tue, 19 May 2015 16:30:03 GMT
Repository: drill
Updated Branches:
  refs/heads/gh-pages d22ac4af0 -> 9c99ecfb7


BB's perf tuning


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/bb0710bc
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/bb0710bc
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/bb0710bc

Branch: refs/heads/gh-pages
Commit: bb0710bc89dca0f7f176d14862d9b70f4034dbea
Parents: d22ac4a
Author: Kristine Hahn <khahn@maprtech.com>
Authored: Tue May 19 00:23:37 2015 -0700
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Tue May 19 00:23:37 2015 -0700

----------------------------------------------------------------------
 _data/docs.json                                 | 576 ++++++++++++++++++-
 _docs/img/data_skew.png                         | Bin 0 -> 21902 bytes
 _docs/img/frag_profile.png                      | Bin 0 -> 79978 bytes
 _docs/img/graph_1.png                           | Bin 0 -> 21056 bytes
 _docs/img/list_queries.png                      | Bin 0 -> 89406 bytes
 _docs/img/maj_frag_block.png                    | Bin 0 -> 24238 bytes
 _docs/img/operator_block.png                    | Bin 0 -> 30127 bytes
 _docs/img/operator_table.png                    | Bin 0 -> 37785 bytes
 _docs/img/phys_plan_profile.png                 | Bin 0 -> 113422 bytes
 _docs/img/query_profile.png                     | Bin 0 -> 74428 bytes
 _docs/img/query_queuing.png                     | Bin 0 -> 17413 bytes
 _docs/img/submit_plan.png                       | Bin 0 -> 63266 bytes
 _docs/img/vis_graph.png                         | Bin 0 -> 40854 bytes
 _docs/img/xx-xx-xx.png                          | Bin 0 -> 10795 bytes
 .../performance-tuning/020-partition-pruning.md |  96 ++++
 .../030-choosing-a-storage-format.md            |  16 +
 .../010-query-plans-and-tuning-introduction.md  |   7 +
 .../020-join-planning-guidelines.md             |  43 ++
 ...030-guidelines-for-optimizing-aggregation.md |  21 +
 .../040-modifying-query-planning-options.md     |  34 ++
 ...d-hash-based-memory-constrained-operators.md |  39 ++
 .../060-enabling-query-queuing.md               |  48 ++
 ...to-balance-performance-with-multi-tenancy.md |  10 +
 .../010-query-plans.md                          |  74 +++
 .../020-query-profiles.md                       | 142 +++++
 25 files changed, 1090 insertions(+), 16 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_data/docs.json
----------------------------------------------------------------------
diff --git a/_data/docs.json b/_data/docs.json
index 6566216..7007676 100644
--- a/_data/docs.json
+++ b/_data/docs.json
@@ -597,6 +597,23 @@
             "title": "CREATE VIEW", 
             "url": "/docs/create-view/"
         }, 
+        "Choosing a Storage Format": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Query Plans and Tuning Introduction", 
+            "next_url": "/docs/query-plans-and-tuning-introduction/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Partition Pruning", 
+            "previous_url": "/docs/partition-pruning/", 
+            "relative_path": "_docs/performance-tuning/030-choosing-a-storage-format.md", 
+            "title": "Choosing a Storage Format", 
+            "url": "/docs/choosing-a-storage-format/"
+        }, 
         "Compiling Drill from Source": {
             "breadcrumbs": [
                 {
@@ -1650,6 +1667,23 @@
             "title": "Contribute to Drill", 
             "url": "/docs/contribute-to-drill/"
         }, 
+        "Controlling Parallelization to Balance Performance with Multi-Tenancy": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Query Plans", 
+            "next_url": "/docs/query-plans/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Enabling Query Queuing", 
+            "previous_url": "/docs/enabling-query-queuing/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md", 
+            "title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+            "url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/"
+        }, 
         "Core Modules": {
             "breadcrumbs": [
                 {
@@ -2788,6 +2822,23 @@
             "title": "Embedded Mode Prerequisites", 
             "url": "/docs/embedded-mode-prerequisites/"
         }, 
+        "Enabling Query Queuing": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+            "next_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+            "previous_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md", 
+            "title": "Enabling Query Queuing", 
+            "url": "/docs/enabling-query-queuing/"
+        }, 
         "Enron Emails": {
             "breadcrumbs": [
                 {
@@ -2970,6 +3021,23 @@
             "title": "Getting to Know the Drill Sandbox", 
             "url": "/docs/getting-to-know-the-drill-sandbox/"
         }, 
+        "Guidelines for Optimizing Aggregation": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Modifying Query Planning Options", 
+            "next_url": "/docs/modifying-query-planning-options/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Join Planning Guidelines", 
+            "previous_url": "/docs/join-planning-guidelines/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md", 
+            "title": "Guidelines for Optimizing Aggregation", 
+            "url": "/docs/guidelines-for-optimizing-aggregation/"
+        }, 
         "HBase Storage Plugin": {
             "breadcrumbs": [
                 {
@@ -3703,6 +3771,23 @@
             "title": "JSON Data Model", 
             "url": "/docs/json-data-model/"
         }, 
+        "Join Planning Guidelines": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Guidelines for Optimizing Aggregation", 
+            "next_url": "/docs/guidelines-for-optimizing-aggregation/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Query Plans and Tuning Introduction", 
+            "previous_url": "/docs/query-plans-and-tuning-introduction/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md", 
+            "title": "Join Planning Guidelines", 
+            "url": "/docs/join-planning-guidelines/"
+        }, 
         "KVGEN": {
             "breadcrumbs": [
                 {
@@ -4122,6 +4207,23 @@
             "title": "Modify logback.xml", 
             "url": "/docs/modify-logback-xml/"
         }, 
+        "Modifying Query Planning Options": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+            "next_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Guidelines for Optimizing Aggregation", 
+            "previous_url": "/docs/guidelines-for-optimizing-aggregation/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md", 
+            "title": "Modifying Query Planning Options", 
+            "url": "/docs/modifying-query-planning-options/"
+        }, 
         "MongoDB Plugin for Apache Drill": {
             "breadcrumbs": [
                 {
@@ -4799,17 +4901,17 @@
         "Partition Pruning": {
             "breadcrumbs": [
                 {
-                    "title": "Archived Pages", 
-                    "url": "/docs/archived-pages/"
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
                 }
             ], 
             "children": [], 
-            "next_title": "Progress Reports", 
-            "next_url": "/docs/progress-reports/", 
-            "parent": "Archived Pages", 
-            "previous_title": "What is Apache Drill", 
-            "previous_url": "/docs/what-is-apache-drill/", 
-            "relative_path": "_docs/archived-pages/030-partition-pruning.md", 
+            "next_title": "Choosing a Storage Format", 
+            "next_url": "/docs/choosing-a-storage-format/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Performance Tuning Introduction", 
+            "previous_url": "/docs/performance-tuning-introduction/", 
+            "relative_path": "_docs/performance-tuning/020-partition-pruning.md", 
             "title": "Partition Pruning", 
             "url": "/docs/partition-pruning/"
         }, 
@@ -4841,14 +4943,201 @@
                         }
                     ], 
                     "children": [], 
-                    "next_title": "Log and Debug", 
-                    "next_url": "/docs/log-and-debug/", 
+                    "next_title": "Partition Pruning", 
+                    "next_url": "/docs/partition-pruning/", 
                     "parent": "Performance Tuning", 
                     "previous_title": "Performance Tuning", 
                     "previous_url": "/docs/performance-tuning/", 
                     "relative_path": "_docs/performance-tuning/010-performance-tuning-introduction.md", 
                     "title": "Performance Tuning Introduction", 
                     "url": "/docs/performance-tuning-introduction/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Choosing a Storage Format", 
+                    "next_url": "/docs/choosing-a-storage-format/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Performance Tuning Introduction", 
+                    "previous_url": "/docs/performance-tuning-introduction/", 
+                    "relative_path": "_docs/performance-tuning/020-partition-pruning.md", 
+                    "title": "Partition Pruning", 
+                    "url": "/docs/partition-pruning/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Plans and Tuning Introduction", 
+                    "next_url": "/docs/query-plans-and-tuning-introduction/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Partition Pruning", 
+                    "previous_url": "/docs/partition-pruning/", 
+                    "relative_path": "_docs/performance-tuning/030-choosing-a-storage-format.md", 
+                    "title": "Choosing a Storage Format", 
+                    "url": "/docs/choosing-a-storage-format/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Join Planning Guidelines", 
+                    "next_url": "/docs/join-planning-guidelines/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Choosing a Storage Format", 
+                    "previous_url": "/docs/choosing-a-storage-format/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md", 
+                    "title": "Query Plans and Tuning Introduction", 
+                    "url": "/docs/query-plans-and-tuning-introduction/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Guidelines for Optimizing Aggregation", 
+                    "next_url": "/docs/guidelines-for-optimizing-aggregation/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Query Plans and Tuning Introduction", 
+                    "previous_url": "/docs/query-plans-and-tuning-introduction/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md", 
+                    "title": "Join Planning Guidelines", 
+                    "url": "/docs/join-planning-guidelines/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Modifying Query Planning Options", 
+                    "next_url": "/docs/modifying-query-planning-options/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Join Planning Guidelines", 
+                    "previous_url": "/docs/join-planning-guidelines/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md", 
+                    "title": "Guidelines for Optimizing Aggregation", 
+                    "url": "/docs/guidelines-for-optimizing-aggregation/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "next_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Guidelines for Optimizing Aggregation", 
+                    "previous_url": "/docs/guidelines-for-optimizing-aggregation/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md", 
+                    "title": "Modifying Query Planning Options", 
+                    "url": "/docs/modifying-query-planning-options/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Enabling Query Queuing", 
+                    "next_url": "/docs/enabling-query-queuing/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Modifying Query Planning Options", 
+                    "previous_url": "/docs/modifying-query-planning-options/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md", 
+                    "title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "url": "/docs/sort-based-and-hash-based-memory-constrained-operators/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "next_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "previous_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md", 
+                    "title": "Enabling Query Queuing", 
+                    "url": "/docs/enabling-query-queuing/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Plans", 
+                    "next_url": "/docs/query-plans/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Enabling Query Queuing", 
+                    "previous_url": "/docs/enabling-query-queuing/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md", 
+                    "title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Profiles", 
+                    "next_url": "/docs/query-profiles/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "previous_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+                    "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md", 
+                    "title": "Query Plans", 
+                    "url": "/docs/query-plans/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Log and Debug", 
+                    "next_url": "/docs/log-and-debug/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Query Plans", 
+                    "previous_url": "/docs/query-plans/", 
+                    "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md", 
+                    "title": "Query Profiles", 
+                    "url": "/docs/query-profiles/"
                 }
             ], 
             "next_title": "Performance Tuning Introduction", 
@@ -4868,8 +5157,8 @@
                 }
             ], 
             "children": [], 
-            "next_title": "Log and Debug", 
-            "next_url": "/docs/log-and-debug/", 
+            "next_title": "Partition Pruning", 
+            "next_url": "/docs/partition-pruning/", 
             "parent": "Performance Tuning", 
             "previous_title": "Performance Tuning", 
             "previous_url": "/docs/performance-tuning/", 
@@ -5445,6 +5734,57 @@
             "title": "Query Directory Functions", 
             "url": "/docs/query-directory-functions/"
         }, 
+        "Query Plans": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Query Profiles", 
+            "next_url": "/docs/query-profiles/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+            "previous_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+            "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md", 
+            "title": "Query Plans", 
+            "url": "/docs/query-plans/"
+        }, 
+        "Query Plans and Tuning Introduction": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Join Planning Guidelines", 
+            "next_url": "/docs/join-planning-guidelines/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Choosing a Storage Format", 
+            "previous_url": "/docs/choosing-a-storage-format/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md", 
+            "title": "Query Plans and Tuning Introduction", 
+            "url": "/docs/query-plans-and-tuning-introduction/"
+        }, 
+        "Query Profiles": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Log and Debug", 
+            "next_url": "/docs/log-and-debug/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Query Plans", 
+            "previous_url": "/docs/query-plans/", 
+            "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md", 
+            "title": "Query Profiles", 
+            "url": "/docs/query-profiles/"
+        }, 
         "Query Stages": {
             "breadcrumbs": [
                 {
@@ -8315,6 +8655,23 @@
             "title": "Selecting Nested Data for a Column", 
             "url": "/docs/selecting-nested-data-for-a-column/"
         }, 
+        "Sort-Based and Hash-Based Memory-Constrained Operators": {
+            "breadcrumbs": [
+                {
+                    "title": "Performance Tuning", 
+                    "url": "/docs/performance-tuning/"
+                }
+            ], 
+            "children": [], 
+            "next_title": "Enabling Query Queuing", 
+            "next_url": "/docs/enabling-query-queuing/", 
+            "parent": "Performance Tuning", 
+            "previous_title": "Modifying Query Planning Options", 
+            "previous_url": "/docs/modifying-query-planning-options/", 
+            "relative_path": "_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md", 
+            "title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+            "url": "/docs/sort-based-and-hash-based-memory-constrained-operators/"
+        }, 
         "Start-Up Options": {
             "breadcrumbs": [
                 {
@@ -11542,14 +11899,201 @@
                         }
                     ], 
                     "children": [], 
-                    "next_title": "Log and Debug", 
-                    "next_url": "/docs/log-and-debug/", 
+                    "next_title": "Partition Pruning", 
+                    "next_url": "/docs/partition-pruning/", 
                     "parent": "Performance Tuning", 
                     "previous_title": "Performance Tuning", 
                     "previous_url": "/docs/performance-tuning/", 
                     "relative_path": "_docs/performance-tuning/010-performance-tuning-introduction.md", 
                     "title": "Performance Tuning Introduction", 
                     "url": "/docs/performance-tuning-introduction/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Choosing a Storage Format", 
+                    "next_url": "/docs/choosing-a-storage-format/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Performance Tuning Introduction", 
+                    "previous_url": "/docs/performance-tuning-introduction/", 
+                    "relative_path": "_docs/performance-tuning/020-partition-pruning.md", 
+                    "title": "Partition Pruning", 
+                    "url": "/docs/partition-pruning/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Plans and Tuning Introduction", 
+                    "next_url": "/docs/query-plans-and-tuning-introduction/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Partition Pruning", 
+                    "previous_url": "/docs/partition-pruning/", 
+                    "relative_path": "_docs/performance-tuning/030-choosing-a-storage-format.md", 
+                    "title": "Choosing a Storage Format", 
+                    "url": "/docs/choosing-a-storage-format/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Join Planning Guidelines", 
+                    "next_url": "/docs/join-planning-guidelines/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Choosing a Storage Format", 
+                    "previous_url": "/docs/choosing-a-storage-format/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md", 
+                    "title": "Query Plans and Tuning Introduction", 
+                    "url": "/docs/query-plans-and-tuning-introduction/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Guidelines for Optimizing Aggregation", 
+                    "next_url": "/docs/guidelines-for-optimizing-aggregation/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Query Plans and Tuning Introduction", 
+                    "previous_url": "/docs/query-plans-and-tuning-introduction/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md", 
+                    "title": "Join Planning Guidelines", 
+                    "url": "/docs/join-planning-guidelines/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Modifying Query Planning Options", 
+                    "next_url": "/docs/modifying-query-planning-options/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Join Planning Guidelines", 
+                    "previous_url": "/docs/join-planning-guidelines/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md", 
+                    "title": "Guidelines for Optimizing Aggregation", 
+                    "url": "/docs/guidelines-for-optimizing-aggregation/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "next_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Guidelines for Optimizing Aggregation", 
+                    "previous_url": "/docs/guidelines-for-optimizing-aggregation/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md", 
+                    "title": "Modifying Query Planning Options", 
+                    "url": "/docs/modifying-query-planning-options/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Enabling Query Queuing", 
+                    "next_url": "/docs/enabling-query-queuing/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Modifying Query Planning Options", 
+                    "previous_url": "/docs/modifying-query-planning-options/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md", 
+                    "title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "url": "/docs/sort-based-and-hash-based-memory-constrained-operators/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "next_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Sort-Based and Hash-Based Memory-Constrained Operators", 
+                    "previous_url": "/docs/sort-based-and-hash-based-memory-constrained-operators/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md", 
+                    "title": "Enabling Query Queuing", 
+                    "url": "/docs/enabling-query-queuing/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Plans", 
+                    "next_url": "/docs/query-plans/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Enabling Query Queuing", 
+                    "previous_url": "/docs/enabling-query-queuing/", 
+                    "relative_path": "_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md", 
+                    "title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Query Profiles", 
+                    "next_url": "/docs/query-profiles/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Controlling Parallelization to Balance Performance with Multi-Tenancy", 
+                    "previous_url": "/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/", 
+                    "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md", 
+                    "title": "Query Plans", 
+                    "url": "/docs/query-plans/"
+                }, 
+                {
+                    "breadcrumbs": [
+                        {
+                            "title": "Performance Tuning", 
+                            "url": "/docs/performance-tuning/"
+                        }
+                    ], 
+                    "children": [], 
+                    "next_title": "Log and Debug", 
+                    "next_url": "/docs/log-and-debug/", 
+                    "parent": "Performance Tuning", 
+                    "previous_title": "Query Plans", 
+                    "previous_url": "/docs/query-plans/", 
+                    "relative_path": "_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md", 
+                    "title": "Query Profiles", 
+                    "url": "/docs/query-profiles/"
                 }
             ], 
             "next_title": "Performance Tuning Introduction", 
@@ -11567,8 +12111,8 @@
             "next_title": "Query Audit Logging", 
             "next_url": "/docs/query-audit-logging/", 
             "parent": "", 
-            "previous_title": "Performance Tuning Introduction", 
-            "previous_url": "/docs/performance-tuning-introduction/", 
+            "previous_title": "Query Profiles", 
+            "previous_url": "/docs/query-profiles/", 
             "relative_path": "_docs/073-log-and-debug.md", 
             "title": "Log and Debug", 
             "url": "/docs/log-and-debug/"

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/data_skew.png
----------------------------------------------------------------------
diff --git a/_docs/img/data_skew.png b/_docs/img/data_skew.png
new file mode 100755
index 0000000..97b8121
Binary files /dev/null and b/_docs/img/data_skew.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/frag_profile.png
----------------------------------------------------------------------
diff --git a/_docs/img/frag_profile.png b/_docs/img/frag_profile.png
new file mode 100755
index 0000000..9885534
Binary files /dev/null and b/_docs/img/frag_profile.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/graph_1.png
----------------------------------------------------------------------
diff --git a/_docs/img/graph_1.png b/_docs/img/graph_1.png
new file mode 100755
index 0000000..52d1216
Binary files /dev/null and b/_docs/img/graph_1.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/list_queries.png
----------------------------------------------------------------------
diff --git a/_docs/img/list_queries.png b/_docs/img/list_queries.png
new file mode 100755
index 0000000..f892997
Binary files /dev/null and b/_docs/img/list_queries.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/maj_frag_block.png
----------------------------------------------------------------------
diff --git a/_docs/img/maj_frag_block.png b/_docs/img/maj_frag_block.png
new file mode 100755
index 0000000..71d3407
Binary files /dev/null and b/_docs/img/maj_frag_block.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/operator_block.png
----------------------------------------------------------------------
diff --git a/_docs/img/operator_block.png b/_docs/img/operator_block.png
new file mode 100755
index 0000000..26e2ce8
Binary files /dev/null and b/_docs/img/operator_block.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/operator_table.png
----------------------------------------------------------------------
diff --git a/_docs/img/operator_table.png b/_docs/img/operator_table.png
new file mode 100755
index 0000000..f9539bd
Binary files /dev/null and b/_docs/img/operator_table.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/phys_plan_profile.png
----------------------------------------------------------------------
diff --git a/_docs/img/phys_plan_profile.png b/_docs/img/phys_plan_profile.png
new file mode 100755
index 0000000..c26c8d4
Binary files /dev/null and b/_docs/img/phys_plan_profile.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/query_profile.png
----------------------------------------------------------------------
diff --git a/_docs/img/query_profile.png b/_docs/img/query_profile.png
new file mode 100755
index 0000000..9438f25
Binary files /dev/null and b/_docs/img/query_profile.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/query_queuing.png
----------------------------------------------------------------------
diff --git a/_docs/img/query_queuing.png b/_docs/img/query_queuing.png
new file mode 100755
index 0000000..7964d93
Binary files /dev/null and b/_docs/img/query_queuing.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/submit_plan.png
----------------------------------------------------------------------
diff --git a/_docs/img/submit_plan.png b/_docs/img/submit_plan.png
new file mode 100755
index 0000000..4ba89ed
Binary files /dev/null and b/_docs/img/submit_plan.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/vis_graph.png
----------------------------------------------------------------------
diff --git a/_docs/img/vis_graph.png b/_docs/img/vis_graph.png
new file mode 100755
index 0000000..25488c9
Binary files /dev/null and b/_docs/img/vis_graph.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/img/xx-xx-xx.png
----------------------------------------------------------------------
diff --git a/_docs/img/xx-xx-xx.png b/_docs/img/xx-xx-xx.png
new file mode 100755
index 0000000..f5754ba
Binary files /dev/null and b/_docs/img/xx-xx-xx.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/020-partition-pruning.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/020-partition-pruning.md b/_docs/performance-tuning/020-partition-pruning.md
new file mode 100755
index 0000000..8babc8d
--- /dev/null
+++ b/_docs/performance-tuning/020-partition-pruning.md
@@ -0,0 +1,96 @@
+---
+title: "Partition Pruning"
+parent: "Performance Tuning"
+--- 
+
+Partition pruning is a performance optimization that limits the number of files and partitions that Drill reads when querying file systems and Hive tables. When you partition data, Drill only reads a subset of the files that reside in a file system or a subset of the partitions in a Hive table when a query matches certain filter criteria.
+ 
+The query planner in Drill evaluates the filters as part of a Filter operator. If no partition filters are present, the underlying Scan operator reads all files in all directories and then sends the data to operators downstream, such as Filter. When partition filters are present, the query planner determines if it can push the filters down to the Scan such that the Scan only reads the directories that match the partition filters, thus reducing disk I/O.
+
+## Determining a Partitioning Scheme  
+
+You can organize your data in such a way that maximizes partition pruning in Drill to optimize performance. Currently, you must partition data manually for a query to take advantage of partition pruning in Drill.
+ 
+Partitioning data requires you to determine a partitioning scheme, or a logical way to store the data in a hierarchy of directories. You can then use CTAS to create Parquet files from the original data, specifying filter conditions, and then move the files into the correlating directories in the hierarchy. Once you have partitioned the data, you can create and query views on the data.
+ 
+### Partitioning Example
+For example, if you have several text files with log data which span multiple years, and you want to partition the data by year and quarter, you could create the following hierarchy of directories:  
+       
+       …/logs/1994/Q1  
+       …/logs/1994/Q2  
+       …/logs/1994/Q3  
+       …/logs/1994/Q4  
+       …/logs/1995/Q1  
+       …/logs/1995/Q2  
+       …/logs/1995/Q3  
+       …/logs/1995/Q4  
+       …/logs/1996/Q1  
+       …/logs/1996/Q2  
+       …/logs/1996/Q3  
+       …/logs/1996/Q4  
+
+Once the directory structure is in place, run CTAS with a filter condition on the year and quarter for Q1 1994.
+ 
+          CREATE TABLE TT_1994_Q1 
+              AS SELECT * FROM <raw table data in text format >
+              WHERE columns[1] = 1994 AND columns[2] = 'Q1'
+ 
+This creates a Parquet file with the log data for Q1 1994 in the current workspace.  You can then move the file into the correlating directory, and repeat the process until all of the files are stored in their respective directories.
+
+Now you can define views on the parquet files and query the views.  
+
+       0: jdbc:drill:zk=local> create view vv1 as select `dir0` as `year`, `dir1` as `qtr` from dfs.`/Users/max/data/multilevel/parquet`;
+       +------------+------------+
+       |     ok     |  summary   |
+       +------------+------------+
+       | true       | View 'vv1' created successfully in 'dfs.tmp' schema |
+       +------------+------------+
+       1 row selected (0.16 seconds)  
+
+Query the view to see all of the logs.  
+
+       0: jdbc:drill:zk=local> select * from dfs.tmp.vv1;
+       +------------+------------+
+       |    year    |    qtr     |
+       +------------+------------+
+       | 1994       | Q1         |
+       | 1994       | Q3         |
+       | 1994       | Q3         |
+       | 1994       | Q4         |
+       | 1994       | Q4         |
+       | 1994       | Q4         |
+       | 1994       | Q4         |
+       | 1995       | Q2         |
+       | 1995       | Q2         |
+       | 1995       | Q2         |
+       | 1995       | Q2         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1995       | Q4         |
+       | 1996       | Q1         |
+       | 1996       | Q1         |
+       | 1996       | Q1         |
+       | 1996       | Q1         |
+       | 1996       | Q1         |
+       | 1996       | Q2         |
+       | 1996       | Q3         |
+       | 1996       | Q3         |
+       | 1996       | Q3         |
+       +------------+------------+
+       ...
+
+
+When you query the view, Drill can apply partition pruning and read only the files and directories required to return query results.
+
+       0: jdbc:drill:zk=local> explain plan for select * from dfs.tmp.vv1 where `year` = 1996 and qtr = 'Q2';
+       +------------+------------+
+       |    text    |    json    |
+       +------------+------------+
+       | 00-00    Screen
+       00-01      Project(year=[$0], qtr=[$1])
+       00-02        Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=file:/Users/max/data/multilevel/parquet/1996/Q2/orders_96_q2.parquet]], selectionRoot=/Users/max/data/multilevel/parquet, numFiles=1, columns=[`dir0`, `dir1`]]])
+       
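+
+As a further sketch, assuming the same `dfs.tmp.vv1` view, a filter on the year alone would be expected to let the planner restrict the Scan to the files under the `1996` directory. The plan output is omitted because it depends on the files present.
+
+```sql
+-- Hypothetical variation of the query above: filter only on the first directory level.
+explain plan for select * from dfs.tmp.vv1 where `year` = 1996;
+```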

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/030-choosing-a-storage-format.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/030-choosing-a-storage-format.md b/_docs/performance-tuning/030-choosing-a-storage-format.md
new file mode 100755
index 0000000..d339bc3
--- /dev/null
+++ b/_docs/performance-tuning/030-choosing-a-storage-format.md
@@ -0,0 +1,16 @@
+---
+title: "Choosing a Storage Format"
+parent: "Performance Tuning"
+--- 
+Drill supports several file formats for data including CSV, TSV, PSV, JSON, and Parquet. Changing the default format is a typical functional change that can optimize performance. Drill runs fastest against Parquet files because Parquet data representation is almost identical to how Drill represents data.
+
+Optimized for working with large files, Parquet arranges data in columns, putting related values in close proximity to each other to optimize query performance, minimize I/O, and facilitate compression. Parquet detects and encodes the same or similar data using a technique that conserves resources.
+
+When using Parquet as the storage format, balance the number of files against the file size to achieve maximum parallelization. See [Configuring the Size of Parquet Files]({{ site.baseurl }}/docs/parquet-format/#configuring-the-size-of-parquet-files).  
+
+When a read of Parquet data occurs, Drill loads only the necessary columns of data, which reduces I/O. Reading only a small piece of the Parquet data from a data file or table, Drill can examine and analyze all values for a column across multiple files.
+ 
+Because SQL does not support all Parquet data types, to prevent Drill from inferring a type other than the one you want, you can use the [CAST or CONVERT functions]({{ site.baseurl }}/docs/data-type-conversion/#cast). See [Data Type Conversion]({{ site.baseurl }}/docs/data-type-conversion/).
+ 
+See [Parquet Format]({{ site.baseurl }}/docs/parquet-format/) for more information about Parquet with Drill. You may also be interested in the [JSON Data Model]({{ site.baseurl }}/docs/json-data-model/), [Data Sources and File Formats Introduction]({{ site.baseurl }}/docs/data-sources-and-file-formats-introduction/), and [Supported Data Types]({{ site.baseurl }}/docs/supported-data-types/).
+

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md b/_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md
new file mode 100755
index 0000000..0461b06
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/010-query-plans-and-tuning-introduction.md
@@ -0,0 +1,7 @@
+---
+title: "Query Plans and Tuning Introduction"
+parent: "Performance Tuning"
+---
+
+You can modify several options that affect how Drill plans a query.  This section describes some options that you can modify to improve performance.  
+

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md b/_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md
new file mode 100755
index 0000000..aa8a1ce
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/020-join-planning-guidelines.md
@@ -0,0 +1,43 @@
+---
+title: "Join Planning Guidelines"
+parent: "Performance Tuning"
+--- 
+
+Drill uses distributed and broadcast joins to join tables. You can modify configuration settings in Drill to control how Drill plans joins in a query.
+
+## Distributed Joins
+For a distributed join, both sides of the join are hash distributed using one of the hash-based distribution operators on the join key. See Operators. 
+
+If there are multiple join keys from each table, Drill considers the two following types of plans:  
+1. A plan where data is distributed on all keys.  
+2. A plan where data is distributed on each individual key.  
+ 
+For a merge join, Drill sorts both sides of the join after performing the hash distribution. Drill can distribute both sides of a hash join or merge join, but cannot do so for a nested loop join. 
+
+## Broadcast Joins
+In a broadcast join, all of the selected records of one file are broadcast to the file on all other nodes before the join is performed. The inner side of the join is broadcast while the outer side is kept as-is without any re-distribution. The estimated cardinality of the inner child must be below the planner.broadcast_threshold parameter in order to be eligible for broadcast.  Drill can use broadcast joins for hash, merge, and nested loop joins.
+ 
+A broadcast join is useful when a large (fact) table is being joined to a relatively smaller (dimension) table. If the fact table is stored as many files in the distributed file system, instead of re-distributing the fact table over the network, it may be substantially cheaper to broadcast the inner side.  However, the broadcast sends the same data to all other nodes in the cluster.  Depending on the size of the cluster and the size of the data, it may not be the most efficient policy in some situations.
+ 
+### Broadcast Join Options
+You can increase the size and affinity for Drill to use broadcast joins with the ALTER SYSTEM or ALTER SESSION commands and options. Typically, you set the options at the session level unless you want the setting to persist across all sessions.
+
+The following configuration options in Drill control broadcast join behavior:  
+
+* **planner.broadcast_factor** 
+
+     Controls the cost of doing a broadcast when performing a join.  The lower the setting, the cheaper it is to do a broadcast join compared to other types of distribution for a join, such as a hash distribution.  
+
+     Default: 1 Range: 0-1.7976931348623157e+308
+
+* **planner.enable\_broadcast_join**  
+
+     Changes the state of aggregation and join operators. The broadcast join can be used for hash join, merge join, and nested loop join. Use to join a large (fact) table to relatively smaller (dimension) tables.  
+
+     Default: true 
+
+* **planner.broadcast_threshold**  
+
+    Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)  
+    
+    Default: 10000000 Range: 0-2147483647
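+
+As a minimal sketch of the options above, the following statements raise the broadcast threshold and then turn broadcast joins off for the current session. The value 20000000 is an illustrative assumption, not a recommendation.
+
+```sql
+-- Allow larger inner sides (by estimated row count) to qualify for broadcast.
+ALTER SESSION SET `planner.broadcast_threshold` = 20000000;
+
+-- Disable broadcast joins for this session.
+ALTER SESSION SET `planner.enable_broadcast_join` = false;
+```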

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md b/_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md
new file mode 100755
index 0000000..4b81b42
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/030-guidelines-for-optimizing-aggregation.md
@@ -0,0 +1,21 @@
+---
+title: "Guidelines for Optimizing Aggregation"
+parent: "Performance Tuning"
+--- 
+
+
+For queries that contain GROUP BY, Drill performs aggregations in either 1 or 2 phases.  In both of these schemes, Drill can use the Hash Aggregate and Streaming Aggregate physical operators.  The default behavior in Drill is to perform 2 phase aggregation.  
+ 
+In the 2 phase aggregation scheme, each minor fragment performs local (partial) aggregation in phase 1.  It then sends the partially aggregated results to other fragments using a hash-based distribution operator.  The hash distribution is done on the GROUP BY keys.  In phase 2 all of the fragments perform a total aggregation using data received from phase 1.  
+ 
+The 2 phase aggregation scheme is very efficient when the data contains grouping keys with a reasonable number of duplicate values such that doing the grouping reduces the number of rows sent to downstream operators.  However, if there is not much reduction it is best to use 1 phase aggregation.   
+ 
+For example, suppose the query does a GROUP BY x, y.  If the combination of {x, y} values is unique (or nearly unique) in all of the rows of the input data, then there is no reduction in the number of rows when performing the grouping.  In this case, performance improves by doing 1 phase aggregation.  
+ 
+You can use the ALTER SYSTEM or ALTER SESSION commands with the following option to control aggregation in Drill:
+
+*  planner.enable\_multiphase\_agg 
+
+ 
+The default for this option is `true`. Typically, you set the option at the session level unless you want the setting to persist across all sessions.
+ 
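+For example, assuming a query that groups on nearly unique keys, a session could switch to 1 phase aggregation as sketched here:
+
+```sql
+-- Turn off 2 phase aggregation for the current session only.
+ALTER SESSION SET `planner.enable_multiphase_agg` = false;
+```
+ 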
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md b/_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md
new file mode 100755
index 0000000..fea76e3
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/040-modifying-query-planning-options.md
@@ -0,0 +1,34 @@
+---
+title: "Modifying Query Planning Options"
+parent: "Performance Tuning"
+--- 
+
+Planner options affect how Drill plans a query. You can use the ALTER SYSTEM|SESSION commands to modify certain planning options to optimize query plans and improve performance.  Typically, you modify options at the session level. See [ALTER SESSION]({{ site.baseurl }}/docs/alter-session/) for details on how to run the command.
+ 
+The following planning options affect query planning and performance:
+
+* **planner.width.max\_per_node** 
+
+     Default is 3. Configure this option to achieve fine grained, absolute control over parallelization.
+
+     In this context width refers to fan out or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.
+ 
+     The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The default maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded to a whole number) so that only 70% of the actual available capacity is taken into account: number of active drillbits (typically one per node) * number of cores per node * 0.7
+ 
+     For example, on a single-node test system with 2 cores and hyper-threading enabled: 1 * 4 * 0.7 = 2.8, which rounds to the default of 3.
+     When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.  
+
+* **planner.width.max\_per_query**  
+
+     Default is 1000. The maximum number of threads that can run in parallel for a query across all nodes. Only change this setting when Drill over-parallelizes on very large clusters.
+ 
+* **planner.slice_target**  
+
+     Default is 100000. The minimum number of estimated records to work with in a major fragment before applying additional parallelization.
+ 
+* **planner.broadcast_threshold**  
+
+     Default is 10000000. The maximum number of records allowed to be broadcast as part of a join. After ten million records, Drill reshuffles data rather than doing a broadcast to one side of the join. To improve performance you can increase this number, especially on 10GB Ethernet clusters.
+ 
+
+
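+As a sketch of how these planning options might be inspected and changed for one session, the statements below query the `sys.options` table and then override two settings. The values 4 and 50000 are illustrative assumptions only.
+
+```sql
+-- List the current width-related planner options.
+SELECT * FROM sys.options WHERE name LIKE 'planner.width%';
+
+-- Override the per-node width; Drill does not scale this value down.
+ALTER SESSION SET `planner.width.max_per_node` = 4;
+
+-- Lower the slice target so that smaller major fragments are parallelized.
+ALTER SESSION SET `planner.slice_target` = 50000;
+```
+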

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md b/_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md
new file mode 100755
index 0000000..561aba9
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/050-sort-based-and-hash-based-memory-constrained-operators.md
@@ -0,0 +1,39 @@
+---
+title: "Sort-Based and Hash-Based Memory-Constrained Operators"
+parent: "Performance Tuning"
+--- 
+
+Drill uses hash-based and sort-based operators depending on the query characteristics. Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, the hash aggregate and hash join operators are the fastest and most memory-intensive operators.
+ 
+Currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. When Drill plans a sort-based query, it evaluates the size of available memory multiplied by a configurable reduction constant (for parallelization purposes) and then limits the sort-based operations to the maximum of this amount of memory.
+
+If the hash-based operators run out of memory during execution, the query fails. If large hash operations do not fit in memory on your system, you can disable these operations. When disabled, Drill creates alternative plans that allow spilling to disk.
+
+You can also modify the minimum hash table size, increasing the size for very large aggregations or joins when you have large amounts of memory for Drill to use. If you have large data sets, you can increase this hash table size to improve performance.
+ 
+Use the ALTER SYSTEM or ALTER SESSION commands with the options in the table below to disable the hash aggregate and hash join operators, modify the hash table size, disable memory estimation, or set the estimated maximum amount of memory for a query. Typically, you set the options at the session level unless you want the setting to persist across all sessions.
+
+The following options control the hash-based operators:
+
+* **planner.enable_hashagg**  
+    Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Does not write to disk. Enable is recommended. Default: true
+
+* **planner.enable_hashjoin**  
+    Enable the memory hungry hash join. Drill assumes that a query will have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. Does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint. Default: true
+
+* **exec.min_hash_table_size**  
+    Starting size for hash tables. Increase according to available memory to improve performance.  
+    Default: 65536 Range: 0 - 1073741824
+
+* **exec.max\_hash\_table_size**  
+    Ending size for hash tables.  
+    Default: 1073741824 Range: 0 - 1073741824
+
+* **planner.memory.enable\_memory_estimation**  
+    Toggles the state of memory estimation and re-planning of the query. When enabled, Drill conservatively estimates memory requirements and typically excludes memory-constrained operators from the plan, which negatively impacts performance.  
+    Default: false
+
+
+* **planner.memory.max\_query\_memory\_per_node**  
+    Sets the maximum estimate of memory for a query per node. If the estimate is too low, Drill re-plans the query without memory-constrained operators.  
+    Default: 2147483648
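+
+As a minimal sketch of how these options might be applied, the following statements disable the hash join for the current session, raise the starting hash table size, and then list the current values of the hash operator options. The value 131072 is an illustrative assumption, not a recommendation, and the `sys.options` query assumes that this system table is available in your Drill version.
+
+```sql
+-- Disable the hash join for this session only
+ALTER SESSION SET `planner.enable_hashjoin` = false;
+
+-- Illustrative value only: double the default starting hash table size
+ALTER SESSION SET `exec.min_hash_table_size` = 131072;
+
+-- List the current values of the hash operator options
+SELECT * FROM sys.options WHERE name LIKE 'planner.enable_hash%';
+```
+
+Replace ALTER SESSION with ALTER SYSTEM if you want a setting to persist across all sessions.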

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md b/_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md
new file mode 100755
index 0000000..671c1cd
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/060-enabling-query-queuing.md
@@ -0,0 +1,48 @@
+---
+title: "Enabling Query Queuing"
+parent: "Performance Tuning"
+--- 
+
+Drill runs all queries concurrently by default. However, Drill performance increases when a small number of queries run concurrently. You can enable query queues to limit the maximum number of queries that run concurrently. Splitting large queries into multiple small queries and enabling query queuing improves query performance.
+ 
+When you enable query queuing, you configure large and small queues. Drill determines which queue to route a query to at runtime based on the size of the query. Drill can quickly complete the queries and then continue on to the next set of queries.
+
+## Example Configuration  
+
+For example, you configure the queue reserved for large queries for a 5-query maximum. You configure the queue reserved for small queries for 20 queries. Users start to run queries, and Drill receives the following query requests in this order:  
+
+* Query A (blue): 1 billion records, Drill estimates 10 million rows will be processed
+* Query B (red): 2 billion records, Drill estimates 20 million rows will be processed
+* Query C: 1 billion records
+* Query D: 100 records
+ 
+The exec.queue.threshold default is 30 million, which is the estimated rows to be processed by the query. Queries A and B are queued in the large queue. The estimated rows to be processed reaches the 30 million threshold, filling the queue to capacity. The query C request arrives and goes on the wait list, and then query D arrives. Query D is queued immediately in the small queue because of its small size, as shown in the following diagram:
+
+![]({{ site.baseurl }}/docs/img/query_queuing.png)  
+
+The Drill queuing configuration in this example tends to give many users running small queries a rapid response. Users running a large query might experience some delay until an earlier-received large query returns, freeing space in the large queue to process queries that are waiting.
+
+Use the ALTER SYSTEM or ALTER SESSION commands with the options below to enable query queuing and set the maximum number of queries that each queue allows. Typically, you set the options at the session level unless you want the setting to persist across all sessions.
+
+
+* **exec.queue.enable**  
+    Changes the state of query queues to control the number of queries that run simultaneously. When disabled, there is no limit on the number of concurrent queries.  
+    Default: false
+
+* **exec.queue.large**  
+    Sets the number of large queries that can run concurrently in the cluster.  
+    Range: 0-1000. Default: 10
+
+* **exec.queue.small**  
+    Sets the number of small queries that can run concurrently in the cluster.  
+    Range: 0-1001. Default: 100
+
+* **exec.queue.threshold**  
+    Sets the cost threshold, which depends on the complexity of the queries in queue, for determining whether a query is large or small. Complex queries have higher thresholds.  
+    Range: 0-9223372036854775807 Default: 30000000
+
+* **exec.queue.timeout_millis**  
+    Indicates how long a query can wait in queue before the query fails.  
+    Range: 0-9223372036854775807 Default: 300000
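+
+The example configuration described earlier in this section could be set up with statements along the following lines. This is a sketch that assumes the options are set at the system level so that they apply to all sessions; the queue sizes 5 and 20 come from the example.
+
+```sql
+-- Turn on query queuing
+ALTER SYSTEM SET `exec.queue.enable` = true;
+
+-- Allow at most 5 large and 20 small queries to run concurrently
+ALTER SYSTEM SET `exec.queue.large` = 5;
+ALTER SYSTEM SET `exec.queue.small` = 20;
+
+-- 30000000 is already the default; shown here only for completeness
+ALTER SYSTEM SET `exec.queue.threshold` = 30000000;
+```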
+
+

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md b/_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md
new file mode 100755
index 0000000..7fed3fd
--- /dev/null
+++ b/_docs/performance-tuning/query-plans-and-tuning/070-controlling-parallelization-to-balance-performance-with-multi-tenancy.md
@@ -0,0 +1,10 @@
+---
+title: "Controlling Parallelization to Balance Performance with Multi-Tenancy"
+parent: "Performance Tuning"
+--- 
+
+When you run Drill in a multi-tenant environment (in conjunction with other workloads in a cluster, such as MapReduce), you may need to modify Drill settings and options to maximize performance, or reduce the resources allocated to other applications. See [Configuring Multi-Tenant Resources]({{ site.baseurl }}/docs/configuring-multitenant-resources/).
+Drill is memory intensive and therefore requires sufficient memory to run optimally. You can modify how much memory is allocated to Drill. Drill typically performs better with as much memory as possible. See [Configuring Drill Memory]({{ site.baseurl }}/docs/configuring-drill-memory/).
+ 
+Reducing the level of parallelism in Drill can also help to balance the workloads and avoid resource conflicts. See [Configuring Parallelization]({{ site.baseurl }}/docs/configuring-resources-for-a-shared-drillbit/#configuring-parallelization).
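+
+As a hedged sketch, parallelism might be reduced for a session with a statement such as the following. The option name `planner.width.max_per_node` is not described on this page and is assumed from the Configuring Parallelization documentation; the value 2 is purely illustrative.
+
+```sql
+-- Assumed option: limits the number of minor fragments of one major fragment per node
+ALTER SESSION SET `planner.width.max_per_node` = 2;
+```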
+

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md b/_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md
new file mode 100755
index 0000000..9ebd0c2
--- /dev/null
+++ b/_docs/performance-tuning/where-to-identify-performance-issues/010-query-plans.md
@@ -0,0 +1,74 @@
+---
+title: "Query Plans"
+parent: "Performance Tuning"
+---
+If you experience performance issues in Drill, you can typically identify the source of the issues in the query plans or profiles. This section describes the logical and physical plans.
+
+## Query Plans  
+
+Drill has an optimizer and a parallelizer that work together to plan a query. Drill creates logical, physical, and execution plans based on the available statistics for an associated set of files or data sources. The number of running Drill nodes and configured runtime settings contribute to how Drill plans and executes a query.
+ 
+You can use [EXPLAIN commands]({{ site.baseurl }}/docs/explain-commands/) to view the logical and physical plans for a query; however, you cannot view the execution plan. To see how Drill executed a query, you can view the query profile in the Drill Web UI at <drill\_node\_ip_address\>:8047.
+
+### Logical Plan  
+
+A logical plan is a collection of logical operators that describe the work required to generate query results and define which data sources and operators to apply. The parser in Drill converts SQL operators into a logical operator syntax that Drill understands to create the logical plan. You can view the logical plan to see the planned operators. Modifying and resubmitting the logical plan to Drill (through submit_plan) is not very useful because Drill has not determined parallelization at this stage of planning.
+
+### Physical Plan  
+
+A physical plan describes the chosen physical execution plan for a query statement. The optimizer applies various types of rules to rearrange operators and functions into an optimal plan and then converts the logical plan into a physical plan that tells Drill how to execute the query.
+ 
+You can review a physical plan to troubleshoot issues, modify the plan, and then submit the plan back to Drill. For example, if you run into a casting error or want to change the join ordering of tables to see if the query runs faster, you can modify the physical plan to address the issue and then submit it back to Drill to run the query.
+ 
+Drill transforms the physical plan into an execution tree of minor fragments that run simultaneously on the cluster to carry out execution tasks. See Query Execution. You can view the activity of the fragments that executed a query in the query profile. See Query Profiles.
+
+**Viewing the Physical Plan**  
+
+You can run the EXPLAIN command to view the physical plan for a query with or without costing information. See EXPLAIN for Physical Plans and Costing Information. Analyze the cost-based query plan to identify the types of operators that Drill plans to use for the query and how much memory they will require. 
+
+Read the text output from bottom to top to understand the sequence of operators planned to execute the query. You can also view a visual representation of the physical plan in the Profile view of the Drill Web UI. See Query Profiles. You can modify the detailed JSON output, and submit it back to Drill through the Drill Web UI.
+
+The physical plan shows the major fragments and specific operators with correlating MajorFragmentIDs and OperatorIDs. See Operators. A major fragment is an abstract concept that represents a phase of the query execution. Major fragments do not perform any query tasks.
+ 
+The physical plan displays the IDs in the following format:
+ 
+<MajorFragmentID\> - <OperatorID\>
+ 
+For example, 00-02, where 00 is the MajorFragmentID and 02 is the OperatorID.
+ 
+If you view the plan with costing information, you can see where the majority of resources, in terms of I/O, CPU, and memory, will be spent when Drill executes the query. If joining tables, your query plan should include broadcast joins.
+
+**Example EXPLAIN PLAN**
+  
+
+       0: jdbc:drill:zk=local> explain plan for select type t, count(distinct id) from dfs.`/home/donuts/donuts.json` where type='donut' group by type;
+       +------------+------------+
+       |   text    |   json    |
+       +------------+------------+
+       | 00-00 Screen
+       00-01   Project(t=[$0], EXPR$1=[$1])
+       00-02       Project(t=[$0], EXPR$1=[$1])
+       00-03       HashAgg(group=[{0}], EXPR$1=[COUNT($1)])
+       00-04           HashAgg(group=[{0, 1}])
+       00-05           SelectionVectorRemover
+       00-06               Filter(condition=[=($0, 'donut')])
+       00-07               Scan(groupscan=[EasyGroupScan [selectionRoot=/home/donuts/donuts.json, numFiles=1, columns=[`type`, `id`], files=[file:/home/donuts/donuts.json]]])...
+       …
+       
+         
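+To include costing information, or to view the logical plan instead of the physical plan, the same query might be run with the following EXPLAIN variants. This is a sketch based on the EXPLAIN command syntax; output is omitted because it depends on the data and the Drill version.
+
+```sql
+-- Physical plan with costing information
+EXPLAIN PLAN INCLUDING ALL ATTRIBUTES FOR
+SELECT type t, COUNT(DISTINCT id) FROM dfs.`/home/donuts/donuts.json` WHERE type='donut' GROUP BY type;
+
+-- Logical plan only
+EXPLAIN PLAN WITHOUT IMPLEMENTATION FOR
+SELECT type t, COUNT(DISTINCT id) FROM dfs.`/home/donuts/donuts.json` WHERE type='donut' GROUP BY type;
+```
+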
+**Modifying and Submitting a Physical Plan to Drill**
+
+You can test the performance of a physical plan that Drill generates, modify the plan and then re-submit it to Drill. For example, you can modify the plan to change the join ordering of tables. You can also submit physical plans created outside of Drill through the Drill Web UI.
+ 
+**Note:** Only advanced users who know about query planning should modify and re-submit a physical plan.
+ 
+To modify and re-submit a physical plan to Drill, complete the following steps:  
+
+1. Run EXPLAIN PLAN FOR <query\> to see the physical plan for your query.  
+2. Copy the JSON output of the physical plan, and modify as needed.  
+3. Navigate to the Drill Web UI at <drill\_node\_ip_address\>:8047.  
+4. Select **Query** in the menu bar.  
+![]({{ site.baseurl }}/docs/img/submit_plan.png)  
+
+5. Select the **Physical Plan** radio button under Query Type.  
+6. Paste the physical plan into the Query field, and click **Submit**. Drill runs the plan and executes the query.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/bb0710bc/_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md b/_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md
new file mode 100755
index 0000000..850d8e3
--- /dev/null
+++ b/_docs/performance-tuning/where-to-identify-performance-issues/020-query-profiles.md
@@ -0,0 +1,142 @@
+---
+title: "Query Profiles"
+parent: "Performance Tuning"
+---
+
+A profile is a summary of metrics collected for each query that Drill executes. Query profiles provide information that you can use to monitor and analyze query performance. Drill creates a query profile from major, minor, operator, and input stream profiles. Each major fragment profile consists of a list of minor fragment profiles. Each minor fragment profile consists of a list of operator profiles. An operator profile consists of a list of input stream profiles. 
+
+You can view aggregate statistics across profile lists in the Profiles tab of the Drill Web UI at <drill\_node\_ip_address\>:8047. You can modify and resubmit queries, or cancel queries. For debugging purposes, you can use profiles in conjunction with Drill logs. See Log and Debug.
+ 
+Metrics in a query profile are associated with a coordinate system of IDs. Drill uses a coordinate system comprised of query, fragment, and operator identifiers to track query execution activities and resources. Drill assigns a unique QueryID to each query received and then assigns IDs to each fragment and operator that executes the query.
+ 
+**Example IDs**
+
+QueryID: 2aa98add-15b3-e155-5669-603c03bfde86
+ 
+Fragment and operator IDs:  
+
+![]({{ site.baseurl }}/docs/img/xx-xx-xx.png)  
+
+## Viewing a Query Profile  
+
+When you select the Profiles tab in the Drill Web UI at <drill\_node_ip\_address\>:8047, you see a list of the last 100 queries that have run or that are currently running in the cluster.  
+
+![]({{ site.baseurl }}/docs/img/list_queries.png)
+
+
+You can click on any query to see its profile.  
+
+![]({{ site.baseurl }}/docs/img/query_profile.png)  
+
+When you select a profile, notice that the URL in the address bar contains the QueryID. For example, 2aa98add-15b3-e155-5669-603c03bfde86 in the following URL:
+
+       http://<drill_node>:8047/profiles/2aa98add-15b3-e155-5669-603c03bfde86
+ 
+The Query Profile section of the query profile summarizes a few key details about the query, including: 
+ 
+ * The state of the query, either running, completed, or failed.  
+ * The node operating as the Foreman; the Drillbit that receives a query from the client or application becomes the Foreman and drives the entire query. 
+ * The total number of minor fragments required to execute the query.
+
+If you scroll down, you can see the Fragment Profiles and Operator Profiles sections. 
+ 
+## Fragment Profiles  
+
+The Fragment Profiles section provides an Overview table and a major fragment block for each major fragment that executed the query. Each row in the Overview table provides the number of minor fragments that Drill parallelized from each major fragment, as well as aggregate time and memory metrics for the minor fragments.  
+
+![]({{ site.baseurl }}/docs/img/frag_profile.png)  
+
+See Major Fragment Profiles Table for column descriptions.
+ 
+When you look at the fragment profiles, you may notice that some major fragments were parallelized into substantially fewer minor fragments, but happen to have the highest runtime.  Or, you may notice certain minor fragments have a higher peak memory than others. When you notice these variations in execution, you can delve deeper into the profile by looking at the major fragment blocks.
+ 
+Below the Overview table are major fragment blocks. Each of these blocks corresponds to a row in the Overview table. You can expand the blocks to see metrics for all of the minor fragments that were parallelized from each major fragment, including the host on which each minor fragment ran. Each row in the major fragment table presents the fragment state, time metrics, memory metrics, and aggregate input metrics of each minor fragment.  
+
+![]({{ site.baseurl }}/docs/img/maj_frag_block.png)  
+
+When looking at the minor fragment metrics, verify the state of the fragment. A fragment can have a “failed” state, which could indicate an issue on the host. If the query itself fails, an operator may have run out of memory. If fragments running on a particular node are underperforming, there may be multi-tenancy issues that you can address.
+ 
+You can also see a graph that illustrates the activity of major and minor fragments for the duration of the query.  
+
+![]({{ site.baseurl }}/docs/img/graph_1.png)  
+
+If you see “stair steps” in the graph, this indicates that the execution work of the fragments is not distributed evenly. Stair steps in the graph typically occur for non-local reads on data. To address this issue, you can increase data replication, rewrite the data, or file a JIRA to get help with the issue.
+ 
+This graph correlates with the visualized plan graph in the Visualized Plan tab. Each color in the graph corresponds to the activity of one major fragment.  
+
+![]({{ site.baseurl }}/docs/img/vis_graph.png)  
+
+The visualized plan illustrates color-coded major fragments divided and labeled with the names of the operators used to complete each phase of the query. Exchange operators separate each major fragment. These operators represent a point where Drill can execute operations below them in parallel.  
+
+## Operator Profiles  
+
+Operator profiles describe each operator that performed relational operations during query execution. The Operator Profiles section provides an Overview table of the aggregate time and memory metrics for each operator within a major fragment.  
+
+![]({{ site.baseurl }}/docs/img/operator_table.png)  
+
+See Operator Profiles Table for column descriptions.
+ 
+Identify the operations that consume a majority of time and memory. You can potentially modify options related to the specific operators to improve performance.
+ 
+Below the Overview table are operator blocks, which you can expand to see metrics for each operator. Each of these blocks corresponds to a row in the Overview table. Each row in the Operator block presents time and memory metrics, as well as aggregate input metrics for each minor fragment.  
+
+![]({{ site.baseurl }}/docs/img/operator_block.png)  
+
+See Operator Block for column descriptions.
+ 
+Drill uses batches of records as a basic unit of work. The batches are pipelined between each operation.  Record batches are no larger than 64k records. While the target size of one record batch is generally 256k, they can scale to many megabytes depending on the query plan and the width of the records.
+
+The Max Records number for each minor fragment should be almost equivalent. If one minor fragment, or a very small number of minor fragments, performs the majority of the work, there may be data skew. To address data skew, you may need to change settings related to table joins or partition the data to balance the work.  
+
+### Data Skew Example
+The following query was run against TPC-DS data:
+
+       0: jdbc:drill:zk=local> select ss_customer_sk, count(*) as cnt from store_sales where ss_customer_sk is null or ss_customer_sk in (1, 2, 3, 4, 5) group by ss_customer_sk;
+       +-----------------+---------+
+       | ss_customer_sk  |   cnt   |
+       +-----------------+---------+
+       | null            | 129752  |
+       | 5               | 47      |
+       | 1               | 9       |
+       | 2               | 43      |
+       | 4               | 10      |
+       | 3               | 11      |
+       +-----------------+---------+
+       6 rows selected
+ 
+In the result set, notice that the 'null' group has 129752 values, while the other groups have far fewer values, between 9 and 47 each.  
+
+Looking at the operator profile for the hash aggregate in major fragment 00, you can see that out of 8 minor fragments, only minor fragment 1 is processing a substantially larger number of records when compared to the other minor fragments.  
+
+![]({{ site.baseurl }}/docs/img/data_skew.png)  
+
+In this example, there is inherent skew present in the data. Other types of skew may not strictly be data dependent, but can be introduced by a sub-optimal hash function or other issues in the product. In either case, examining the query profile helps understand why a query is slow. In the first scenario, it may be possible to run separate queries for the skewed and non-skewed values. In the second scenario, it is better to seek technical support.  
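+
+For the first scenario, a sketch of splitting the example query into separate queries for the skewed and non-skewed values might look like the following. Whether this improves performance depends on the data and the workload.
+
+```sql
+-- Skewed value: a single key, so no grouping is needed
+SELECT COUNT(*) AS cnt FROM store_sales WHERE ss_customer_sk IS NULL;
+
+-- Non-skewed values
+SELECT ss_customer_sk, COUNT(*) AS cnt FROM store_sales
+WHERE ss_customer_sk IN (1, 2, 3, 4, 5)
+GROUP BY ss_customer_sk;
+```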
+
+## Physical Plan View  
+
+The physical plan view provides statistics about the actual cost of the query operations in terms of memory, I/O, and CPU processing. You can use this profile to identify which operations consumed the majority of the resources during a query, modify the physical plan to address the cost-intensive operations, and submit the updated plan back to Drill. See [Costing Information]({{ site.baseurl }}/docs/explain/#costing-information).  
+
+![]({{ site.baseurl }}/docs/img/phys_plan_profile.png)  
+
+## Canceling a Query  
+
+You may want to cancel a query if it hangs or causes performance bottlenecks. You can cancel a query in the Profiles tab of the Drill Web UI.
+ 
+To cancel a query from the Drill Web UI, complete the following steps:  
+
+1. Navigate to the Drill Web UI at <drill\_node_ip\_address\>:8047.
+The Drill node from which you access the Drill Web UI must have an active Drillbit running.
+2. Select Profiles in the toolbar.
+A list of running and completed queries appears.
+3. Click the query for which you want to see the profile.
+4. Select **Edit Query**.
+5. Click **Cancel** query to cancel the query.  
+
+The following message appears:  
+
+       Cancelled query <QueryID>
+
+
+
+
+

