drill-commits mailing list archives

From bridg...@apache.org
Subject [4/5] drill git commit: add info from RN, wordsmith
Date Wed, 01 Jul 2015 01:33:26 GMT
add info from RN, wordsmith


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/951a7b2e
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/951a7b2e
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/951a7b2e

Branch: refs/heads/gh-pages
Commit: 951a7b2e23e3e88a21c8cc85050b3aa341c7c62a
Parents: 75390dd
Author: Kristine Hahn <khahn@maprtech.com>
Authored: Tue Jun 30 16:14:14 2015 -0700
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Tue Jun 30 16:14:14 2015 -0700

----------------------------------------------------------------------
 _docs/performance-tuning/020-partition-pruning.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/951a7b2e/_docs/performance-tuning/020-partition-pruning.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/020-partition-pruning.md b/_docs/performance-tuning/020-partition-pruning.md
index 3dd6816..5a5669e 100755
--- a/_docs/performance-tuning/020-partition-pruning.md
+++ b/_docs/performance-tuning/020-partition-pruning.md
@@ -12,21 +12,23 @@ The query planner in Drill performs partition pruning by evaluating the filters.
 You can partition data manually or automatically to take advantage of partition pruning in Drill. In Drill 1.0 and earlier, you need to organize your data in such a way to take advantage of partition pruning. In Drill 1.1.0 and later, if the data source is Parquet, you can partition data automatically using CTAS--no data organization tasks required. 
 
 ## Automatic Partitioning
-Automatic partitioning in Drill 1.1.0 and later occurs when you write Parquet date using the [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause/) clause in the CTAS statement.
+Automatic partitioning in Drill 1.1 and later occurs when you write Parquet data using the [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause/) clause in the CTAS statement. Unlike manual partitioning, no view is required, nor is it necessary to use the [dir* variables]({{site.baseurl}}/docs/querying-directories). The Parquet writer first sorts by the partition keys, and then creates a new file when it encounters a new value for the partition columns.
 
 Automatic partitioning creates separate files, but not separate directories, for different partitions. Each file contains exactly one partition value, but there can be multiple files for the same partition value.
 
-Partition pruning uses the Parquet column statistics to determine which columns to use to prune.
+Partition pruning uses the Parquet column statistics to determine which columns to use to prune. 
 
 ## Manual Partitioning
+
+Manual partitioning is directory-based. You perform the following steps to manually partition data.   
  
 1. Devise a logical way to store the data in a hierarchy of directories. 
 2. Use CTAS to create Parquet files from the original data, specifying filter conditions.
 3. Move the files into directories in the hierarchy. 
 
-After partitioning the data, create and query views on the data.
+After partitioning the data, you need to create a view of the partitioned data to query the data. You can use the [dir* variables]({{site.baseurl}}/docs/querying-directories) in queries to refer to subdirectories in your workspace path.
  
-### Manual Partitioning
+### Manual Partitioning Example
 
 Suppose you have text files containing several years of log data. To partition the data by year and quarter, create the following hierarchy of directories:  
        
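The "Automatic Partitioning" text in the diff above says the Parquet writer sorts by the partition keys and starts a new file whenever it sees a new partition value. The sketch below is a rough illustration of that idea in plain Python, not Drill's actual writer. The function name `write_partitioned` and the `max_rows_per_file` limit are hypothetical; the limit is only an assumed stand-in for whatever causes Drill to produce more than one file for the same partition value.

```python
from itertools import groupby
from operator import itemgetter


def write_partitioned(rows, partition_key, max_rows_per_file=2):
    """Simulate the described behavior: sort rows by the partition key,
    then start a new "file" (here, a list of rows) for each new key value.

    max_rows_per_file is a hypothetical limit, used only to show that one
    partition value may end up spread over several files.
    """
    files = []
    # Step 1: sort by the partition key.
    ordered = sorted(rows, key=itemgetter(partition_key))
    # Step 2: a new key value always starts a new file.
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        current = []
        for row in group:
            if len(current) == max_rows_per_file:
                # Assumed size limit reached: same value, additional file.
                files.append(current)
                current = []
            current.append(row)
        files.append(current)
    return files


rows = [
    {"year": 2014, "msg": "a"},
    {"year": 2013, "msg": "b"},
    {"year": 2014, "msg": "c"},
    {"year": 2014, "msg": "d"},
]
files = write_partitioned(rows, "year")
for i, f in enumerate(files):
    print(i, sorted({r["year"] for r in f}), len(f))
```

In this toy run every simulated file holds exactly one `year` value, while 2014 is split across two files, which matches the property stated in the diff: one partition value per file, possibly several files per value.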

