Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183])
	by cust-asf2.ponee.io (Postfix) with ESMTP id 59E37200B99
	for ; Wed, 31 Aug 2016 00:29:18 +0200 (CEST)
Received: by cust-asf.ponee.io (Postfix)
	id 58857160AC5; Tue, 30 Aug 2016 22:29:18 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by cust-asf.ponee.io (Postfix) with SMTP id 7518A160AD8
	for ; Wed, 31 Aug 2016 00:29:17 +0200 (CEST)
Received: (qmail 81716 invoked by uid 500); 30 Aug 2016 22:29:16 -0000
Mailing-List: contact commits-help@drill.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: commits@drill.apache.org
Delivered-To: mailing list commits@drill.apache.org
Received: (qmail 81353 invoked by uid 99); 30 Aug 2016 22:29:16 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 30 Aug 2016 22:29:16 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33)
	id 50A90E08AF; Tue, 30 Aug 2016 22:29:16 +0000 (UTC)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: bridgetb@apache.org
To: commits@drill.apache.org
Date: Tue, 30 Aug 2016 22:29:26 -0000
Message-Id: <895cc728a6fd4154942ab065ccfedcc0@git.apache.org>
In-Reply-To: <15dc4aee871c42f8ac6638c67a0254fb@git.apache.org>
References: <15dc4aee871c42f8ac6638c67a0254fb@git.apache.org>
X-Mailer: ASF-Git Admin Mailer
Subject: [11/17] drill git commit: update to partition pruning intro to include refresh command for metadata cache file
archived-at: Tue, 30 Aug 2016 22:29:18 -0000

update to partition pruning intro to include refresh command for metadata cache file

Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/2bc38da0
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/2bc38da0
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/2bc38da0

Branch: refs/heads/gh-pages
Commit: 2bc38da0e9ff9159b2337f3285aaaae05e5979aa
Parents: 21c41f5
Author: Bridget Bevens
Authored: Thu Aug 11 12:02:19 2016 -0700
Committer: Bridget Bevens
Committed: Thu Aug 11 12:02:19 2016 -0700

----------------------------------------------------------------------
 .../partition-pruning/010-partition-pruning-introduction.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/drill/blob/2bc38da0/_docs/performance-tuning/partition-pruning/010-partition-pruning-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/partition-pruning/010-partition-pruning-introduction.md b/_docs/performance-tuning/partition-pruning/010-partition-pruning-introduction.md
index 315b062..e5f4e5f 100644
--- a/_docs/performance-tuning/partition-pruning/010-partition-pruning-introduction.md
+++ b/_docs/performance-tuning/partition-pruning/010-partition-pruning-introduction.md
@@ -1,13 +1,12 @@
 ---
 title: "Partition Pruning Introduction"
-date: 2016-08-08 18:42:19 UTC
+date: 2016-08-11 19:02:20 UTC
 parent: "Partition Pruning"
 ---
 Partition pruning is a performance optimization that limits the number of files and partitions that Drill reads when querying file systems and Hive tables. When you partition data, Drill only reads a subset of the files that reside in a file system or a subset of the partitions in a Hive table when a query matches certain filter criteria.
-As of Drill 1.8, partition pruning also applies to the parquet metadata cache. See [Optimizing Parquet Metadata Reading]({{site.baseurl}}/docs/optimizing-parquet-metadata-reading/) to see how to create a parquet metadata cache. When data is partitioned in a directory hierarchy, Drill attempts to read the metadata cache file from a sub-partition, based on matching filter criteria instead of reading from the top level partition, to reduce the amount of metadata read during the query planning time.
-
+As of Drill 1.8, partition pruning also applies to the Parquet metadata cache. When data is partitioned in a directory hierarchy, Drill attempts to read the metadata cache file from a sub-partition, based on matching filter criteria instead of reading from the top level partition, to reduce the amount of metadata read during the query planning time. If you created a metadata cache file in a previous version of Drill, you must issue the REFRESH TABLE METADATA command to regenerate the metadata cache file before running queries for partition pruning to occur. See [Optimizing Parquet Metadata Reading]({{site.baseurl}}/docs/optimizing-parquet-metadata-reading/) for more information.
 The query planner in Drill performs partition pruning by evaluating the filters. If no partition filters are present, the underlying Scan operator reads all files in all directories and then sends the data to operators, such as Filter, downstream. When partition filters are present, the query planner pushes the filters down to the Scan if possible. The Scan reads only the directories that match the partition filters, thus reducing disk I/O.