From bridg...@apache.org
Subject [5/6] drill git commit: add decimal disabled note per VM
Date Fri, 07 Aug 2015 22:53:45 GMT
add decimal disabled note per VM

DRILL-3607

1.2 updates (hidden)


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/7a3a8eb3
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/7a3a8eb3
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/7a3a8eb3

Branch: refs/heads/gh-pages
Commit: 7a3a8eb35a67ccccde1ec8f57405a98fa622c3ae
Parents: 2a0ac3c
Author: Kristine Hahn <khahn@maprtech.com>
Authored: Fri Aug 7 10:58:46 2015 -0700
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Fri Aug 7 14:50:59 2015 -0700

----------------------------------------------------------------------
 .../060-configuring-a-shared-drillbit.md             |  6 +++---
 .../020-querying-parquet-files.md                    | 15 +++++++++++++--
 .../050-aggregate-and-aggregate-statistical.md       |  2 +-
 3 files changed, 17 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/7a3a8eb3/_docs/configure-drill/060-configuring-a-shared-drillbit.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/060-configuring-a-shared-drillbit.md b/_docs/configure-drill/060-configuring-a-shared-drillbit.md
index 2d31cef..3ea0f2f 100644
--- a/_docs/configure-drill/060-configuring-a-shared-drillbit.md
+++ b/_docs/configure-drill/060-configuring-a-shared-drillbit.md
@@ -24,13 +24,13 @@ By default, Drill parallelizes operations when number of records manipulated wit
 
 To configure parallelization, configure the following options in the `sys.options` table:
 
-* `planner.width.max.per.node`  
+* `planner.width.max_per_node`  
   The maximum degree of distribution of a query across cores and cluster nodes.
-* `planner.width.max.per.query`  
+* `planner.width.max_per_query`  
   Same as max per node but applies to the query as executed by the entire cluster.
 
 ### planner.width.max_per_node
-Configure the `planner.width.max.per.node` to achieve fine grained, absolute control over parallelization. In this context *width* refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.
+Configure the `planner.width.max_per_node` to achieve fine grained, absolute control over parallelization. In this context *width* refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster. A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.
 
 The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster. The *default* maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account: number of active drillbits (typically one per node) * number of cores per node * 0.7
 

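Editorial note, not part of the commit: the default-width rule quoted in the context line above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the documentation text states it (drillbits times cores per node times 0.7, rounded down). The function name and parameters are hypothetical and are not part of Drill.

```python
def default_max_width_per_node(active_drillbits: int, cores_per_node: int) -> int:
    """Illustrative only: the default described in the documentation text.

    Computes active_drillbits * cores_per_node * 0.7, rounded down.
    Integer arithmetic (multiply by 7, floor-divide by 10) avoids
    floating-point rounding surprises.
    """
    if active_drillbits < 0 or cores_per_node < 0:
        raise ValueError("counts must be non-negative")
    return (active_drillbits * cores_per_node * 7) // 10


if __name__ == "__main__":
    # One drillbit on a 16-core node: 16 * 0.7 = 11.2, rounded down.
    print(default_max_width_per_node(1, 16))
```

The value Drill actually uses is whatever `planner.width.max_per_node` reports in the `sys.options` table; the sketch only mirrors the formula as written.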
http://git-wip-us.apache.org/repos/asf/drill/blob/7a3a8eb3/_docs/query-data/query-a-file-system/020-querying-parquet-files.md
----------------------------------------------------------------------
diff --git a/_docs/query-data/query-a-file-system/020-querying-parquet-files.md b/_docs/query-data/query-a-file-system/020-querying-parquet-files.md
index c8ecb65..da029af 100644
--- a/_docs/query-data/query-a-file-system/020-querying-parquet-files.md
+++ b/_docs/query-data/query-a-file-system/020-querying-parquet-files.md
@@ -2,11 +2,22 @@
 title: "Querying Parquet Files"
 parent: "Querying a File System"
 ---
-Your Drill installation includes a `sample-data` directory with Parquet files
+
+<!-- Drill 1.2 extends SQL for performant querying of Parquet files. By including a command in a query to cache Parquet file metadata, you trigger the generation of metadata files in the directory of Parquet files and its subdirectories. 
+
+You need to include the command in only the first query of a file or directory. Subsequent can queries return results quickly because Drill refers to the cached metadata. Drill updates metadata automatically when the Parquet files change by comparing timestamps of data. 
+
+To generate metadata, use the following command:
+
+## Example of Generating Parquet Metadata
+
+ -->
+## Sample Parquet Files
+The Drill installation includes a `sample-data` directory with Parquet files
 that you can query. Use SQL syntax to query the `region.parquet` and
 `nation.parquet` files in the `sample-data` directory.
 
-{% include startnote.html %}Your Drill installation location may differ from the examples used here.{% include endnote.html %} 
+{% include startnote.html %}The Drill installation location may differ from the examples used here.{% include endnote.html %} 
 
 The examples assume that Drill was [installed in embedded mode]({{ site.baseurl }}/docs/installing-drill-in-embedded-mode). If you installed Drill in distributed mode, or your `sample-data` directory differs from the location used in the examples. Change the `sample-data` directory to the correct location before you run the queries.
 

http://git-wip-us.apache.org/repos/asf/drill/blob/7a3a8eb3/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
index 4fc886b..ce71e8a 100644
--- a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
+++ b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
@@ -250,4 +250,4 @@ Drill provides following aggregate statistics functions:
 * var_samp(expression)
   Sample variance of input values (sample standard deviation squared)
   
-These functions take a SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, or DECIMAL expression as the argument. If the expression is FLOAT, the function returns  DOUBLE; otherwise, the function returns DECIMAL.
+These functions take a SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, or DECIMAL expression as the argument. If the expression is FLOAT, the function returns  DOUBLE; otherwise, the function returns DECIMAL. As previously mentioned, DECIMAL is disabled. You can [enable the DECIMAL type](docs/supported-data-types/#enabling-the-decimal-type), but this is not recommended.

