drill-commits mailing list archives

From bridg...@apache.org
Subject [1/6] drill git commit: DRILL-3321
Date Wed, 24 Jun 2015 00:16:21 GMT
Repository: drill
Updated Branches:
  refs/heads/gh-pages 3d1c00554 -> 957c5d868


DRILL-3321

editorial

broken links

1.1 features


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/cc329855
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/cc329855
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/cc329855

Branch: refs/heads/gh-pages
Commit: cc329855f9e955f2041d2b50b3a7264ede884ba6
Parents: 3d1c005
Author: Kristine Hahn <khahn@maprtech.com>
Authored: Mon Jun 22 18:13:20 2015 -0700
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Mon Jun 22 18:17:16 2015 -0700

----------------------------------------------------------------------
 .../010-configure-drill-introduction.md         | 24 +++++++++++++++---
 .../040-persistent-configuration-storage.md     | 21 +++++++++-------
 _docs/connect-a-data-source/050-workspaces.md   | 21 +++++++++++++---
 .../015-using-jdbc-driver.md                    |  2 +-
 .../065-query-directory-functions.md            |  2 +-
 .../data-types/020-date-time-and-timestamp.md   |  4 +--
 .../sql-commands/035-partition-by-clause.md     | 26 +++++++++-----------
 7 files changed, 67 insertions(+), 33 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/configure-drill/010-configure-drill-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/010-configure-drill-introduction.md b/_docs/configure-drill/010-configure-drill-introduction.md
index 62f17f3..42b1f50 100644
--- a/_docs/configure-drill/010-configure-drill-introduction.md
+++ b/_docs/configure-drill/010-configure-drill-introduction.md
@@ -2,9 +2,27 @@
 title: "Configure Drill Introduction"
 parent: "Configure Drill"
 ---
-When using Drill, you need to make sufficient memory available Drill when running Drill alone
or along side other workloads on the cluster. The next section, ["Configuring Drill Memory"]({{site.baseurl}}/docs/configuring-drill-memory)
describes how to configure memory for a Drill cluster. Configuring other resources for [multitenancy
clusters]({{site.baseurl}}/docs/configuring-multitenant-resources) or for [sharing a Drillbit]({{site.baseurl}}/docs/configuring-a-shared-drillbit)
on a cluster is covered later.
+
+This section briefly describes the following key Drill configuration tasks and provides links
to configuration procedures:
+
+* Memory Configuration
+* Multitenancy Configuration
+* Performance and Functionality Configuration
+* Query Profile Data Storage Configuration 
+
+## Memory Configuration
+
+When using Drill, you need to make sufficient memory available to Drill, whether Drill runs alone
or alongside other workloads on the cluster. The next section, ["Configuring Drill Memory"]({{site.baseurl}}/docs/configuring-drill-memory),
describes how to configure memory for a Drill cluster. 
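+As a minimal sketch, per-node memory limits are typically set in
`<drill_installation_directory>/conf/drill-env.sh`; the variable names below match the Drill
startup script, but the sizes are illustrative only, not values from this commit:

    # JVM heap for the Drillbit process
    export DRILL_HEAP="4G"
    # Direct (off-heap) memory used for query execution buffers
    export DRILL_MAX_DIRECT_MEMORY="8G"

Restart the Drillbit after changing these values.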
+
+## Multitenancy Configuration
+
+You can configure resources for [multitenancy clusters]({{site.baseurl}}/docs/configuring-multitenant-resources)
or for [sharing a Drillbit]({{site.baseurl}}/docs/configuring-a-shared-drillbit) on a cluster.
+
+## Performance and Functionality Configuration
 
 You can also modify options for performance or functionality. For example, changing the default
storage format is a typical functional change. The default storage format for CTAS
-statements is Parquet. Using a configuration option, you can modify Drill to store the output
data in CSV or JSON format. 
+statements is Parquet. Using a configuration option, you can modify Drill to store the output
data in CSV or JSON format. The section, ["Configuration Options Introduction"]({{site.baseurl}}/docs/configuration-options-introduction)
summarizes the many options you can configure. 
+
+## Query Profile Data Storage Configuration
 
-The section, ["Configuration Options Introduction"]({{site.baseurl}}/docs/configuration-options-introduction)
summarizes the many options you can configure. 
+To ensure a problem-free Drill Web UI experience, you need to [configure the ZooKeeper PStore]({{site.baseurl}}/docs/persistent-configuration-storage/#configuring-the-zookeeper-pstore).

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
b/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
index 053f25b..f23a9d9 100644
--- a/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
+++ b/_docs/configure-drill/configuration-options/040-persistent-configuration-storage.md
@@ -19,20 +19,23 @@ modes:
   
 {% include startnote.html %}Switching between storage modes does not migrate configuration
data.{% include endnote.html %}
 
-## ZooKeeper for Persistent Configuration Storage
+## Configuring ZooKeeper PStore
 
-To make Drill installation and configuration simple, Drill uses ZooKeeper to
+Drill uses ZooKeeper to
 store persistent configuration data. The ZooKeeper PStore provider stores all
 of the persistent configuration data in ZooKeeper except for query profile
-data.
+data. The ZooKeeper PStore provider offloads query profile data to the
+${DRILL_LOG_DIR:-/var/log/drill} directory on Drill nodes. 
 
-The ZooKeeper PStore provider offloads query profile data to the
-${DRILL_LOG_DIR:-/var/log/drill} directory on Drill nodes. If you want the
-query profile data stored in a specific location, you can configure where
-ZooKeeper offloads the data.
+To use the Drill Web UI when running multiple Drillbits, you need to configure the ZooKeeper
PStore. 
 
-To modify where the ZooKeeper PStore provider offloads query profile data,
-configure the `sys.store.provider.zk.blobroot` property in the `drill.exec`
+## Why Configure the ZooKeeper PStore
+
+When you run multiple Drillbits, configure a specific location for ZooKeeper to offload the
query profile data instead of accepting the default temporary location. Not all Drillbits in
the cluster can access the temporary location. Consequently, when you do not configure a location
on the distributed file system, queries sent to some Drillbits do not appear in the Completed
section of the Drill Web UI. Also, some Running links that you click to get information about
running queries are broken.
+
+## Configuring the ZooKeeper PStore
+
+To configure the ZooKeeper PStore, set the `sys.store.provider.zk.blobroot` property in the
`drill.exec`
 block in `<drill_installation_directory>/conf/drill-override.conf` on each
 Drill node and then restart the Drillbit service.
 
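For illustration, a `drill-override.conf` fragment that sets the property might look like the
following; the cluster ID, ZooKeeper address, and file system path are placeholders, not values
from this commit:

    drill.exec: {
      cluster-id: "drillbits1",
      zk.connect: "<zk-hostname>:2181",
      sys.store.provider.zk.blobroot: "hdfs://<namenode>:8020/apps/drill/pstore/"
    }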

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/connect-a-data-source/050-workspaces.md
----------------------------------------------------------------------
diff --git a/_docs/connect-a-data-source/050-workspaces.md b/_docs/connect-a-data-source/050-workspaces.md
index 361bfec..b1156c9 100644
--- a/_docs/connect-a-data-source/050-workspaces.md
+++ b/_docs/connect-a-data-source/050-workspaces.md
@@ -3,9 +3,24 @@ title: "Workspaces"
 parent: "Storage Plugin Configuration"
 ---
 When you register an instance of a file system data source, you can configure
-one or more workspaces for the instance. The workspace defines the default directory location
of files in a local or distributed file system. The `default`
-workspace points to the root of the file system. Drill searches the workspace to locate data
when
-you run a query.
+one or more workspaces for the instance. The workspace defines the directory location of
files in a local or distributed file system. Drill searches the workspace to locate data when
+you run a query. The `default`
workspace points to the root of the file system.
+
+Configuring `workspaces` in the storage plugin definition to include the file location simplifies
queries, which is important when you query the same data source repeatedly. After you configure
a long path name in the workspaces location property, you can use dot notation in the FROM
+clause instead of the full path to the data source.
+
+``<workspaces>.`<location>```
+
+To query the data source while you are _not_ connected to
+that storage plugin, include the plugin name. This syntax assumes you did not issue a USE
statement to connect to a storage plugin that defines the
+location of the data:
+
+``<plugin>.<workspaces>.`<location>```
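+As a hypothetical example, suppose the `dfs` storage plugin defines a workspace named
`json_files` whose location is `/users/max/drill/json`; both names are assumptions for
illustration only:

    SELECT * FROM dfs.json_files.`donuts.json`;

Without the workspace, the same query would need the full path:

    SELECT * FROM dfs.`/users/max/drill/json/donuts.json`;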
+
+
+## No Workspaces for Hive and HBase
 
 You cannot create workspaces for
 `hive` and `hbase` storage plugins, though Hive databases show up as workspaces in

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/odbc-jdbc-interfaces/015-using-jdbc-driver.md
----------------------------------------------------------------------
diff --git a/_docs/odbc-jdbc-interfaces/015-using-jdbc-driver.md b/_docs/odbc-jdbc-interfaces/015-using-jdbc-driver.md
index 9f471eb..4bf05e7 100755
--- a/_docs/odbc-jdbc-interfaces/015-using-jdbc-driver.md
+++ b/_docs/odbc-jdbc-interfaces/015-using-jdbc-driver.md
@@ -2,7 +2,7 @@
 title: "Using the JDBC Driver"
 parent: "ODBC/JDBC Interfaces"
 ---
-This section explains how to install and use the JDBC driver for Apache Drill. For specific
examples of client tool connections to Drill via JDBC, see [Using JDBC with SQuirreL]({{ site.baseurl
}}/docs/.../) and [Configuring Spotfire Server]({{ site.baseurl }}/docs/.../).
+This section explains how to install and use the JDBC driver for Apache Drill. For specific
examples of client tool connections to Drill via JDBC, see [Using JDBC with SQuirreL]({{ site.baseurl
}}/docs/using-jdbc-with-squirrel-on-windows) and [Configuring Spotfire Server]({{ site.baseurl
}}/docs/configuring-tibco-spotfire-server-with-drill/).
 
 
 ### Prerequisites

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/sql-reference/065-query-directory-functions.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/065-query-directory-functions.md b/_docs/sql-reference/065-query-directory-functions.md
index 6e6802b..feb9a05 100644
--- a/_docs/sql-reference/065-query-directory-functions.md
+++ b/_docs/sql-reference/065-query-directory-functions.md
@@ -22,7 +22,7 @@ The following syntax shows how to construct a SELECT statement that using
the MA
     SELECT * FROM <plugin>.<workspace>.`<filename>` 
     WHERE dir<n> = MAXDIR('<plugin>.<workspace>', '<filename>');
 
-Enclose both arguments to the query directory function in single-quotation marks, not backticks.
The first argument to the function is the plugin and workspace names in dot notation, and
the second argument is the directory name. The dir<n> variable, `dir0`, `dir1`, and
so on, refers to
+Enclose both arguments to the query directory function in single-quotation marks, not
backticks. The first argument to the function is the plugin and workspace names in dot notation,
and the second argument is the directory name. The dir<n> variable, `dir0`, `dir1`,
and so on, refers to
 subdirectories in your workspace path, as explained in section, ["Querying Directories"]({{site.baseurl}}/docs/querying-directories).

 
 ## Query Directory Function Example 

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/data-types/020-date-time-and-timestamp.md b/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
index 8683aa0..60d997f 100644
--- a/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
+++ b/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
@@ -63,9 +63,9 @@ When you want to use interval data in input, use INTERVAL as a keyword that
intr
 
 To cast interval data to interval types you can query from a data source such as JSON, see
the example in the section, ["Casting Intervals"]({{site.baseurl}}/docs/data-type-conversion/#casting-intervals).
 
-### Literal Interval Exampls
+### Literal Interval Examples
 
-In the following example, the INTERVAL keyword followed by 200 adds 200 years to the timestamp.
The parentheticated 3 in `YEAR(3)` specifies the precision of the year interval, 3 digits
in this case to support the hundreds interval.
+In the following example, the INTERVAL keyword followed by 200 adds 200 years to the timestamp.
The 3 in parentheses in `YEAR(3)` specifies the precision of the year interval, 3 digits in
this case to support the hundreds interval.
 
     SELECT CURRENT_TIMESTAMP + INTERVAL '200' YEAR(3) FROM sys.version;
     +--------------------------+

http://git-wip-us.apache.org/repos/asf/drill/blob/cc329855/_docs/sql-reference/sql-commands/035-partition-by-clause.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-commands/035-partition-by-clause.md b/_docs/sql-reference/sql-commands/035-partition-by-clause.md
index b34e5e7..8208663 100644
--- a/_docs/sql-reference/sql-commands/035-partition-by-clause.md
+++ b/_docs/sql-reference/sql-commands/035-partition-by-clause.md
@@ -2,36 +2,34 @@
 title: "PARTITION BY Clause"
 parent: "SQL Commands"
 ---
-You can take advantage of automatic partitioning in Drill 1.1 using the PARTITION BY CLAUSE
in the CTAS command:
+You can take advantage of automatic partitioning in Drill 1.1 by using the PARTITION BY clause
in the CTAS command:
+
+## Syntax
 
 	CREATE TABLE table_name [ (column_name, . . .) ] 
     [ PARTITION_BY (column_name, . . .) ] 
     AS SELECT_statement;
 
-The CTAS statement that uses the PARTITION BY clause must store the data in Parquet format.
The CTAS statement needs to meet one of the following requirements:
-
-* The column list in the PARTITION by clause are included in the column list following the
table_name
-* The SELECT statement has to use a * column if the base table in the SELECT statement is
schema-less, and when the partition column is resolved to * column in a schema-less query,
this * column cannot be a result of a join operation. 
+The CTAS statement that uses the PARTITION BY clause must store the data in Parquet format
and meet one of the following requirements:
 
+* The columns in the PARTITION BY clause are included in the column list
following the table_name.
+* The SELECT statement must use a * column (SELECT *) if the base table in the SELECT statement
is schema-less. When the partition column resolves to a * column in a schema-less query,
that * column cannot be the result of a join operation. 
 
-To create and verify the contents of a table that contains this row:
+Using the PARTITION BY clause creates separate output files. Each file contains one
partition value, and Drill can create multiple files for the same partition value.
 
-  1. Set the workspace to a writable workspace.
-  2. Set the `store.format` option to Parquet
-  3. Run a CTAS statement with the PARTITION BY clause.
-  4. Go to the directory where the table is stored and check the contents of the file.
-  5. Run a query against the new table.
+Partition pruning uses the Parquet column stats to determine which columns can be used
to prune.
 
 Examples:
 
+    USE cp;
 	CREATE TABLE mytable1 PARTITION BY (r_regionkey) AS 
-	  SELECT r_regionkey, r_name FROM cp.`tpch/region.parquet`
+	  SELECT r_regionkey, r_name FROM cp.`tpch/region.parquet`;
 	CREATE TABLE mytable2 PARTITION BY (r_regionkey) AS 
-	  SELECT * FROM cp.`tpch/region.parquet`
+	  SELECT * FROM cp.`tpch/region.parquet`;
 	CREATE TABLE mytable3 PARTITION BY (r_regionkey) AS
 	  SELECT r.r_regionkey, r.r_name, n.n_nationkey, n.n_name 
 	  FROM cp.`tpch/nation.parquet` n, cp.`tpch/region.parquet` r
-	  WHERE n.n_regionkey = r.r_regionkey
+	  WHERE n.n_regionkey = r.r_regionkey;
 
 
 

