From: bridgetb@apache.org
To: commits@drill.apache.org
Subject: drill-site git commit: Additional doc updates related to hash agg spill to disk for 1.11
Date: Thu, 17 Aug 2017 21:26:33 +0000 (UTC)

Repository: drill-site
Updated Branches:
  refs/heads/asf-site 53b591ef7 -> ecf68552c

Additional doc updates related to hash agg spill to disk for 1.11

Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/ecf68552
Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/ecf68552
Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/ecf68552

Branch: refs/heads/asf-site
Commit: ecf68552c90c572648007de3908ffa0ef1e6eea1
Parents: 53b591e
Author: Bridget Bevens
Authored: Thu Aug 17 14:26:17 2017 -0700
Committer: Bridget Bevens
Committed: Thu Aug 17 14:26:17 2017 -0700

----------------------------------------------------------------------
 docs/configuring-drill-memory/index.html | 27 +++++-----
 .../index.html                           | 50 +++++++++---------
 docs/start-up-options/index.html         | 53 ++++++++++----------
 feed.xml                                 |  4 +-
 4 files changed, 68 insertions(+), 66 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/drill-site/blob/ecf68552/docs/configuring-drill-memory/index.html
----------------------------------------------------------------------
diff --git a/docs/configuring-drill-memory/index.html b/docs/configuring-drill-memory/index.html
index ad23cf3..acf24cc 100644
--- a/docs/configuring-drill-memory/index.html
+++ b/docs/configuring-drill-memory/index.html
@@ -1126,30 +1126,31 @@
- Nov 1, 2016
+ Aug 17, 2017
-

You can configure the amount of direct memory allocated to a Drillbit for query processing in any Drill cluster, multitenant or not. The default memory for a Drillbit is 8G, but Drill prefers 16G or more depending on the workload. The total amount of direct memory that a Drillbit allocates to query operations cannot exceed the limit set.

+

You can configure the amount of direct memory allocated to a Drillbit for query processing in any Drill cluster, multitenant or not. The default memory for a drillbit is 8G, but Drill prefers 16G or more depending on the workload. The total amount of direct memory that a drillbit allocates to query operations cannot exceed the limit set.

-

Drill uses Java direct memory and performs well when executing
-operations in memory instead of storing the operations on disk. Drill does not
-write to disk unless absolutely necessary, unlike MapReduce where everything
-is written to disk during each phase of a job.

+

Drill uses Java direct memory and performs well when executing operations in memory instead of storing the operations on disk. Drill does not write to disk unless absolutely necessary, unlike MapReduce where everything is written to disk during each phase of a job.

 The JVM’s heap memory does not limit the amount of direct memory available in
-a Drillbit. The on-heap memory for Drill is typically set at 4-8G (default is 4), which should
+a drillbit. The on-heap memory for Drill is typically set at 4-8G (default is 4), which should
 suffice because Drill avoids having data sit in heap memory.

-

As of Drill 1.5, Drill uses a new allocator that improves an operator’s use of direct memory and tracks the memory use more accurately. Due to this change, the sort operator (in queries that ran successfully in previous releases) may not have enough memory, resulting in a failed query and out of memory error instead of spilling to disk.

+

As of Drill 1.5, Drill uses a new allocator that improves an operator’s use of direct memory and tracks the memory use more accurately. Due to this change, the sort operator (in queries that ran successfully in previous releases) may not have enough memory, resulting in a failed query and out of memory error instead of spilling to disk.

-

The planner.memory.max_query_memory_per_node system option value sets the maximum amount of direct memory allocated to the sort operator in each query on a node. If a query plan contains multiple sort operators, they all share this memory. If you encounter memory issues when running queries with sort operators, increase the value of this option. If you continue to encounter memory issues after increasing this value, you can also reduce the value of the planner.width.max_per_node option to reduce the level of parallelism per node. However, this may increase the amount of time required for a query to complete.

+

Drillbit Memory

-

Modifying Drillbit Memory

+

The value set for the planner.memory.max_query_memory_per_node system option sets the maximum amount of direct memory allocated to the Sort and Hash Aggregate operators in each query on a node. If a query plan contains multiple Sort and/or Hash Aggregate operators, they all share this memory. If you encounter memory issues when running queries with Sort and/or Hash Aggregate operators, increase the value of this option. See Sort-Based and Hash-Based Memory Constrained Operators for more information.

-

You can modify memory for each Drillbit node in your cluster. To modify the memory for a Drillbit, set the DRILL_MAX_DIRECT_MEMORY variable in the Drillbit startup script, drill-env.sh, located in <drill_installation_directory>/conf, as follows:

+

If you continue to encounter memory issues after increasing this value, you can also reduce the value of the planner.width.max_per_node option to reduce the level of parallelism per node. However, this may increase the amount of time required for a query to complete.

+ +

Modifying Drillbit Memory

+ +

You can modify memory for each drillbit node in your cluster. To modify the memory for a drillbit, set the DRILL_MAX_DIRECT_MEMORY variable in the drillbit startup script, drill-env.sh, located in <drill_installation_directory>/conf, as follows:

export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"<value>"}
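
The export line above relies on the shell's `${VAR:-default}` expansion: a value already present in the environment wins, and the quoted value is used only as a fallback. A minimal sketch of that behavior follows; the helper `resolve_direct_memory` and the values `8G` and `16G` are illustrative assumptions, not part of drill-env.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of drill-env.sh): resolves the setting the
# same way the ${VAR:-default} expansion in the export line does.
resolve_direct_memory() {
  # $1 is the fallback used when DRILL_MAX_DIRECT_MEMORY is unset or empty.
  echo "${DRILL_MAX_DIRECT_MEMORY:-$1}"
}

# Case 1: nothing set beforehand, so the fallback applies.
unset DRILL_MAX_DIRECT_MEMORY
default_case=$(resolve_direct_memory "8G")

# Case 2: a value set before the script runs takes precedence.
DRILL_MAX_DIRECT_MEMORY="16G"
override_case=$(resolve_direct_memory "8G")

echo "unset: $default_case, preset: $override_case"
```

In practice this means a value exported before the drillbit starts overrides whatever is written in drill-env.sh.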
 
@@ -1157,9 +1158,9 @@ suffice because Drill avoids having data sit in heap memory.

If DRILL_MAX_DIRECT_MEMORY is not set, the limit depends on the amount of available system memory.

-

After you edit <drill_installation_directory>/conf/drill-env.sh, restart the Drillbit on the node.

+

After you edit <drill_installation_directory>/conf/drill-env.sh, restart the drillbit on the node.

-

About the Drillbit startup script

+

About the Drillbit Startup Script

The drill-env.sh file contains the following options:

#export DRILL_HEAP=${DRILL_HEAP:-"4G"}  

http://git-wip-us.apache.org/repos/asf/drill-site/blob/ecf68552/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
----------------------------------------------------------------------
diff --git a/docs/sort-based-and-hash-based-memory-constrained-operators/index.html b/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
index 84ae607..fd10d3b 100644
--- a/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
+++ b/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
@@ -1134,25 +1134,25 @@
 
     
-

Drill uses hash-based and sort-based operators depending on the query characteristics. Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory, however the hash aggregate and hash join operators are the fastest and most memory intensive operators.

+

Drill uses hash-based and sort-based operators depending on the query characteristics. Hash Aggregate and Hash Join are hash-based operators. Sort, Streaming Aggregate, and Merge Join are sort-based operators. Both hash-based and sort-based operations consume memory; however, the Hash Aggregate and Hash Join operators are the fastest and most memory-intensive operators.

-

When planning a query with sort- and hash-based operators, Drill evaluates the available memory multiplied by a configurable reduction constant (for parallelization purposes) and then limits the operations to the maximum of this amount of memory. Drill spills data to disk if the sort and hash aggregate operations cannot be performed in memory. Alternatively, you can disable large hash operations if they do not fit in memory on your system. When disabled, Drill creates alternative plans. You can also modify the minimum hash table size, increasing the size for very large aggregations or joins when you have large amounts of memory for Drill to use. If you have large data sets, you can increase the hash table size to improve performance.

+

When planning a query with sort- and hash-based operations, Drill evaluates the available memory multiplied by a configurable reduction constant (for parallelization purposes) and then limits the operations to the maximum of this amount of memory. Drill spills data to disk if the sort and hash aggregate operations cannot be performed in memory. Alternatively, you can disable large hash operations if they do not fit in memory on your system. When disabled, Drill creates alternative plans. You can also modify the minimum hash table size, increasing the size for very large aggregations or joins when you have large amounts of memory for Drill to use. If you have large data sets, you can increase the hash table size to improve performance.

Memory Options

-

The planner.memory.max_query_memory_per_node option sets the maximum amount of direct memory allocated to the sort and hash aggregate operators during each query on a node. The default limit is 2147483648 bytes (2GB), which is quite conservative. This memory is split between operators. If a query plan contains multiple sort and/or hash aggregate operators, the memory is divided between them.

+

The planner.memory.max_query_memory_per_node option sets the maximum amount of direct memory allocated to the Sort and Hash Aggregate operators during each query on a node. The default limit is 2147483648 bytes (2GB), which is quite conservative. This memory is split between operators. If a query plan contains multiple Sort and/or Hash Aggregate operators, the memory is divided between them.

-

When a query is parallelized, the number of operators is multiplied, which reduces the amount of memory given to each instance of the sort and hash aggregate operators during a query. If you encounter memory issues when running queries with sort and hash aggregate operators, calculate the memory requirements for your queries and the amount of available memory on each node. Based on the information, increase the value for the planner.memory.max_query_memory_per_node option using the ALTER SYSTEM|SESSION SET command, as shown:

-
ALTER SYSTEM|SESSION SET `planner.memory.max_query_memory_per_node` = 8147483648  
+

When a query is parallelized, the number of operators is multiplied, which reduces the amount of memory given to each instance of the Sort and Hash Aggregate operators during a query. If you encounter memory issues when running queries with Sort and Hash Aggregate operators, calculate the memory requirements for your queries and the amount of available memory on each node. Based on the information, increase the value of the planner.memory.max_query_memory_per_node option using the ALTER SYSTEM|SESSION SET command, as shown:

+
ALTER SYSTEM|SESSION SET `planner.memory.max_query_memory_per_node` = <new_value>  
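
To pick a new value, it can help to estimate how much memory each operator instance receives once a query is parallelized. The sketch below is a rough back-of-the-envelope calculation only: it assumes the per-node limit is divided evenly across all Sort and Hash Aggregate instances, and the operator count and parallelism are hypothetical inputs. Drill's actual allocation may differ.

```shell
#!/bin/sh
# Rough estimate only: assumes an even split of the per-node budget across
# all instances of the memory-constrained operators.
max_query_memory_per_node=2147483648   # default limit in bytes (2 GB)
buffered_operators=2                   # hypothetical: one Sort + one Hash Aggregate
width_per_node=4                       # hypothetical parallelism per node

# Each operator in the plan runs once per parallel fragment on the node.
instances=$((buffered_operators * width_per_node))
per_instance_bytes=$((max_query_memory_per_node / instances))
per_instance_mb=$((per_instance_bytes / 1024 / 1024))

echo "instances per node: $instances"
echo "approx. memory per operator instance: ${per_instance_mb} MB"
```

With these example inputs, the default limit leaves about 256 MB per operator instance, which shows how quickly parallelism reduces the memory available to each instance.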
 
-

The planner.memory.enable_memory_estimation option toggles the state of memory estimation and re-planning of the query. When enabled, Drill conservatively estimates memory requirements and typically excludes memory-constrained operators from the query plan, which can negatively impact performance. The default setting is false. If you want Drill to use very conservative memory estimates, use the ALTER SYSTEM|SESSION SET command to change the setting, as shown:

+

The planner.memory.enable_memory_estimation option toggles the state of memory estimation and re-planning of a query. When enabled, Drill conservatively estimates memory requirements and typically excludes memory-constrained operators from the query plan, which can negatively impact performance. The default setting is false. If you want Drill to use very conservative memory estimates, use the ALTER SYSTEM|SESSION SET command to change the setting, as shown:

ALTER SYSTEM|SESSION SET `planner.memory.enable_memory_estimation` = true  
 

Spill to Disk

-

The "Spill to Disk" feature prevents queries that use memory-intensive sort and hash aggregate operations from failing with out-of-memory errors. Drill automatically writes excess data to a temporary directory on disk when queries with sort or hash aggregate operations exceed the set memory limit on a Drill node. When the operators finish processing the in-memory data, Drill reads the spilled data back from disk, and the operators finish processing the data. When the operations complete, Drill removes the data from disk.

+

Spilling data to disk prevents queries that use memory-intensive Sort and Hash Aggregate operations from failing with out-of-memory errors. Drill automatically writes excess data to a temporary directory on disk when queries with Sort or Hash Aggregate operations exceed the set memory limit on a Drill node. When the operators finish processing the in-memory data, Drill reads the spilled data back from disk, and the operators finish processing the data. When the operations complete, Drill removes the data from disk.

-

Spilling to disk enables queries to run uninterrupted while Drill performs the spill operations in the background. However, there can be performance impact due to the time required to spill data and then read the data back from disk.

+

Spilling data to disk enables queries to run uninterrupted while Drill performs the spill operations in the background. However, there can be performance impact due to the time required to spill data and then read the data back from disk.

Note

@@ -1173,7 +1173,7 @@

Spill to Disk Configuration Options

-

The spill to disk options reside in the drill-override.conf file on each Drill node. An administrator or someone familiar with storage and disks should manage these settings.

+

The options related to spilling reside in the drill-override.conf file on each Drill node. An administrator or someone familiar with storage and disks should manage these settings.

Note

@@ -1184,37 +1184,37 @@
  • drill.exec.spill.fs
    -Introduced in Drill 1.11. The default file system on the local machine into which the sort and hash aggregate operators spill data. This is the recommended option to use for spilling. You can configure this option so that data spills into a distributed file system, such as hdfs. For example, "hdfs:///". The default setting is "file:///".

  • +Introduced in Drill 1.11. The default file system on the local machine into which the Sort and Hash Aggregate operators spill data. This is the recommended option to use for spilling. You can configure this option so that data spills into a distributed file system, such as hdfs. For example, "hdfs:///". The default setting is "file:///".

  • drill.exec.spill.directories
    -Introduced in Drill 1.11. The list of directories into which the sort and hash aggregate operators spill data. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"]. This is the recommended option for spilling to multiple directories. The default setting is ["/tmp/drill/spill"].

  • +Introduced in Drill 1.11. The list of directories into which the Sort and Hash Aggregate operators spill data. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"]. This is the recommended option for spilling to multiple directories. The default setting is ["/tmp/drill/spill"].

  • drill.exec.sort.external.spill.fs
    -Overrides the default location into which the sort operator spills data. Instead of spilling into the location set by the drill.exec.spill.fs option, the sort operators spill into the location specified by this option.
    -Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.fs option to set the spill location instead. The default setting is "file:///".

  • +Overrides the default location into which the Sort operator spills data. Instead of spilling into the location set by the drill.exec.spill.fs option, the Sort operators spill into the location specified by this option.
    +Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.fs option to set the spill location instead. The default setting is "file:///".

  • drill.exec.sort.external.spill.directories
    -Overrides the location into which the sort operator spills data. Instead of spilling into the location set by the drill.exec.spill.directories option, the sort operators spill into the directories specified by this option. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"].
    -Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.directories option to set the spill location instead. The default setting is ["/tmp/drill/spill"].

  • +Overrides the location into which the Sort operator spills data. Instead of spilling into the location set by the drill.exec.spill.directories option, the Sort operators spill into the directories specified by this option. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"].
    +Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.directories option to set the spill location instead. The default setting is ["/tmp/drill/spill"].

  • drill.exec.hashagg.spill.fs
    -Overrides the location into which the hash aggregate operator spills data. Instead of spilling into the location set by the drill.exec.spill.fs option, the hash aggregate operator spills into the location specified by this option. Setting this option to 1 disables spilling for the hash aggregate operator.
    -Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.fs option to set the spill location instead. The default setting is "file:///".

  • +Overrides the location into which the Hash Aggregate operator spills data. Instead of spilling into the location set by the drill.exec.spill.fs option, the Hash Aggregate operator spills into the location specified by this option. Setting this option to 1 disables spilling for the Hash Aggregate operator.
    +Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.fs option to set the spill location instead. The default setting is "file:///".

  • drill.exec.hashagg.spill.directories
    -Overrides the location into which the hash aggregate operator spills data. Instead of spilling into the location set by the drill.exec.spill.directories option, the hash aggregate operator spills to the directories specified by this option. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"].
    -Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill. directories option to set the spill location instead.

  • +Overrides the location into which the Hash Aggregate operator spills data. Instead of spilling into the location set by the drill.exec.spill.directories option, the Hash Aggregate operator spills to the directories specified by this option. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"].
    +Note: As of Drill 1.11, this option is supported for backward compatibility, however in future releases, this option will be deprecated. It is highly recommended that you use the drill.exec.spill.directories option to set the spill location instead.

-

Hash-Based Operator Settings

+

Hash-Based Operator Configuration Settings

-

Use the ALTER SYSTEM|SESSION SET commands with the options below to disable the hash aggregate and hash join operators, modify the hash table size, disable memory estimation, or set the estimated maximum amount of memory for a query. Typically, you set the options at the session level unless you want the setting to persist across all sessions.

+

Use the ALTER SYSTEM|SESSION SET commands with the options below to disable the Hash Aggregate and Hash Join operators, modify the hash table size, or disable memory estimation. Typically, you set the options at the session level unless you want the setting to persist across all sessions.

The following options control the hash-based operators:

  • planner.enable_hashagg
    -Enables or disables hash aggregation; otherwise, Drill does a sort-based aggregation. This option is enabled by default. The default setting is true, which is recommended.

  • +Enables or disables hash aggregation; otherwise, Drill does a sort-based aggregation. This option is enabled by default. The default, and recommended, setting is true.
    +The Hash Aggregate operator uses an uncontrolled amount of memory, up to 10 GB, after which the operator runs out of memory. As of Drill 1.11, the Hash Aggregate operator can write to disk.

  • planner.enable_hashjoin
    -Enables or disables the memory hungry hash join. Drill assumes that a query will have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. Currently, this operator does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint. This option is enabled by default. The default setting is true.

  • +Enables or disables the memory hungry hash join. Drill assumes that a query will have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. The Hash Join operator uses an uncontrolled amount of memory, up to 10 GB, after which the operator runs out of memory. Currently, this operator does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint. This option is enabled by default. The default setting is true.

  • exec.min_hash_table_size
    -Starting size for hash tables. Increase this setting based on the memory available to improve performance.
    -The default setting for this option is 65536. The setting can range from 0 to 1073741824.

  • +Starting size for hash tables. Increase this setting based on the memory available to improve performance. The default setting for this option is 65536. The setting can range from 0 to 1073741824.

  • exec.max_hash_table_size
    Ending size for hash tables. The default setting for this option is 1073741824. The setting can range from 0 to 1073741824.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/ecf68552/docs/start-up-options/index.html
----------------------------------------------------------------------
diff --git a/docs/start-up-options/index.html b/docs/start-up-options/index.html
index 1efbd32..39f1e67 100644
--- a/docs/start-up-options/index.html
+++ b/docs/start-up-options/index.html
@@ -1128,18 +1128,18 @@
- Aug 8, 2017
+ Aug 17, 2017
-

Drill’s start-up options reside in a HOCON configuration file format, which is
-a hybrid between a properties file and a JSON file. Drill start-up options
-consist of a group of files with a nested relationship. At the bottom of the file hierarchy are the default files that Drill provides, starting with drill-default.conf. The drill-default.conf file is overridden by one or more drill-module.conf files that Drill’s internal modules provide. The drill-module.conf files are overridden by the drill-override.conf file that you define.

+

The start-up options for Drill reside in a HOCON configuration file format, which is a hybrid between a properties file and a JSON file. Drill start-up options consist of a group of files with a nested relationship. At the bottom of the file hierarchy are the default files that Drill provides, starting with drill-default.conf.

-

You can provide overrides on each Drillbit using system properties of the form -Dname=value passed on the command line:

-
   ./drillbit.sh start -Dname=value
+

The drill-default.conf file is overridden by one or more drill-module.conf files that Drill’s internal modules provide. The drill-module.conf files are overridden by the drill-override.conf file that you define.

+ +

You can provide overrides on each drillbit using system properties of the form -Dname=value passed on the command line:

+
./drillbit.sh start -Dname=value
 

You can see the following group of files throughout the source repository in Drill:

@@ -1151,44 +1151,45 @@ contrib/storage-hive/hive-exec-shade/src/main/resources/drill-module.conf
 exec/java-exec/src/main/resources/drill-module.conf
 distribution/src/resources/drill-override.conf
-

These files are listed inside the associated JAR files in the Drill
-distribution tarball.

+

These files are listed inside the associated JAR files in the Drill distribution tarball.

Each Drill module has a set of options that Drill incorporates. Drill’s modular design enables you to create new storage plugins, set new operators, or create UDFs. You can also include additional configuration options that you
-can override as necessary.

+can override as needed.

When you add a JAR file to Drill, you must include a drill-module.conf file in the root directory of the JAR file that you add. The drill-module.conf file tells Drill to scan that JAR file or associated object and include it.

-

Viewing Startup Options

+

Viewing Start-Up Options

-

You can run the following query to see a list of Drill’s startup options:

+

Run the following query to see a list of the available start-up options:

SELECT * FROM sys.boot;
 

Configuring Start-Up Options

-

You can configure start-up options for each Drillbit in <drill_home>/conf/drill-override.conf .

+

You can configure start-up options for each drillbit in <drill_home>/conf/drill-override.conf .

The summary of start-up options, also known as boot options, lists default values. The following descriptions provide more detail on key options that are frequently reconfigured:

    -
  • drill.exec.http.ssl_enabled
    -Available in Drill 1.2. Enables or disables HTTPS support. Settings are TRUE and FALSE, respectively. The default is FALSE.
  • -
  • drill.exec.sys.store.provider.class
    -Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data.
  • -
  • drill.exec.buffer.size
    -Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.
  • -
  • drill.exec.sort.external.spill.directories
    -Tells Drill which directory to use when spooling. Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system.
  • -
  • drill.exec.zk.connect
    -Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.
  • -
  • drill.exec.profiles.store.inmemory
    -Available as of Drill 1.11. When set to TRUE, enables Drill to store query profiles in memory instead of writing the query profiles to disk. When set to FALSE, Drill writes the profile for each query to disk, which is either the local file system or a distributed file system, such as HDFS. For sub-second queries, writing the query profile to disk is expensive due to the interactions with the file system. Enable this option if you want Drill to store the profiles of sub-second queries in memory instead of writing them to disk. When you enable this option, Drill stores the profiles in memory for as long as the drillbit runs. When the drillbit restarts, the profiles no longer exist. You can set the maximum number of most recent profiles to retain in memory through the drill.exec.profiles.store.capacity option. Settings are TRUE and FALSE. Default is FALSE.
  • -
  • drill.exec.profiles.store.capacity
    -Available as of Drill 1.11. Sets the maximum number of most recent profiles to retain in memory when the drill.exec.profiles.store.inmemory option is enabled. Default is 1000.
  • +
  • drill.exec.http.ssl_enabled
    +Available in Drill 1.2. Enables or disables HTTPS support. Settings are TRUE and FALSE, respectively. The default is FALSE.

  • +
  • drill.exec.sys.store.provider.class
    +Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data.

  • +
  • drill.exec.buffer.size
    +Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.

  • +
  • drill.exec.spill.fs
    +Introduced in Drill 1.11. The default file system on the local machine into which the Sort and Hash Aggregate operators spill data. This is the recommended option to use for spilling. You can configure this option so that data spills into a distributed file system, such as hdfs. For example, "hdfs:///". The default setting is "file:///". See Sort-Based and Hash-Based Memory Constrained Operators for more information.

  • +
  • drill.exec.spill.directories
    +Introduced in Drill 1.11. The list of directories into which the Sort and Hash Aggregate operators spill data. The list must be an array with directories separated by a comma, for example ["/fs1/drill/spill" , "/fs2/drill/spill" , "/fs3/drill/spill"]. This is the recommended option for spilling to multiple directories. The default setting is ["/tmp/drill/spill"]. See Sort-Based and Hash-Based Memory Constrained Operators for more information.

  • +
  • drill.exec.zk.connect
    +Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.

  • +
  • drill.exec.profiles.store.inmemory
    +Available as of Drill 1.11. When set to TRUE, enables Drill to store query profiles in memory instead of writing the query profiles to disk. When set to FALSE, Drill writes the profile for each query to disk, which is either the local file system or a distributed file system, such as HDFS. For sub-second queries, writing the query profile to disk is expensive due to the interactions with the file system. Enable this option if you want Drill to store the profiles of sub-second queries in memory instead of writing them to disk. When you enable this option, Drill stores the profiles in memory for as long as the drillbit runs. When the drillbit restarts, the profiles no longer exist. You can set the maximum number of most recent profiles to retain in memory through the drill.exec.profiles.store.capacity option. Settings are TRUE and FALSE. Default is FALSE. See Persistent Configuration Storage for more information.

  • +
  • drill.exec.profiles.store.capacity
    +Available as of Drill 1.11. Sets the maximum number of most recent profiles to retain in memory when the drill.exec.profiles.store.inmemory option is enabled. Default is 1000.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/ecf68552/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index b7bcc10..b0eecf5 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 /
- Thu, 17 Aug 2017 12:03:10 -0700
- Thu, 17 Aug 2017 12:03:10 -0700
+ Thu, 17 Aug 2017 14:23:20 -0700
+ Thu, 17 Aug 2017 14:23:20 -0700
 Jekyll v2.5.2