From: bridgetb@apache.org
To: commits@drill.apache.org
Message-Id: <1754b5d97d6148b5b0661862ec342979@git.apache.org>
Subject: drill-site git commit: edits to config options doc per 1.11 updates
Date: Thu, 17 Aug 2017 21:52:23 +0000 (UTC)

Repository: drill-site
Updated Branches:
  refs/heads/asf-site ecf68552c -> 7ecab1e6e

edits to config options doc per 1.11 updates

Project: http://git-wip-us.apache.org/repos/asf/drill-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill-site/commit/7ecab1e6
Tree: http://git-wip-us.apache.org/repos/asf/drill-site/tree/7ecab1e6
Diff: http://git-wip-us.apache.org/repos/asf/drill-site/diff/7ecab1e6

Branch: refs/heads/asf-site
Commit: 7ecab1e6ef4d1378698b462b06bca945b9cc41b5
Parents: ecf6855
Author: Bridget Bevens
Authored: Thu Aug 17 14:52:09 2017 -0700
Committer: Bridget Bevens
Committed: Thu Aug 17 14:52:09 2017 -0700

----------------------------------------------------------------------
 .../index.html | 168 +++++++++----------
 feed.xml | 4 +-
 2 files changed, 86 insertions(+), 86 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/drill-site/blob/7ecab1e6/docs/configuration-options-introduction/index.html
----------------------------------------------------------------------
diff --git a/docs/configuration-options-introduction/index.html b/docs/configuration-options-introduction/index.html
index 77fad7e..2808872 100644
--- a/docs/configuration-options-introduction/index.html
+++ b/docs/configuration-options-introduction/index.html
@@ -1128,7 +1128,7 @@
- Aug 7, 2017
+ Aug 17, 2017
@@ -1147,7 +1147,7 @@

System Options

- The sys.options table lists the following options that you can set as a system or session option as described in the section, "Planning and Execution Options".
+ The sys.options table lists options that you can set at the system or session level, as described in the section, "Planning and Execution Options".
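
As an illustration of setting an option at the system or session level, the sketch below builds the corresponding Drill `ALTER SYSTEM SET` and `ALTER SESSION SET` statements as strings. The helper name `alter_option` and its literal-formatting rules are assumptions made for this example and are not part of Drill; the statements would still have to be submitted through a Drill client.

```python
def alter_option(scope, name, value):
    """Build an ALTER SYSTEM/SESSION SET statement (hypothetical helper)."""
    scope = scope.upper()
    if scope not in ("SYSTEM", "SESSION"):
        raise ValueError("scope must be SYSTEM or SESSION")
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        literal = "true" if value else "false"
    elif isinstance(value, (int, float)):
        literal = str(value)
    else:
        # Quote strings and escape embedded single quotes.
        literal = "'" + str(value).replace("'", "''") + "'"
    # Option names contain dots, so they are wrapped in backticks.
    return f"ALTER {scope} SET `{name}` = {literal}"


print(alter_option("session", "store.format", "json"))
print(alter_option("system", "planner.enable_hashjoin", False))
```

A session-level setting applies only to the current connection, which makes it the safer choice when experimenting with planner options.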

@@ -1159,372 +1159,372 @@
@@ -1534,27 +1534,27 @@
drill.exec.default_temporary_workspace | dfs.tmp | Available as of Drill 1.10. Sets the workspace for temporary tables. The workspace must be writable, file-based, and point to a location that already exists. This option requires the following format: .<workspace
drill.exec.storage.implicit.filename.column.label | filename | Available as of Drill 1.10. Sets the implicit column name for the filename column.
drill.exec.storage.implicit.filepath.column.label | filepath | Available as of Drill 1.10. Sets the implicit column name for the filepath column.
drill.exec.storage.implicit.fqn.column.label | fqn | Available as of Drill 1.10. Sets the implicit column name for the fqn column.
drill.exec.storage.implicit.suffix.column.label | suffix | Available as of Drill 1.10. Sets the implicit column name for the suffix column.
drill.exec.functions.cast_empty_string_to_null | FALSE | In a text file, treat empty fields as NULL values instead of empty string.
drill.exec.storage.file.partition.column.label | dir | The column label for directory levels in results of queries of files in a directory. Accepts a string input.
exec.enable_union_type | FALSE | Enable support for Avro union type.
exec.errors.verbose | FALSE | Toggles verbose output of executable error messages.
exec.java_compiler | DEFAULT | Switches between DEFAULT, JDK, and JANINO mode for the current session. Uses Janino by default for generated source code of less than exec.java_compiler_janino_maxsize; otherwise, switches to the JDK compiler.
exec.java_compiler_debug | TRUE | Toggles the output of debug-level compiler error messages in runtime generated code.
exec.java_compiler_janino_maxsize | 262144 | See the exec.java_compiler option comment. Accepts inputs of type LONG.
exec.max_hash_table_size | 1073741824 | Ending size in buckets for hash tables. Range: 0 - 1073741824.
exec.min_hash_table_size | 65536 | Starting size in buckets for hash tables. Increase according to available memory to improve performance. Increase for very large aggregations or joins when you have large amounts of memory for Drill to use. Range: 0 - 1073741824.
exec.queue.enable | FALSE | Changes the state of query queues. False allows unlimited concurrent queries.
exec.queue.large | 10 | Sets the number of large queries that can run concurrently in the cluster. Range: 0-1000
exec.queue.small | 100 | Sets the number of small queries that can run concurrently in the cluster. Range: 0-1001
exec.queue.threshold | 30000000 | Sets the cost threshold, which depends on the complexity of the queries in queue, for determining whether a query is large or small. Complex queries have higher thresholds. Range: 0-9223372036854775807
exec.queue.timeout_millis | 300000 | Indicates how long a query can wait in queue before the query fails. Range: 0-9223372036854775807
exec.schedule.assignment.old | FALSE | Used to prevent query failure when no work units are assigned to a minor fragment, particularly when the number of files is much larger than the number of leaf fragments.
exec.storage.enable_new_text_reader | TRUE | Enables the text reader that complies with the RFC 4180 standard for text/csv files.
new_view_default_permissions | 700 | Sets view permissions using an octal code in the Unix tradition.
planner.add_producer_consumer | FALSE | Increase prefetching of data from disk. Disable for in-memory reads.
planner.affinity_factor | 1.2 | Factor by which a node with endpoint affinity is favored while creating assignment. Accepts inputs of type DOUBLE.
planner.broadcast_factor | 1 | A heuristic parameter for influencing the broadcast of records as part of a query.
planner.broadcast_threshold | 10000000 | The maximum number of records allowed to be broadcast as part of a query. Above this threshold (ten million records by default), Drill reshuffles data rather than doing a broadcast to one side of the join. Range: 0-2147483647
planner.disable_exchanges | FALSE | Toggles the state of hashing to a random exchange.
planner.enable_broadcast_join | TRUE | Changes the state of aggregation and join operators. The broadcast join can be used for hash join, merge join and nested loop join. Use to join a large (fact) table to relatively smaller (dimension) tables. Do not disable.
planner.enable_constant_folding | TRUE | If one side of a filter condition is a constant expression, constant folding evaluates the expression in the planning phase and replaces the expression with the constant value. For example, Drill can rewrite WHERE age + 5 < 42 as WHERE age < 37.
planner.enable_decimal_data_type | FALSE | False disables the DECIMAL data type, including casting to DECIMAL and reading DECIMAL types from Parquet and Hive.
planner.enable_demux_exchange | FALSE | Toggles the state of hashing to a demultiplexed exchange.
planner.enable_hash_single_key | TRUE | Each hash key is associated with a single value.
- planner.enable_hashagg | TRUE | Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Does not write to disk. Enable is recommended.
+ planner.enable_hashagg | TRUE | Enable hash aggregation; otherwise, Drill does a sort-based aggregation. Writes to disk. Enable is recommended.
planner.enable_hashjoin | TRUE | Enable the memory-hungry hash join. Drill assumes that a query will have adequate memory to complete and tries to use the fastest operations possible to complete the planned inner, left, right, or full outer joins using a hash table. Does not write to disk. Disabling hash join allows Drill to manage arbitrarily large data in a small memory footprint.
planner.enable_hashjoin_swap | TRUE | Enables consideration of multiple join order sequences during the planning phase. Might negatively affect the performance of some queries due to inaccuracy of estimated row count, especially after a filter, join, or aggregation.
planner.enable_hep_join_opt | | Enables the heuristic planner for joins.
planner.enable_mergejoin | TRUE | Sort-based operation. A merge join is used for inner join, left and right outer joins. Inputs to the merge join must be sorted. It reads the sorted input streams from both sides and finds matching rows. Writes to disk.
planner.enable_multiphase_agg | TRUE | Each minor fragment does a local aggregation in phase 1 and distributes the partially aggregated results to other fragments on a hash basis using the GROUP-BY keys; all the fragments then perform a total aggregation using this data.
planner.enable_mux_exchange | TRUE | Toggles the state of hashing to a multiplexed exchange.
planner.enable_nestedloopjoin | TRUE | Sort-based operation. Writes to disk.
planner.enable_nljoin_for_scalar_only | TRUE | Supports nested loop join planning where the right input is scalar in order to enable NOT-IN, Inequality, Cartesian, and uncorrelated EXISTS planning.
planner.enable_streamagg | TRUE | Sort-based operation. Writes to disk.
planner.filter.max_selectivity_estimate_factor | 1 | Available as of Drill 1.8. Sets the maximum filter selectivity estimate. The selectivity can vary between 0 and 1. For more details, see planner.filter.min_selectivity_estimate_factor.
planner.filter.min_selectivity_estimate_factor | 0 | Available as of Drill 1.8. Sets the minimum filter selectivity estimate to increase the parallelization of the major fragment performing a join. This option is useful for deeply nested queries with complicated predicates and serves as a workaround when statistics are insufficient or unavailable. The selectivity can vary between 0 and 1. The value of this option caps the estimated SELECTIVITY. The estimated ROWCOUNT is derived by multiplying the estimated SELECTIVITY by the estimated ROWCOUNT of the upstream operator. The estimated ROWCOUNT displays when you use the EXPLAIN PLAN INCLUDING ALL ATTRIBUTES FOR command. This option does not control the estimated ROWCOUNT of downstream operators (post FILTER). However, estimated ROWCOUNTs may change because the operator ROWCOUNTs depend on their downstream operators. The FILTER operator relies on the input of its immediate upstream operator, for example SCAN, AGGREGATE. If two filters are present in a plan, each filter may have a different estimated ROWCOUNT based on the immediate upstream operator's estimated ROWCOUNT.
planner.identifier_max_length | 1024 | A minimum length is needed because option names are identifiers themselves.
planner.join.hash_join_swap_margin_factor | 10 | The number of join order sequences to consider during the planning phase.
planner.join.row_count_estimate_factor | 1 | The factor for adjusting the estimated row count when considering multiple join order sequences during the planning phase.
planner.memory.average_field_width | 8 | Used in estimating memory requirements.
planner.memory.enable_memory_estimation | FALSE | Toggles the state of memory estimation and re-planning of the query. When enabled, Drill conservatively estimates memory requirements and typically excludes these operators from the plan and negatively impacts performance.
planner.memory.hash_agg_table_factor | 1.1 | A heuristic value for influencing the size of the hash aggregation table.
planner.memory.hash_join_table_factor | 1.1 | A heuristic value for influencing the size of the hash aggregation table.
- planner.memory.max_query_memory_per_node | 2147483648 bytes | Sets the maximum amount of direct memory allocated to the sort operator in each query on a node. If a query plan contains multiple sort operators, they all share this memory. If you encounter memory issues when running queries with sort operators, increase the value of this option.
+ planner.memory.max_query_memory_per_node | 2147483648 bytes | Sets the maximum amount of direct memory allocated to the Sort and Hash Aggregate operators during each query on a node. This memory is split between operators. If a query plan contains multiple Sort and/or Hash Aggregate operators, the memory is divided between them. The default setting is very conservative.
planner.memory.non_blocking_operators_memory | 64 | Extra query memory per node for non-blocking operators. This option is currently used only for memory estimation. Range: 0-2048 MB
planner.memory_limit | 268435456 bytes | Defines the maximum amount of direct memory allocated to a query for planning. When multiple queries run concurrently, each query is allocated the amount of memory set by this parameter. Increase the value of this parameter and rerun the query if partition pruning failed due to insufficient memory.
planner.nestedloopjoin_factor | 100 | A heuristic value for influencing the nested loop join.
planner.partitioner_sender_max_threads | 8 | Upper limit of threads for outbound queuing.
planner.partitioner_sender_set_threads | -1 | Overwrites the number of threads used to send out batches of records. Set to -1 to disable. Typically not changed.
planner.partitioner_sender_threads_factor | 2 | A heuristic parameter that influences the final number of threads. The higher the value, the fewer the threads.
planner.producer_consumer_queue_size | 10 | How much data to prefetch from disk in record batches out-of-band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.
planner.slice_target | 100000 | The number of records manipulated within a fragment before Drill parallelizes operations.
planner.width.max_per_node | 70% of the total number of processors on a node | Maximum number of threads that can run in parallel for a query on a node. A slice is an individual thread. This number indicates the maximum number of slices per query for the query’s major fragment on a node.
planner.width.max_per_query | 1000 | Same as max per node but applies to the query as executed by the entire cluster. For example, this value might be the number of active Drillbits, or a higher number to return results faster.
security.admin.user_groups | n/a | Unsupported as of 1.4. A comma-separated list of administrator groups for Web Console security.
security.admin.users | | Unsupported as of 1.4. A comma-separated list of user names who you want to give administrator privileges.
store.format | parquet | Output format for data written to tables with the CREATE TABLE AS (CTAS) command. Allowed values are parquet, json, psv, csv, or tsv.
store.hive.optimize_scan_with_native_readers | FALSE | Optimize reads of Parquet-backed external tables from Hive by using Drill native readers instead of the Hive Serde interface. (Drill 1.2 and later)
store.json.all_text_mode | FALSE | Drill reads all data from the JSON files as VARCHAR. Prevents schema change errors.
store.json.extended_types | FALSE | Turns on special JSON structures that Drill serializes for storing more type information than the four basic JSON types.
store.json.read_numbers_as_double | FALSE | Reads numbers with or without a decimal point as DOUBLE. Prevents schema change errors.
store.mongo.all_text_mode | FALSE | Similar to store.json.all_text_mode for MongoDB.
store.mongo.read_numbers_as_double | FALSE | Similar to store.json.read_numbers_as_double.
store.parquet.block-size | 536870912 | Sets the size of a Parquet row group to the number of bytes less than or equal to the block size of MFS, HDFS, or the file system.
store.parquet.compression | snappy | Compression type for storing Parquet output. Allowed values: snappy, gzip, none
store.parquet.enable_dictionary_encoding | FALSE | For internal use. Do not change.
store.parquet.dictionary.page-size
store.parquet.reader.int96_as_timestamp | FALSE | Enables Drill to implicitly interpret the INT96 timestamp data type in Parquet files.
store.parquet.use_new_reader | FALSE | Not supported in this release.
store.partition.hash_distribute | FALSE | Uses a hash algorithm to distribute data on partition keys in a CTAS partitioning operation. An alpha option--for experimental use at this stage. Do not use in production systems.
store.text.estimated_row_size_bytes | 100 | Estimate of the row size in a delimited text file, such as csv. The closer to actual, the better the query plan. Used for all csv files in the system/session where the value is set. Impacts the decision to plan a broadcast join or not.
window.enable | TRUE | Enable or disable window functions in Drill 1.1 and later.
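
The description of planner.filter.min_selectivity_estimate_factor and planner.filter.max_selectivity_estimate_factor above says that the options bound the estimated selectivity and that the estimated ROWCOUNT is the selectivity multiplied by the upstream operator's ROWCOUNT. The following is a minimal sketch of that arithmetic only; the function name and the simple clamping are assumptions for illustration and are not taken from Drill's planner code.

```python
def estimated_filter_rowcount(upstream_rowcount, raw_selectivity,
                              min_factor=0.0, max_factor=1.0):
    """Estimate a FILTER operator's row count (illustrative sketch).

    min_factor and max_factor stand in for the min/max selectivity
    estimate factor options; both must lie between 0 and 1.
    """
    if not 0.0 <= min_factor <= max_factor <= 1.0:
        raise ValueError("need 0 <= min_factor <= max_factor <= 1")
    # Clamp the planner's raw selectivity guess into [min_factor, max_factor].
    selectivity = min(max(raw_selectivity, min_factor), max_factor)
    # ROWCOUNT = SELECTIVITY * ROWCOUNT of the immediate upstream operator.
    return selectivity * upstream_rowcount


# With the default factors (0 and 1) the raw estimate is used unchanged.
print(estimated_filter_rowcount(1000, 0.25))
# Raising the minimum factor raises a low estimate.
print(estimated_filter_rowcount(1000, 0.25, min_factor=0.5))
```

Raising the minimum factor in this sketch increases the estimated row count of a heavily filtered input, which is the lever the option description associates with greater parallelization.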
http://git-wip-us.apache.org/repos/asf/drill-site/blob/7ecab1e6/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index b0eecf5..764491a 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 /
- Thu, 17 Aug 2017 14:23:20 -0700
- Thu, 17 Aug 2017 14:23:20 -0700
+ Thu, 17 Aug 2017 14:50:10 -0700
+ Thu, 17 Aug 2017 14:50:10 -0700
 Jekyll v2.5.2