http://git-wip-us.apache.org/repos/asf/incubator-trafodion-site/blob/f0969233/docs/2.0.0/cqd_reference/index.html ---------------------------------------------------------------------- diff --git a/docs/2.0.0/cqd_reference/index.html b/docs/2.0.0/cqd_reference/index.html new file mode 100644 index 0000000..09c6db2 --- /dev/null +++ b/docs/2.0.0/cqd_reference/index.html @@ -0,0 +1,4916 @@ + + + + + + + +Control Query Default (CQD) Reference Guide + + + + + + +
+
+
+
+

License Statement

+
+
+

Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file +distributed with this work for additional information regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with +the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

+
+
+

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the +specific language governing permissions and limitations under the License.

+
+ ++++ + + + + + + + + + + + + +
VersionDate

1.3.0

January, 2016

+
+
+
+

1. About This Document

+
+
+

This guide describes Trafodion Control Query Defaults (CQDs) that are used to override +system-level default settings.

+
+
+

1.1. Intended Audience

+
+

This guide is intended for database administrators and application programmers who want to +use CQDs to override system-default settings when querying a Trafodion database.

+
+
+
+

1.2. New and Changed Information

+
+

This is a new guide.

+
+
+
+

1.3. Notation Conventions

+
+

This list summarizes the notation conventions for syntax presentation in this manual.

+
+
+
    +
  • +

    UPPERCASE LETTERS

    +
    +

    Uppercase letters indicate keywords and reserved words. Type these items exactly as shown. Items not enclosed in brackets are required.

    +
    +
    +
    +
    SELECT
    +
    +
    +
  • +
  • +

    lowercase letters

    +
    +

    Lowercase letters, regardless of font, indicate variable items that you supply. Items not enclosed in brackets are required.

    +
    +
    +
    +
    file-name
    +
    +
    +
  • +
  • +

    [ ] Brackets

    +
    +

    Brackets enclose optional syntax items.

    +
    +
    +
    +
    DATETIME [start-field TO] end-field
    +
    +
    +
    +

    A group of items enclosed in brackets is a list from which you can choose one item or none.

    +
    +
    +

    The items in the list can be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines.

    +
    +
    +

    For example:

    +
    +
    +
    +
    DROP SCHEMA schema [CASCADE]
    +DROP SCHEMA schema [ CASCADE | RESTRICT ]
    +
    +
    +
  • +
  • +

    { } Braces

    +
    +

    Braces enclose required syntax items.

    +
    +
    +
    +
    FROM { grantee [, grantee ] ... }
    +
    +
    +
    +

    A group of items enclosed in braces is a list from which you are required to choose one item.

    +
    +
    +

    The items in the list can be arranged either vertically, with aligned braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines.

    +
    +
    +

    For example:

    +
    +
    +
    +
    INTERVAL { start-field TO end-field }
    +{ single-field }
    +INTERVAL { start-field TO end-field | single-field }
    +
    +
    +
  • +
  • +

    | Vertical Line

    +
    +

    A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces.

    +
    +
    +
    +
    {expression | NULL}
    +
    +
    +
  • +
  • +

    … Ellipsis

    +
    +

    An ellipsis immediately following a pair of brackets or braces indicates that you can repeat the enclosed sequence of syntax items any number of times.

    +
    +
    +
    +
    ATTRIBUTE[S] attribute [, attribute] ...
    +{, sql-expression } ...
    +
    +
    +
    +

    An ellipsis immediately following a single syntax item indicates that you can repeat that syntax item any number of times.

    +
    +
    +

    For example:

    +
    +
    +
    +
    expression-n ...
    +
    +
    +
  • +
  • +

    Punctuation

    +
    +

    Parentheses, commas, semicolons, and other symbols not previously described must be typed as shown.

    +
    +
    +
    +
    DAY (datetime-expression)
    +@script-file
    +
    +
    +
    +

    Quotation marks around a symbol such as a bracket or brace indicate the symbol is a required character that you must type as shown.

    +
    +
    +

    For example:

    +
    +
    +
    +
    "{" module-name [, module-name] ... "}"
    +
    +
    +
  • +
  • +

    Item Spacing

    +
    +

    Spaces shown between items are required unless one of the items is a punctuation symbol such as a parenthesis or a comma.

    +
    +
    +
    +
    DAY (datetime-expression) DAY(datetime-expression)
    +
    +
    +
    +

    If there is no space between two items, spaces are not permitted. In this example, no spaces are permitted between the period and any other items:

    +
    +
    +
    +
    myfile.sh
    +
    +
    +
  • +
  • +

    Line Spacing

    +
    +

    If the syntax of a command is too long to fit on a single line, each continuation line is indented three spaces and is separated from the preceding line by a blank line.

    +
    +
    +

    This spacing distinguishes items in a continuation line from items in a vertical list of selections.

    +
    +
    +
    +
    match-value [NOT] LIKE _pattern
    +   [ESCAPE esc-char-expression]
    +
    +
    +
  • +
+
+
+
+

1.4. Publishing History

+ ++++ + + + + + + + + + + + + +
Product VersionPublication Date

Trafodion Release 1.3.0

January 2016.

+
+
+

1.5. Comments Encouraged

+
+

The Trafodion community encourages your comments concerning this document. We are committed to providing documentation that meets your +needs. Send any errors found, suggestions for improvement, or compliments to:

+
+ +
+

Include the document title and any comment, error found, or suggestion for improvement you have concerning this document. Or, even +better, join our community and help us improve our documentation. Please refer to +Trafodion Contributor Guide for details.

+
+
+
+
+
+

2. Introduction

+
+
+

Refer to the +Trafodion SQL Reference Manual +full documentation of the CQD (CONTROL QUERY DEFAULT) statement.

+
+
+

The CONTROL QUERY DEFAULT statement changes the default settings for the current process. You can execute +the CONTROL QUERY DEFAULT statement in a client-based tool like TrafCI or through any ODBC or JDBC application.

+
+
+

Syntax

+
+
+
+
{ CONTROL QUERY DEFAULT | CQD } control-default-option
+
+control-default-option is:
+  attribute {'attr-value' | RESET}
+
+
+
+

The result of the execution of a CONTROL QUERY DEFAULT statement stays in effect until the current process +terminates or until the execution of another statement for the same attribute overrides it. +CQDs are applied at compile time, so CQDs do not affect any statements that are already prepared. For example:

+
+
+
+
PREPARE x FROM SELECT * FROM t;
+CONTROL QUERY DEFAULT SCHEMA 'myschema';
+EXECUTE x;                              -- uses the default schema SEABASE
+SELECT * FROM t2;                       -- uses MYSCHEMA;
+PREPARE y FROM SELECT * FROM t3;
+CONTROL QUERY DEFAULT SCHEMA 'seabase';
+EXECUTE y;                              -- uses MYSCHEMA;
+
+
+
+

Examples

+
+
+
    +
  • +

    Change the maximum supported length of the column names to 200 for the current process:

    +
    +
    +
    CONTROL QUERY DEFAULT HBASE_MAX_COLUMN_NAME_LENGTH '200';
    +
    +
    +
  • +
  • +

    Reset the HBASE_MAX_COLUMN_NAME_LENGTH attribute to its initial value in the current process:

    +
    +
    +
    CONTROL QUERY DEFAULT HBASE_MAX_COLUMN_NAME_LENGTH RESET;
    +
    +
    +
  • +
+
+
+
+

2.1. CQD Descriptions

+
+

The following information is provided for each CQD:

+
+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Describes the purpose of the CQD.

Values

Identifies this information:
+
+• Values, in the form of a character string, that specify the applicable attribute values for the CQD.
+• The default attribute value.
+• If applicable, the Trafodion release in which the attribute values or default changed.

Usage

Describes the conditions when the CQD is helpful, and how to detect the conditions.

Production Usage

Identifies when the CQD is not safe to be used as a permanent setting in production.

Impact

Describes any positive and negative implications of using the CQD.

Level

Indicates one of these levels at which the CQD should be used:
+
+• Query
+• Session
+• Service
+• Any
+
+NOTE: This level indicates that the CQD can be used at the Query, Session or Service level as long as you fully understands the scope of the impact of the CQD.

Conflicts/Synergies

Describes CQDs that are in conflict with or can be used in conjunction with the CQD.

Real Problem Addressed

Describes any design or solution that the CQD may be a workaround for and how you can directly address the real problem.

Introduced In Release

Indicates the Trafodion release when the CQD was introduced.

Deprecated In Release

Indicates in what release the CQD was deprecated.

+
+
+
+
+

3. Query Plans

+
+
+

This section describes CQDs that are used to influence query plans.

+
+
+

3.1. DEFAULT_DEGREE_OF_PARALLELISM

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Defines the minimum size of the adaptive segment; that is, the number of processors +available for query operator parallelism. The optimizer may choose an adaptive-segment size that is equal to, or the multiple of, +the value of this CQD depending on the maximum estimated resource consumed by any single operator in the query. +The optimizer may also decide to run the query with no parallelism if the resource consumption estimate is very low.

Values

Unsigned Integer.
+
+The default value is '16'.

Usage

For systems running at higher levels of concurrency with workloads that include a large +number of small queries, reducing the default degree of parallelism may help achieve higher throughput.
+
+With the default of 16, for 32-node systems, adaptive segmentation can use two 16-node virtual segments to execute queries that +do not require a degree of parallelism of 32. This default setting can, for example, be changed to 8 for a 16-node system, +to allow adaptive segmentation to leverage a lower degree of parallelism.

Production Usage

Not applicable.

Impact

Lowering the value of this CQD can increase the throughput of +high-concurrency small-query workloads, but has the potential disadvantage of increasing the +elapsed time for some of the longer running queries that leverage adaptive segmentation.

Level

System. There may be scenarios where you want to influence the degree of adaptive segmentation +parallelism only for a certain set of queries and use it at the service level.

Conflicts/Synergies

Not applicable.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.2. HASH_JOINS

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Determines whether the Trafodion Optimizer considers Hash Join when generating an execution plan.

Values

'ON' Hash Join is considered.
+'OFF' Hash Join is not considered.
+
+The default value is 'ON'.

Usage

Use this CQD when you want to force the optimizer to generate a query plan that does not use any Hash Joins.

Production Usage

Hash Join is an important join implementation strategy for most complex queries. +It is highly recommended that you do not turn HASH_JOINS OFF; that is, this CQD should be used to force a query plan for a particular query on an exception basis only.

Impact

Turning HASH_JOINS OFF may result in very inefficient query plans that use expensive nested joins or sorts for merge joins.

Level

Query.

Conflicts/Synergies

If you turn all three join implementations OFF (Hash Joins, Nested Joins, and Merge Joins), then the compiler may fail to generate query plans.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.3. HBASE_COPROCESSORS

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Allow HBase coprocessors to be used when computing aggregates.

Values

'ON': Use HBase coprocessors.
+'OFF': Do not use HBase coprocessors.
+
+The default value is 'ON'.

Usage

Enables Trafodion to use HBase coprocessors to do early aggregation and filtering +at the HBase Region Server level. This CQD does not affect Transaction coprocessors used by Trafodion.
+
+As of Release 2.0 only COUNT(*) queries will be affected by this attribute.

Production Usage

Yes.

Impact

Network traffic between Region Server and Trafodion processes is reduced but +the Region Server can become very busy when aggregating over large tables.

Level

Query.

Conflicts/Synergies

Not applicable.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.4. HIVE_NUM_ESPS_PER_DATANODE

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Determines number of ESP processes used to scan a Hive table, per HDFS DataNode.

Values

Positive integer.
+
+The default value is '2'.

Usage

Use this CQD to increase or decrease the number of scanners that process a single Hive table. +If a Hive scan is found to be the bottleneck for a particular query, then increasing this attribute to; for example, 4 or higher will help. +On the other hand decreasing the attribute to 1 could help with concurrency.

Production Usage

Yes.

Impact

Controls number of ESPs and, therefore, affects query execution time and system workload.

Level

Query.

Conflicts/Synergies

The CQD HIVE_MIN_BYTES_PER_ESP_PARTITION (default = 67108864) may need to be adjusted downward +when this attribute is used to increase the parallelism of scanning smaller Hive tables.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.5. JOIN_ORDER_BY_USER

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Enables or disables the join order in which the optimizer joins the tables to be the sequence of the +tables in the FROM clause of the query.

Values

'ON' Join order is forced.
+'OFF' Join order is decided by the optimizer.
+
+The default value is 'OFF'.

Usage

When set to ON, the optimizer considers only execution plans that have the join order matching +the sequence of the tables in the FROM clause.

Production Usage

This setting is to be used for forcing a desired join order that was not generated by default by +the optimizer only. It can be used as a workaround for query plans with inefficient join order.

Impact

Because you are in effect forcing the optimizer to use a plan that joins the table in the order +specified in the FROM clause, the plan generated may not be the optimal one.

Level

Query.

Conflicts/Synergies

Not applicable.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.6. MC_SKEW_SENSITIVITY_THRESHOLD

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Define the multi-column skew sensitivity threshold T used by multi-column skew-insensitive hash join (Skew Buster).
+
+Let f be the occurrence frequency of a skew value v, DoP be the degree of parallelism of a hash join operator, and RC be the row count of +the source data (for example, fact table) where the skew originates.
+
+The hash join will run in the anti-skew mode for v if f >= T * DoP / RC.

Values

< 0: Disable the multi-column skew buster.
+>= 0: Define the threshold T.
+
+Default value: 0.1.

Usage

Use of a negative value to disable multi-column anti-skew hash joins. This may slow down query performance when +multi-column skew values are present in the fact table.
+
+A value of 0 treats every multi-column value as skew values. This may increase network traffic since skewed values are broadcasted from the inner +side child of the hash join to all join processes.
+
+A value greater than 0 selects those multi-column values as skewed values if their occurrence frequencies are high enough.

Production Usage

Consult the Trafodion community.

Impact

This CQD impacts runtime performance.

Level

Session.

Conflicts/Synergies

Not applicable.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.7. MDAM_NO_STATS_POSITIONS_THRESHOLD

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

This CQD effects the behavior of the query optimizer when there are no statistics available for a query +having range predicates on key columns. The Trafodion Optimizer calculates the worst case number of seeks that the MDAM access method +would do if chosen for the query.
+
+If this number is greater than the value of MDAM_NO_STATS_POSITIONS_THRESHOLD, then MDAM is not considered for the query execution plan.

Values

Any integer greater than equal to zero.
+
+Default value: 10.

Usage

In certain situations, queries on tables lacking statistics may not be optimal because MDAM was not chosen. +Increasing the value for this CQD allows MDAM to be chosen in more cases. On the other hand, if the value is made too high and the worst case +scenario actually occurs, an MDAM plan may perform poorly.

Production Usage

Consult the Trafodion community.

Impact

Table scans on tables lacking statistics may improve by varying the value of this CQD. +Results vary depending on the actual data in the table and the semantics of the query.

Level

Query.

Conflicts/Synergies

If MDAM_SCAN_METHOD is set to 'OFF', then this CQD has no effect.

Real Problem Addressed

Perform UPDATE STATISTICS on the table (at the very least on key columns) to obtain statistics.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.8. MDAM_SCAN_METHOD

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Enables or disables the Multi-Dimensional Access Method (MDAM).

Values

'ON' MDAM is considered.
+'OFF' MDAM is disabled.
+
+The default value is 'ON'.

Usage

In certain situations, the Trafodion Optimizer might choose MDAM inappropriately, causing poor performance. +In such situations you may want to turn MDAM OFF for the query it is effecting.

Production Usage

Yes.

Impact

Table scans with predicates on non-leading clustering key column(s) could benefit from MDAM access +method if the leading column(s) has a small number of distinct values. Turning MDAM off results in a longer scan time for such queries.

Level

Set this CQD at the query level when MDAM is not working efficiently for a specific query. However, +there may be cases (usually a defect) where a larger set of queries is being negatively impacted by MDAM. In those cases you may want +to set it at the service or system level.

Conflicts/Synergies

Not applicable.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.

+
+
+
+

3.9. MERGE_JOINS

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Description

Determines if Merge Join is considered by the optimizer to generate an execution plan.

Values

'ON': Merge Join is considered.
+'OFF': Merge Join is disabled.
+
+The default value is 'ON'.

Usage

Use this CQD when you want to force a query plan not to use Merge Joins. This is useful as a workaround +for query plans with very expensive sorts for Merge Joins. Turning MERGE_JOINS OFF also has the advantage of reducing the query compile time.

Production Usage

Merge Join is an efficient join implementation strategy if the physical schema was designed to take advantage +of it. For example, large tables are physically ordered based on the most frequently joined column(s).

Impact

Turning MERGE_JOINS OFF may result in the optimizer not considering potentially efficient query plans, +for queries with large joins on tables that are physically ordered by the join column(s).
+
+Turning MERGE_JOINS ON causes an increase in compile +time because the optimizer now has to consider many more join options.

Level

Set this CQD at the query level when a Merge Join is not working efficiently for a specific query. However, +there may be cases (usually a defect) where a larger set of queries is being negatively impacted by Merge Joins. In those cases you may want to +set it at the service or system level.

Conflicts/Synergies

Avoid turning all the three join implementations OFF (Hash Joins, Nested Joins, and Merge Joins). +This may result in the Trafodion Compiler failing to generate query plans.

Real Problem Addressed

Not applicable.

Introduced In Release

Trafodion 1.3.0.

Deprecated In Release

Not applicable.