phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3516) Performance Issues with queries that have compound filters and specify phoenix.query.force.rowkeyorder=true
Date Thu, 22 Dec 2016 20:21:58 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771007#comment-15771007
] 

Hudson commented on PHOENIX-3516:
---------------------------------

SUCCESS: Integrated in Jenkins build Phoenix-master #1522 (See [https://builds.apache.org/job/Phoenix-master/1522/])
PHOENIX-3516 Performance Issues with queries that have compound filters (tdsilva: rev c5046047a78e0365d75bc696dff4870304c2b5b2)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/util/ExpressionUtil.java
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/TenantSpecificViewIndexIT.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/query/KeyRangeIntersectTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/WhereCompiler.java


> Performance Issues with queries that have compound filters and specify phoenix.query.force.rowkeyorder=true
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-3516
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3516
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: Jan Fernando
>            Assignee: Thomas D'Silva
>             Fix For: 4.9.1, 4.10.0
>
>         Attachments: PHOENIX-3516.patch
>
>
> On all our connections we specify the phoenix.query.force.rowkeyorder=true property to
force serial scans so that we only support queries that will scale horizontally with data
size. 
> In running performance tests, we found that queries with multiple AND'ed range filters
were slow and not performing not as expected. We looked at the query plan and noticed that,
in the slow query case, the query plan is doing a PARALLEL 1-WAY SCAN and doing a SERVER FILTER
BY whereas the fast query is simply doing a SERIAL 1-WAY RANGE SCAN.
> We expect these queries to both have the same plan as we are specifying phoenix.query.force.rowkeyorder=true.
> You can repro as follows:
> 1. Use non-tenant specific connection to create the table:
> CREATE TABLE IF NOT EXISTS MY_MT_TABLE.TEST_TABLE (
>     ORGANIZATION_ID CHAR(15) NOT NULL,
>     PARENT_TYPE CHAR(3) NOT NULL,
>     PARENT_ID CHAR(15) NOT NULL,
>     CREATED_DATE DATE NOT NULL
>     CONSTRAINT PK PRIMARY KEY 
>     (
>         ORGANIZATION_ID, 
>         PARENT_TYPE,
>         PARENT_ID,
>         CREATED_DATE DESC
>     )
> ) VERSIONS=1,MULTI_TENANT=true,REPLICATION_SCOPE=1;
> 2. Use non-tenant specific connection to execute index:
> CREATE INDEX IF NOT EXISTS MY_TEST_TABLE_INDEX 
> ON MY_MT_TABLE.TEST_TABLE (PARENT_TYPE, CREATED_DATE, PARENT_ID);
> 3. Use a tenant-specific connection to create the View:
> CREATE VIEW IF NOT EXISTS MY_TEST_TABLE_VIEW AS SELECT * FROM MY_MT_TABLE.TEST_TABLE;
> 4. Run queries below with tenant-specific connection:
> Query with expected plan:
> EXPLAIN SELECT PARENT_ID
> FROM MY_TEST_TABLE_VIEW
> WHERE PARENT_TYPE='001'
> AND (CREATED_DATE > to_date('2012-10-21') AND CREATED_DATE < to_date('2016-10-31'))
> ORDER BY PARENT_TYPE,CREATED_DATE LIMIT 501;
> +------------------------------------------+
> |                   PLAN                   |
> +------------------------------------------+
> | CLIENT 1-CHUNK SERIAL 1-WAY RANGE SCAN OVER MY_MT_TABLE.MY_TEST_TABLE_INDEX ['00Dxx0000001gFA','001','2012-10-21
00:00:00.001'] - ['00Dxx0000001gFA','001',' |
> |     SERVER FILTER BY FIRST KEY ONLY      |
> |     SERVER 501 ROW LIMIT                 |
> | CLIENT 501 ROW LIMIT                     |
> +------------------------------------------+
> Slow query with unexpected plan. Since the date range are overlapping we expected Phoenix
to consolidate this into a the smallest matching range and do a range scan. It does seem to
do the consolidation but then do a parallel and not a range scan.
> EXPLAIN SELECT PARENT_ID
> FROM MY_TEST_TABLE_VIEW
> WHERE PARENT_TYPE='001'
> AND (CREATED_DATE >= to_date('2011-01-01') AND CREATED_DATE <= to_date('2016-01-01'))
> AND (CREATED_DATE > to_date('2012-10-21') AND CREATED_DATE < to_date('2016-10-31'))
> ORDER BY PARENT_TYPE,CREATED_DATE LIMIT 501;
> +------------------------------------------+
> |                   PLAN                   |
> +------------------------------------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER MY_MT_TABLE.MY_TEST_TABLE_INDEX ['00Dxx0000001gFA','001','2012-10-21
00:00:00.001'] - ['00Dxx0000001gFA','001' |
> |     SERVER FILTER BY FIRST KEY ONLY AND (true AND true) |
> |     SERVER 501 ROW LIMIT                 |
> | CLIENT 501 ROW LIMIT                     |
> +------------------------------------------+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message