phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2758) Ordered GROUP BY not occurring with leading PK equality expression
Date Sat, 02 Apr 2016 08:28:25 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222788#comment-15222788
] 

Hudson commented on PHOENIX-2758:
---------------------------------

FAILURE: Integrated in Phoenix-master #1181 (See [https://builds.apache.org/job/Phoenix-master/1181/])
PHOENIX-2758 Ordered GROUP BY not occurring with leading PK equality (jtaylor: rev 838a60b9abfb9b65363db9e51cdb6bc32d2088c9)
* phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
* phoenix-core/src/main/java/org/apache/phoenix/compile/GroupByCompiler.java
* phoenix-core/src/test/java/org/apache/phoenix/compile/QueryCompilerTest.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/GroupByCaseIT.java


> Ordered GROUP BY not occurring with leading PK equality expression
> ------------------------------------------------------------------
>
>                 Key: PHOENIX-2758
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2758
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: James Taylor
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2758.patch, PHOENIX-2758_wip.patch
>
>
> The following query:
> {code}
> SELECT SUM(DUP_COUNT) FROM (
>     SELECT COUNT(1) As DUP_COUNT
>     FROM DATACLOUD.DATA_ASSESSMENT_RECORD 
>    WHERE JOURNEY_ID='07ixx000000004J' AND 
>                  DATASOURCE=0 AND MATCH_STATUS <= 1 and 
>                  ORGANIZATION_ID='07ixx000000004J' 
>     GROUP BY MATCH_STATUS, EXTERNAL_DATASOURCE_KEY 
>     HAVING COUNT(1) > 1);
> {code}
> Should use an ORDERED DISTINCT, but doesn't.
> This is the DDL:
> {code}
> CREATE TABLE DATACLOUD.DATA_ASSESSMENT_RECORD (ORGANIZATION_ID char(15) not null, 
> JOURNEY_ID char(15) not null, 
> DATASOURCE SMALLINT not null, 
> MATCH_STATUS TINYINT not null, 
> EXTERNAL_DATASOURCE_KEY varchar(30), 
> ENTITY_ID char(15) not null, 
> CONSTRAINT PK PRIMARY KEY (
>     ORGANIZATION_ID, 
>     JOURNEY_ID, 
>     DATASOURCE, 
>     MATCH_STATUS,
>     EXTERNAL_DATASOURCE_KEY,
>     ENTITY_ID))
> {code}
> The optimization does occur if you include the PK columns in the GROUP BY like this:
> {code}
> SELECT SUM(DUP_COUNT) FROM (
>     SELECT COUNT(1) As DUP_COUNT
>     FROM DATACLOUD.DATA_ASSESSMENT_RECORD 
>    WHERE JOURNEY_ID='07ixx000000004J' AND 
>                  DATASOURCE=0 AND MATCH_STATUS <= 1 and 
>                  ORGANIZATION_ID='07ixx000000004J' 
>     GROUP BY ORGANIZATION_ID, JOURNEY_ID, DATASOURCE, MATCH_STATUS, EXTERNAL_DATASOURCE_KEY

>     HAVING COUNT(1) > 1);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message