From: "chenglei (JIRA)"
To: dev@phoenix.apache.org
Date: Thu, 21 Feb 2019 02:12:00 +0000 (UTC)
Subject: [jira] [Updated] (PHOENIX-5148) Improve OrderPreservingTracker to optimize OrderBy/GroupBy for ClientScanPlan and ClientAggregatePlan

     [ https://issues.apache.org/jira/browse/PHOENIX-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenglei updated PHOENIX-5148:
------------------------------
    Description:
Given a table
{code:java}
create table test (
   pk1 varchar not null,
   pk2 varchar not null,
   pk3 varchar not null,
   v1 varchar,
   v2 varchar,
   CONSTRAINT TEST_PK PRIMARY KEY (
      pk1,
      pk2,
      pk3 ))
{code}
Consider the following three cases:

*1. OrderBy of ClientScanPlan*

for sql:
{code:java}
select v1 from (select v1,v2,pk3 from test t where pk1 = '6' order by t.v2,t.pk3,t.v1 limit 10) a order by v2,pk3
{code}
Obviously, the outer {{OrderBy}} "order by v2,pk3" should be compiled out because it matches the leading columns of the inner query {{OrderBy}} "order by t.v2,t.pk3,t.v1", but unfortunately it is not compiled out.

*2. GroupBy of ClientAggregatePlan*

for sql:
{code:java}
select v1 from (select v1,pk2,pk1 from test t where pk1 = '6' order by t.pk2,t.v1,t.pk1 limit 10) a group by pk2,v1
{code}
Obviously, the outer {{GroupBy}} "group by pk2,v1" should be orderPreserving because it matches the leading columns of the inner query {{OrderBy}} "order by t.pk2,t.v1,t.pk1", but unfortunately the {{isOrderPreserving()}} of the outer {{GroupBy}} returns false.

*3. OrderBy of SortMergeJoinPlan (from PHOENIX-4618)*

for sql:
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b
{code}
The result of the sort-merge-join is sorted on (T1.a, T1.b) and (T2.a, T2.b) at the same time. Thus, both 1)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T1.a, T1.b
{code}
and 2)
{code:java}
SELECT * FROM T1 JOIN T2 ON T1.a = T2.a and T1.b = T2.b ORDER BY T2.a, T2.b
{code}
should avoid doing an extra order-by after the sort-merge-join operation.

All three cases share the same root cause: the {{OrderPreservingTracker}} relies solely on primary keys to infer alignment between the target {{OrderByExpression}}s and the sortedness of the source.

> Improve OrderPreservingTracker to optimize OrderBy/GroupBy for ClientScanPlan and ClientAggregatePlan
> -----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5148
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5148
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.1
>            Reporter: chenglei
>            Priority: Major

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
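
For readers of the archive, the alignment check behind cases 1 and 2 of the description can be illustrated with a minimal Java sketch. This is not Phoenix code: the class {{OrderPreservingSketch}} and its methods are hypothetical names chosen for illustration, plain column names stand in for {{OrderByExpression}}s, and sort direction and null ordering are ignored. It only shows the idea of comparing a requested order against the order the inner query already produces, rather than against the primary key alone.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Phoenix.
final class OrderPreservingSketch {

    // An outer ORDER BY is redundant when its columns are a leading
    // prefix of the order the inner query already produces.
    static boolean orderByIsRedundant(List<String> sourceOrder, List<String> requested) {
        if (requested.size() > sourceOrder.size()) {
            return false;
        }
        return sourceOrder.subList(0, requested.size()).equals(requested);
    }

    // A GROUP BY can be treated as order preserving (assumption of this
    // sketch) when its columns, in any order, are exactly the first k
    // columns of the source order, so rows of one group arrive adjacent.
    static boolean groupByIsOrderPreserving(List<String> sourceOrder, List<String> groupBy) {
        Set<String> wanted = new HashSet<>(groupBy);
        if (wanted.isEmpty() || wanted.size() > sourceOrder.size()) {
            return false;
        }
        Set<String> prefix = new HashSet<>(sourceOrder.subList(0, wanted.size()));
        return prefix.equals(wanted);
    }

    public static void main(String[] args) {
        // Case 1: inner "order by v2,pk3,v1", outer "order by v2,pk3".
        List<String> inner1 = Arrays.asList("v2", "pk3", "v1");
        System.out.println(orderByIsRedundant(inner1, Arrays.asList("v2", "pk3"))); // true

        // Case 2: inner "order by pk2,v1,pk1", outer "group by pk2,v1".
        List<String> inner2 = Arrays.asList("pk2", "v1", "pk1");
        System.out.println(groupByIsOrderPreserving(inner2, Arrays.asList("pk2", "v1"))); // true

        // A reordered ORDER BY is not a prefix, so the sort must stay.
        System.out.println(orderByIsRedundant(inner1, Arrays.asList("pk3", "v2"))); // false
    }
}
```

Case 3 would additionally need to treat T1.a and T2.a (and T1.b and T2.b) as equivalent after the join, which this sketch does not attempt.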