From dev-return-55336-archive-asf-public=cust-asf.ponee.io@phoenix.apache.org Tue Feb 5 02:14:11 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id EB903180651 for ; Tue, 5 Feb 2019 03:14:10 +0100 (CET) Received: (qmail 66601 invoked by uid 500); 5 Feb 2019 02:14:03 -0000 Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list dev@phoenix.apache.org Received: (qmail 66063 invoked by uid 99); 5 Feb 2019 02:14:02 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 05 Feb 2019 02:14:02 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 6366BC7951 for ; Tue, 5 Feb 2019 02:14:02 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -105.301 X-Spam-Level: X-Spam-Status: No, score=-105.301 tagged_above=-999 required=6.31 tests=[ENV_AND_HDR_SPF_MATCH=-0.5, KAM_SOMETLD_ARE_BAD_TLD=5, RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id 5Hf8YKPZeaS1 for ; Tue, 5 Feb 2019 02:14:01 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id E71E05F1E3 for ; Tue, 5 Feb 2019 02:14:00 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 4D825E0141 for ; Tue, 5 Feb 2019 02:14:00 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 092BF243FC for ; Tue, 5 Feb 2019 02:14:00 +0000 (UTC) Date: Tue, 5 Feb 2019 02:14:00 +0000 (UTC) From: "chenglei (JIRA)" To: dev@phoenix.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (PHOENIX-5105) Push Filter through Sort for SortMergeJoin MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/PHOENIX-5105?page=3Dcom.atlass= ian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenglei updated PHOENIX-5105: ------------------------------ Attachment: PHOENIX-5015_v3-4.x-HBase-1.4.patch > Push Filter through Sort for SortMergeJoin > ------------------------------------------ > > Key: PHOENIX-5105 > URL: https://issues.apache.org/jira/browse/PHOENIX-5105 > Project: Phoenix > Issue Type: Improvement > Affects Versions: 4.14.1 > Reporter: chenglei > Assignee: chenglei > Priority: Major > Fix For: 4.15.0 > > Attachments: PHOENIX-5015-4.x-HBase-1.4.patch, PHOENIX-5015_v2-4.= x-HBase-1.4.patch, PHOENIX-5015_v3-4.x-HBase-1.4.patch > > Time Spent: 4h > Remaining Estimate: 0h > > Given two tables: > {code:java} > CREATE TABLE merge1 (=20 > aid INTEGER PRIMARY KEY, > age INTEGER) > =20 > CREATE TABLE merge2 (=20 > bid INTEGER PRIMARY KEY, > code INTEGER) > {code} > for following sql : > {code:java} > select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from=20 > (select aid,age from merge1 where age >=3D11 and age<=3D33 order by age l= imit 3) a inner join=20 > (select bid,code from merge2 order by code limit 1) b on a.aid=3Db.bid wh= ere b.code > 50 > {code} > For the RHS of SortMergeJoin, at first the where condition {{b.code > 50}= } is pushed down to RHS as its {{JoinCompiler.Table.postFilters}},=C2=A0 th= en {{order by b.bid}} is appended to RHS and it is rewritten as=20 > {{select bid,code from (select bid,code from merge2 order by code limit = 1) order by bid}} > by following line 211 in {{QueryCompiler.compileJoinQuery}}. > Next the above rewritten sql is compiled to ClientScanPlan by following l= ine 221 ,and previously pushed down {{b.code > 50}} is compiled by {{table.= compilePostFilterExpression}} method in following line 224 to filter the re= sult of the preceding ClientScanPlan. The problem here is that we execute t= he {{order by bid}} first and then the postFilter {{b.code > 50}}, obviousl= y it is inefficient. In fact, we can directly rewrite the RHS as=20 > {{select bid,code from (select bid,code from merge2 order by code limit = 1) order by bid where code > 50}}=20 > to first filter {{b.code > 50}} and then execute the {{order by bid}} . > {code:java} > 208 protected QueryPlan compileJoinQuery(StatementContext context, Lis= t binds, JoinTable joinTable, boolean asSubquery, boolean projectPK= Columns, List orderBy) throws SQLException { > 209 if (joinTable.getJoinSpecs().isEmpty()) { > 210 Table table =3D joinTable.getTable(); > 211 SelectStatement subquery =3D table.getAsSubquery(orderB= y); > 212 if (!table.isSubselect()) { > 213 context.setCurrentTable(table.getTableRef()); > 214 PTable projectedTable =3D table.createProjectedTable= (!projectPKColumns, context); > 215 TupleProjector projector =3D new TupleProjector(proj= ectedTable); > 216 TupleProjector.serializeProjectorIntoScan(context.ge= tScan(), projector); > 217 context.setResolver(FromCompiler.getResolverForProje= ctedTable(projectedTable, context.getConnection(), subquery.getUdfParseNode= s())); > 218 table.projectColumns(context.getScan()); > 219 return compileSingleFlatQuery(context, subquery, bin= ds, asSubquery, !asSubquery, null, projectPKColumns ? projector : null, tru= e); > 220 } > 221 QueryPlan plan =3D compileSubquery(subquery, false); > 222 PTable projectedTable =3D table.createProjectedTable(plan.= getProjector()); > 223 context.setResolver(FromCompiler.getResolverForProjectedTa= ble(projectedTable, context.getConnection(), subquery.getUdfParseNodes())); > 224 return new TupleProjectionPlan(plan, new TupleProjector(pl= an.getProjector()), table.compilePostFilterExpression(context)); > 225 } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)