From issues-return-4168-archive-asf-public=cust-asf.ponee.io@phoenix.apache.org Wed Jan 23 14:53:48 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id A8015180771 for ; Wed, 23 Jan 2019 14:53:47 +0100 (CET) Received: (qmail 46027 invoked by uid 500); 23 Jan 2019 13:53:46 -0000 Mailing-List: contact issues-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list issues@phoenix.apache.org Received: (qmail 46018 invoked by uid 99); 23 Jan 2019 13:53:46 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 Jan 2019 13:53:46 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 7164DC6F99 for ; Wed, 23 Jan 2019 13:53:46 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -104.501 X-Spam-Level: X-Spam-Status: No, score=-104.501 tagged_above=-999 required=6.31 tests=[ENV_AND_HDR_SPF_MATCH=-0.5, KAM_ASCII_DIVIDERS=0.8, KAM_SOMETLD_ARE_BAD_TLD=5, RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id z0qbq34rlizV for ; Wed, 23 Jan 2019 13:53:44 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 1221D5F2AA for ; Wed, 23 Jan 2019 13:44:02 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 205DAE26A7 for ; Wed, 23 Jan 2019 13:44:01 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 8AF61242A2 for ; Wed, 23 Jan 2019 13:44:00 +0000 (UTC) Date: Wed, 23 Jan 2019 13:44:00 +0000 (UTC) From: "chenglei (JIRA)" To: issues@phoenix.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Comment Edited] (PHOENIX-5105) Push Filter through Sort for SortMergeJoin MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/PHOENIX-5105?page=3Dcom.atlassi= an.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D16= 749994#comment-16749994 ]=20 chenglei edited comment on PHOENIX-5105 at 1/23/19 1:43 PM: ------------------------------------------------------------ [~tdsilva], I uploaded a new patch following your suggestions of [PR#431|ht= tps://github.com/apache/phoenix/pull/431], the new PR is=C2=A0 [PR#432|https://github.com/apache/phoenix/pull/432],=C2=A0please help me ha= ve a review, thank you very much. was (Author: comnetwork): [~tdsilva], I uploaded a new patch following your suggestions of [PR#431|ht= tps://github.com/apache/phoenix/pull/431], the new PR is=C2=A0 [PR#432|https://github.com/apache/phoenix/pull/432] > Push Filter through Sort for SortMergeJoin > ------------------------------------------ > > Key: PHOENIX-5105 > URL: https://issues.apache.org/jira/browse/PHOENIX-5105 > Project: Phoenix > Issue Type: Improvement > Affects Versions: 4.14.1 > Reporter: chenglei > Assignee: chenglei > Priority: Major > Fix For: 4.15.0 > > Attachments: PHOENIX-5015-4.x-HBase-1.4.patch, PHOENIX-5015_v2-4.= x-HBase-1.4.patch > > Time Spent: 3.5h > Remaining Estimate: 0h > > Given two tables: > {code} > CREATE TABLE merge1 (=20 > aid INTEGER PRIMARY KEY, > age INTEGER) > =20 > CREATE TABLE merge2 (=20 > bid INTEGER PRIMARY KEY, > code INTEGER) > {code} > for following sql : > {code} > select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from=20 > (select aid,age from merge1 where age >=3D11 and age<=3D33 order by age l= imit 3) a inner join=20 > (select bid,code from merge2 order by code limit 1) b on a.aid=3Db.bid wh= ere b.code > 50 > {code} > For the RHS of SortMergeJoin, at first the where condition {{b.code > 50}= } is pushed down to RHS as its {{JoinCompiler.Table.postFilters}}, and then= {{order by b.bid}} is appended to RHS , finally the RHS is rewritten as=20 > {{select bid,code from (select bid,code from merge2 order by code limit = 1) order by bid}} by following line 211 in {{QueryCompiler.compileJoinQuery= }}.=20 > The rewritten sql is then compiled to ClientScanPlan by following line = 221 ,and previously pushed down {{b.code > 50}} is compiled by {{table.comp= ilePostFilterExpression}} method in following line 224 to filter the result= of the preceding ClientScanPlan. The problem here is that we execute the {= {order by bid}} first and then the postFilter {{b.code > 50}}, obviously it= is inefficient. In fact, we can directly rewrite the RHS as=20 > {{select bid,code from (select bid,code from merge2 order by code limit 1= ) order by bid where code > 50}}=20 > to first filter {{b.code > 50}} and then execute the {{order by bid}} . > {code} > 208 protected QueryPlan compileJoinQuery(StatementContext context, Lis= t binds, JoinTable joinTable, boolean asSubquery, boolean projectPK= Columns, List orderBy) throws SQLException { > 209 if (joinTable.getJoinSpecs().isEmpty()) { > 210 Table table =3D joinTable.getTable(); > 211 SelectStatement subquery =3D table.getAsSubquery(orderB= y); > 212 if (!table.isSubselect()) { > 213 context.setCurrentTable(table.getTableRef()); > 214 PTable projectedTable =3D table.createProjectedTable= (!projectPKColumns, context); > 215 TupleProjector projector =3D new TupleProjector(proj= ectedTable); > 216 TupleProjector.serializeProjectorIntoScan(context.ge= tScan(), projector); > 217 context.setResolver(FromCompiler.getResolverForProje= ctedTable(projectedTable, context.getConnection(), subquery.getUdfParseNode= s())); > 218 table.projectColumns(context.getScan()); > 219 return compileSingleFlatQuery(context, subquery, bin= ds, asSubquery, !asSubquery, null, projectPKColumns ? projector : null, tru= e); > 220 } > 221 QueryPlan plan =3D compileSubquery(subquery, false); > 222 PTable projectedTable =3D table.createProjectedTable(plan.= getProjector()); > 223 context.setResolver(FromCompiler.getResolverForProjectedTa= ble(projectedTable, context.getConnection(), subquery.getUdfParseNodes())); > 224 return new TupleProjectionPlan(plan, new TupleProjector(pl= an.getProjector()), table.compilePostFilterExpression(context)); > 225 } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)