phoenix-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5105) Push Filter through Sort for SortMergeJoin
Date Tue, 05 Feb 2019 07:47:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760541#comment-16760541 ]

Hudson commented on PHOENIX-5105:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-4.x-HBase-1.3 #316 (See [https://builds.apache.org/job/Phoenix-4.x-HBase-1.3/316/])
PHOENIX-5105 Push Filter through Sort for SortMergeJoin (chenglei: rev ee3880bc8a3c8d4772ca013f769190729c2e8d6f)
* (edit) phoenix-core/src/it/java/org/apache/phoenix/end2end/SortMergeJoinMoreIT.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/JoinCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/SubselectRewriter.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/compile/QueryCompilerTest.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/compile/QueryCompiler.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/parse/DerivedTableNode.java


> Push Filter through Sort for SortMergeJoin
> ------------------------------------------
>
>                 Key: PHOENIX-5105
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5105
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.1
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>             Fix For: 4.15.0
>
>         Attachments: PHOENIX-5015-4.x-HBase-1.4.patch, PHOENIX-5015_v2-4.x-HBase-1.4.patch, PHOENIX-5015_v3-4.x-HBase-1.4.patch
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Given two tables:
> {code:java}
>           CREATE TABLE merge1 ( 
>                     aid INTEGER PRIMARY KEY,
>                     age INTEGER)
>           
>           CREATE TABLE merge2  ( 
>                     bid INTEGER PRIMARY KEY,
>                     code INTEGER)
> {code}
> for the following SQL:
> {code:java}
> select /*+ USE_SORT_MERGE_JOIN */ a.aid,b.code from 
> (select aid,age from merge1 where age >=11 and age<=33 order by age limit 3) a inner join 
> (select bid,code from merge2 order by code limit 1) b on a.aid=b.bid where b.code > 50
> {code}
> For the RHS of the SortMergeJoin, the where condition {{b.code > 50}} is first pushed down to the RHS as its {{JoinCompiler.Table.postFilters}}. Then {{order by b.bid}} is appended to the RHS, which is rewritten as
>  {{select bid,code from (select bid,code from merge2 order by code limit 1) order by bid}}
> at line 211 of {{QueryCompiler.compileJoinQuery}}, shown below.
> Next, the rewritten SQL is compiled to a ClientScanPlan at line 221, and the previously pushed-down {{b.code > 50}} is compiled by the {{table.compilePostFilterExpression}} method at line 224 to filter the result of that ClientScanPlan. The problem is that {{order by bid}} is executed first and the postFilter {{b.code > 50}} only afterwards, which is inefficient because rows that the filter would discard are sorted anyway. Instead, we can directly rewrite the RHS as
>  {{select bid,code from (select bid,code from merge2 order by code limit 1) where code > 50 order by bid}}
>  to filter on {{b.code > 50}} first and then execute the {{order by bid}}.
> {code:java}
> 208    protected QueryPlan compileJoinQuery(StatementContext context, List<Object> binds, JoinTable joinTable, boolean asSubquery, boolean projectPKColumns, List<OrderByNode> orderBy) throws SQLException {
> 209        if (joinTable.getJoinSpecs().isEmpty()) {
> 210            Table table = joinTable.getTable();
> 211            SelectStatement subquery = table.getAsSubquery(orderBy);
> 212            if (!table.isSubselect()) {
> 213                context.setCurrentTable(table.getTableRef());
> 214                PTable projectedTable = table.createProjectedTable(!projectPKColumns, context);
> 215                TupleProjector projector = new TupleProjector(projectedTable);
> 216                TupleProjector.serializeProjectorIntoScan(context.getScan(), projector);
> 217                context.setResolver(FromCompiler.getResolverForProjectedTable(projectedTable, context.getConnection(), subquery.getUdfParseNodes()));
> 218                table.projectColumns(context.getScan());
> 219                return compileSingleFlatQuery(context, subquery, binds, asSubquery, !asSubquery, null, projectPKColumns ? projector : null, true);
> 220            }
> 221            QueryPlan plan = compileSubquery(subquery, false);
> 222            PTable projectedTable = table.createProjectedTable(plan.getProjector());
> 223            context.setResolver(FromCompiler.getResolverForProjectedTable(projectedTable, context.getConnection(), subquery.getUdfParseNodes()));
> 224            return new TupleProjectionPlan(plan, new TupleProjector(plan.getProjector()), table.compilePostFilterExpression(context));
> 225        }
> {code}
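
The ordering argument in the description can be sketched in plain Java. This is not Phoenix code: the Row class, the two method names, and the sample rows are hypothetical stand-ins for the rows produced by the inner subquery. Assuming the post-filter is a per-row predicate such as code > 50, sorting then filtering and filtering then sorting return the same rows, but the second form only sorts the rows that survive the filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FilterBeforeSortSketch {

    /** Hypothetical stand-in for one (bid, code) row produced by the inner subquery. */
    static final class Row {
        final int bid;
        final int code;

        Row(int bid, int code) {
            this.bid = bid;
            this.code = code;
        }

        @Override
        public String toString() {
            return "(" + bid + "," + code + ")";
        }
    }

    /** Order described as the problem: ORDER BY bid over every row, then apply code > minCode. */
    static List<Row> sortThenFilter(List<Row> rows, int minCode) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.bid)); // sorts all input rows
        List<Row> out = new ArrayList<>();
        for (Row r : sorted) {
            if (r.code > minCode) {
                out.add(r);
            }
        }
        return out;
    }

    /** Proposed order: apply code > minCode first, then ORDER BY bid over the survivors only. */
    static List<Row> filterThenSort(List<Row> rows, int minCode) {
        List<Row> kept = new ArrayList<>();
        for (Row r : rows) {
            if (r.code > minCode) {
                kept.add(r);
            }
        }
        kept.sort(Comparator.comparingInt((Row r) -> r.bid)); // sorts only the filtered rows
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data; bid is unique, as it would be for a primary key.
        List<Row> rows = Arrays.asList(
                new Row(3, 70), new Row(1, 40), new Row(2, 90), new Row(4, 10));

        System.out.println(sortThenFilter(rows, 50)); // [(2,90), (3,70)]
        System.out.println(filterThenSort(rows, 50)); // [(2,90), (3,70)]
    }
}
```

With these four sample rows, the first method sorts four rows and the second sorts two. The equivalence relies on the filter not depending on row order, which holds for a simple comparison like code > 50.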



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
