phoenix-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "chenglei (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-4820) Optimize OrderBy for ClientAggregatePlan
Date Fri, 21 Dec 2018 12:34:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726691#comment-16726691
] 

chenglei edited comment on PHOENIX-4820 at 12/21/18 12:33 PM:
--------------------------------------------------------------

[~tdsilva], thank you for review. The {{Order By}} could be compiled out because {{Group By}}
is executed by sort in {{ClientAggregatePlan}},just as following code in {{ClientAggregatePlan.iterator}}:

{code:java}
158    } else {
159                iterator = new OrderedResultIterator(iterator, keyExpressionOrderBy, thresholdBytes,
null, null, projector.getEstimatedRowByteSize());
160                aggResultIterator = new ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator),
serverAggregators, keyExpressions);
161              }
162       }
163        aggResultIterator = new GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
clientAggregators);
164   }
{code:java}


So if the {{Order By}} matchs the order of {{Group By}}, the  {{Order By}} could be compiled
out.

In fact, when we compile the expression If the statement has a {{Group By}}, we replace the
expression matchs {{Group By}} expressions as RowKeyColumnExpression, just as following code
in {{ExpressionCompiler}}:

{code:java}
296    private Expression wrapGroupByExpression(Expression expression) {
297             // If we're in an aggregate function, don't wrap a group by expression,
298              // since in that case we're aggregating over the regular/ungrouped
299              // column.
300             if (aggregateFunction == null) {
301                  int index = groupBy.getExpressions().indexOf(expression);
302                  if (index >= 0) {
303                        isAggregate = true;
304                        RowKeyValueAccessor accessor = new RowKeyValueAccessor(groupBy.getKeyExpressions(),
index);
305                        expression = new RowKeyColumnExpression(expression, accessor, groupBy.getKeyExpressions().get(index).getDataType());
306                  }
307              }
308             return expression;
309          }
{code:java}


BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when {{Order By}} could
be compiled out, {{Group By}} is still  executed by sort, just as following code in {{ClientHashAggregatingResultIterator.next}}:

{{code:java}}
102     if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy == OrderBy.REV_ROW_KEY_ORDER_BY)
{
103                 keyList = sortKeys();
104                 keyIterator = keyList.iterator();
105            } else {
106                keyIterator = hash.keySet().iterator();
107            }
{{code:java}}



was (Author: comnetwork):
[~tdsilva], thank you for review. The {{Order By}} could be compiled out because {{Group By}}
is executed by sort in {{ClientAggregatePlan}},just as following code in {{ClientAggregatePlan.iterator}}:

{code}
158    } else {
159                iterator = new OrderedResultIterator(iterator, keyExpressionOrderBy, thresholdBytes,
null, null, projector.getEstimatedRowByteSize());
160                aggResultIterator = new ClientGroupedAggregatingResultIterator(LookAheadResultIterator.wrap(iterator),
serverAggregators, keyExpressions);
161                }
162       }
163        aggResultIterator = new GroupedAggregatingResultIterator(LookAheadResultIterator.wrap(aggResultIterator),
clientAggregators);
164   }
{code}


So if the {{Order By}} matchs the order of {{Group By}}, the  {{Order By}} could be compiled
out.

In fact, when we compile the expression If the statement has a {{Group By}}, we replace the
expression matchs {{Group By}} expressions as RowKeyColumnExpression, just as following code
in {{ExpressionCompiler}}:

{code}
296    private Expression wrapGroupByExpression(Expression expression) {
297         // If we're in an aggregate function, don't wrap a group by expression,
298              // since in that case we're aggregating over the regular/ungrouped
299              // column.
300             if (aggregateFunction == null) {
301                  int index = groupBy.getExpressions().indexOf(expression);
302                  if (index >= 0) {
303                        isAggregate = true;
304                        RowKeyValueAccessor accessor = new RowKeyValueAccessor(groupBy.getKeyExpressions(),
index);
305                        expression = new RowKeyColumnExpression(expression, accessor, groupBy.getKeyExpressions().get(index).getDataType());
306                  }
307              }
308             return expression;
309          }
{code}


BTW. even for client-side hash aggregation introduced by PHOENIX-4751,when {{Order By}} could
be compiled out, {{Group By}} is still  executed by sort, just as following code in {{ClientHashAggregatingResultIterator.next}}

{{code}}
102     if (orderBy == OrderBy.FWD_ROW_KEY_ORDER_BY || orderBy == OrderBy.REV_ROW_KEY_ORDER_BY)
{
103                 keyList = sortKeys();
104                 keyIterator = keyList.iterator();
105            } else {
106                keyIterator = hash.keySet().iterator();
107            }
{{code}}


> Optimize OrderBy for ClientAggregatePlan
> ----------------------------------------
>
>                 Key: PHOENIX-4820
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4820
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.14.0
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>             Fix For: 4.15.0
>
>         Attachments: PHOENIX-4820-4.x-HBase-1.3.patch
>
>
> Given a table
> {code}
>   create table test ( 
>        pk1 varchar not null , 
>        pk2 varchar not null, 
>        pk3 varchar not null,
>         v1 varchar, 
>         v2 varchar, 
>         CONSTRAINT TEST_PK PRIMARY KEY ( 
>               pk1,
>               pk2,
>               pk3 ))
> {code}
> for following sql :
> {code}
>      select a.ak3 
>      from (select substr(pk1,1,1) ak1,substr(pk2,1,1) ak2,substr(pk3,1,1) ak3,substr(v1,1,1)
av1,substr(v2,1,1) av2 from test order by pk2,pk3 limit 10) a  group by a.ak3,a.av1 order
by a.ak3,a.av1
> {code}
> Intuitively, the above OrderBy statement {{order by a.ak3,a.av1}} should be compiled
out because it match the group by statement, but in fact it is not.
> The problem is caused by the {{QueryCompiler.compileSingleQuery}} and {{QueryCompiler.compileSingleFlatQuery}},for
 {{QueryCompiler.compileSingleQuery}} method,because the inner query has order by, so in line
520, local variable {{isInRowKeyOrder}} is false:
> {code}
> 519        context.setCurrentTable(tableRef);
> 520        boolean isInRowKeyOrder = innerPlan.getGroupBy() == GroupBy.EMPTY_GROUP_BY
&& innerPlan.getOrderBy() == OrderBy.EMPTY_ORDER_BY;
> {code}
> In {{QueryCompiler.compileSingleFlatQuery}},when {{OrderByCompiler.compile}} method is
invoked, the last parameter {{isInRowKeyOrder}} is false:
> {code}
> 562        OrderBy orderBy = OrderByCompiler.compile(context, select, groupBy, limit,
offset, projector,
> 563                groupBy == GroupBy.EMPTY_GROUP_BY ? innerPlanTupleProjector : null,
isInRowKeyOrder);
> {code}
> So in following line 156 for  {{OrderByCompiler.compile}},even though the {{tracker.isOrderPreserving}}
is true, the OrderBy  statement could not be compiled out.
> {code}
> 156       if (isInRowKeyOrder && tracker.isOrderPreserving()) {
> {code}      
> In my opinion, with GroupBy, in following line 563 for {{QueryCompiler.compileSingleFlatQuery}}
method, when we call 
> {{OrderByCompiler.compile}} method, we no need to conside the {{isInRowKeyOrder}},  just
like the previous parameter  {{tupleProjector}} does.
> {code}
>  562        OrderBy orderBy = OrderByCompiler.compile(context, select, groupBy, limit,
offset, projector,
>  563                groupBy == GroupBy.EMPTY_GROUP_BY ? innerPlanTupleProjector : null,
isInRowKeyOrder);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message